Commit graph

514 commits

Author SHA1 Message Date
Ishaan Jaff
a1a038aa19 fix typo in codebase 2025-03-27 22:36:00 -07:00
Krish Dholakia
ed8c63b51e Allow team admins to add/update/delete models on UI + show api base and model id on request logs (#9572)
* feat(view_logs.tsx): show model id + api base in request logs

easier debugging

* fix(index.tsx): fix length of api base

easier viewing

* refactor(leftnav.tsx): show models tab to team admin

* feat(model_dashboard.tsx): add explainer for what the 'models' page is for team admin

helps them understand how they can use it

* feat(model_management_endpoints.py): restrict model add by team to just team admin

allow team admin to add models via non-team keys (e.g. ui token)

* test(test_add_update_models.py): update unit testing for new behaviour

* fix(model_dashboard.tsx): show user the models

* feat(proxy_server.py): add new query param 'user_models_only' to `/v2/model/info`

Allows user to retrieve just the models they've added

Used in UI to show internal users just the models they've added

* feat(model_dashboard.tsx): allow team admins to view their own models

* fix: allow ui user to fetch model cost map

* feat(add_model_tab.tsx): require team admins to specify team when onboarding models

* fix(_types.py): add `/v1/model/info` to info route

`/model/info` was already there

* fix(model_info_view.tsx): allow user to edit a model they created

* fix(model_management_endpoints.py): allow team admin to update team model

* feat(model_management_endpoints.py): allow team admin to delete team models

* fix(model_management_endpoints.py): don't require team id to be set when adding a model

* fix(proxy_server.py): fix linting error

* fix: fix ui linting error

* fix(model_management_endpoints.py): ensure consistent auth checks on all model calls

* test: remove old test - function no longer exists in same form

* test: add updated mock testing
2025-03-27 12:06:31 -07:00
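The `user_models_only` flag added above is a plain boolean query parameter on `/v2/model/info`. A minimal sketch of how a client might build the request URL — the proxy base URL is a placeholder, and the flag name is the only part taken from the commit:

```python
import urllib.parse

def model_info_url(base_url: str, user_models_only: bool = False) -> str:
    """Build the /v2/model/info URL with the user_models_only flag.
    The base URL is a placeholder for a running LiteLLM proxy."""
    query = urllib.parse.urlencode({"user_models_only": str(user_models_only).lower()})
    return f"{base_url}/v2/model/info?{query}"

# Retrieve only the models the calling user added (e.g. for the UI view):
url = model_info_url("http://localhost:4000", user_models_only=True)
```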
Krish Dholakia
d58fe5a9f9 Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows gpt-4o and whisper audio transcription to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-audio-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
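The provider-config pattern in this commit dispatches on the model name to pick a transcription transformation config. A rough sketch of that dispatch — the returned labels are illustrative stand-ins, not the real config class names:

```python
def transcription_config_for(model: str) -> str:
    """Pick an audio-transcription transformation config by model name.
    The string labels stand in for the actual provider config classes."""
    if "gpt-4o" in model and "transcribe" in model:
        return "gpt-4o-transcribe config"
    if model.startswith("whisper"):
        return "whisper config"
    raise ValueError(f"no transcription config for {model!r}")
```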
Krrish Dholakia
f2df53771c build(model_prices_and_context_window.json): add gemini multimodal embedding cost 2025-03-26 23:04:24 -07:00
Krish Dholakia
9c083e7d2c Support Gemini audio token cost tracking + fix openai audio input token cost tracking (#9535)
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object

enables accurate cost tracking

* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it

Google has moved away from this for gemini-2.0 models

* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough

* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token

enables vertex ai cost tracking to work with audio tokens

* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set

* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token

more consistent behaviour across providers

* test: add unit test for gemini audio token cost calculation

* ci: bump ci config

* test: fix test
2025-03-26 17:26:25 -07:00
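The generic cost-per-token change above can be sketched roughly as follows. Field names follow the commit messages ("default to total prompt tokens if text tokens field not set"); the exact usage/model-info schema and the sample rates in the comments are assumptions:

```python
def prompt_cost(usage: dict, model_info: dict) -> float:
    """Sketch of generic prompt-cost calculation with audio-token support.
    Key names are illustrative approximations of the real schema."""
    audio_tokens = usage.get("audio_tokens", 0)
    # default to total prompt tokens if the text-token field is not set
    text_tokens = usage.get("text_tokens")
    if text_tokens is None:
        text_tokens = usage["prompt_tokens"]
    cost = text_tokens * model_info["input_cost_per_token"]
    # only bill audio tokens when the model's cost map defines a rate for them
    if audio_tokens and "input_cost_per_audio_token" in model_info:
        cost += audio_tokens * model_info["input_cost_per_audio_token"]
    return cost
```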
Ishaan Jaff
826deea6fb test_gemini_fine_tuned_model_request_consistency 2025-03-26 14:18:11 -07:00
Ishaan Jaff
13391f9d7f test_gemini_fine_tuned_model_request_consistency 2025-03-26 14:10:32 -07:00
Ishaan Jaff
0767a3cc88 Merge branch 'main' into litellm_fix_vertex_ai_ft_models 2025-03-26 11:11:54 -07:00
Ishaan Jaff
172c9d5689 _is_model_gemini_gemini_spec_model 2025-03-26 10:53:23 -07:00
Ishaan Jaff
519d89593e run ci/cd again 2025-03-26 08:12:51 -07:00
Ishaan Jaff
a47f370b05 test_gemini_fine_tuned_model_request_consistency 2025-03-25 23:54:06 -07:00
Krish Dholakia
1ab5d49132 Support litellm.api_base for vertex_ai + gemini/ across completion, embedding, image_generation (#9516)
* test(tests): add unit testing for litellm_proxy integration

* fix(cost_calculator.py): fix tracking cost in sdk when calling proxy

* fix(main.py): respect litellm.api_base on `vertex_ai/` and `gemini/` routes

* fix(main.py): consistently support custom api base across gemini + vertexai on embedding + completion

* feat(vertex_ai/): test

* fix: fix linting error

* test: set api base as None before starting loadtest
2025-03-25 23:46:20 -07:00
Ishaan Jaff
7f07c162df test_create_delete_assistants 2025-03-25 22:08:06 -07:00
Krrish Dholakia
4a91e2116d test: improve flaky test 2025-03-24 23:15:04 -07:00
Tyler Hutcherson
785f9714ca update redisvl dependency 2025-03-24 08:42:11 -04:00
Ishaan Jaff
8fd564acda add supports_web_search 2025-03-22 13:32:22 -07:00
Ishaan Jaff
70f2582834 search_context_cost_per_query test 2025-03-22 13:08:57 -07:00
Ishaan Jaff
11f764601b add search_context_cost_per_1k_calls to model cost map spec 2025-03-22 12:56:21 -07:00
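The web-search fields in the three commits above slot into a model's cost-map entry. A sketch of such an entry — the key names follow the commit messages, but the nesting and the dollar amounts are invented for illustration:

```python
# Illustrative model-cost-map entry; values are made up.
model_cost_entry = {
    "supports_web_search": True,
    # cost per 1k web-search calls, keyed by search context size
    "search_context_cost_per_1k_calls": {
        "low": 25.0,
        "medium": 27.5,
        "high": 30.0,
    },
}
```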
Krrish Dholakia
b6b6d66882 test: mock sagemaker tests 2025-03-21 16:21:18 -07:00
Krrish Dholakia
f11b57bad4 fix: fix testing 2025-03-20 14:37:58 -07:00
Krish Dholakia
f5f92bf6ae Merge pull request #9366 from JamesGuthrie/jg/vertex-output-dimensionality
fix: VertexAI outputDimensionality configuration
2025-03-20 13:55:33 -07:00
Ishaan Jaff
c15e38a148 Merge branch 'main' into litellm_fix_ssl_verify 2025-03-19 21:03:06 -07:00
James Guthrie
4044bca614 fix: VertexAI outputDimensionality configuration
VertexAI's API documentation [1] is an absolute mess. In it, they
describe the parameter to configure output dimensionality as
`output_dimensionality`. In the API example, they switch to camel
case `outputDimensionality`, which is the correct variant.

[1]: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#generative-ai-get-text-embedding-drest
2025-03-19 11:07:36 +01:00
Ishaan Jaff
597afb73a6 Merge pull request #9353 from BerriAI/litellm_arize_dynamic_logging
[Feat] - API - Allow using dynamic Arize AI Spaces on LiteLLM
2025-03-18 23:35:28 -07:00
Krish Dholakia
27ff9d3d30 Merge pull request #9335 from BerriAI/litellm_dev_03_17_2025_p3
Contributor PR: Fix sagemaker too little data for content error
2025-03-18 23:24:07 -07:00
Ishaan Jaff
43eedf26f7 Merge branch 'main' into litellm_arize_dynamic_logging 2025-03-18 22:13:35 -07:00
Ishaan Jaff
5a327da78e Merge pull request #9352 from BerriAI/litellm_arize_mar_18
[Bug Fix] Arize AI Logging Integration with LiteLLM
2025-03-18 22:12:46 -07:00
Ishaan Jaff
ac8618823d test_arize_callback 2025-03-18 20:21:23 -07:00
Ishaan Jaff
95f08cec55 test_arize_callback 2025-03-18 18:49:06 -07:00
Krrish Dholakia
c8eaf8e419 fix(response_metadata.py): log the litellm_model_name
make it easier to track the model sent to the provider
2025-03-18 17:46:33 -07:00
Krrish Dholakia
d526a2f929 test(test_tpm_rpm_routing_v2.py): initial test, for asserting async pre call check works as expected 2025-03-18 17:36:55 -07:00
Ishaan Jaff
268dfa90d7 test_dynamic_azure_params 2025-03-18 17:26:23 -07:00
Krrish Dholakia
3ba0df484e fix(lowest_tpm_rpm_v2.py): fix updating limits 2025-03-18 17:10:17 -07:00
Ishaan Jaff
b095332c20 test_async_dynamic_arize_config 2025-03-18 15:02:22 -07:00
Ishaan Jaff
8a9d05a62f test_arize_set_attributes 2025-03-18 14:31:34 -07:00
Ishaan Jaff
608d69f396 Merge pull request #9338 from nate-mar/arize-integration-fixes
Arize integration Fix
2025-03-18 14:28:25 -07:00
Krrish Dholakia
cf1b5b6590 test(test_get_llm_provider.py): cover scenario where xai not in model name 2025-03-18 11:04:59 -07:00
Krrish Dholakia
e1f0cffae7 test(test_get_llm_provider.py): Minimal repro for https://github.com/BerriAI/litellm/issues/9291 2025-03-18 10:35:50 -07:00
Nate Mar
e29b7b9a27 Update test_arize_ai.py 2025-03-18 04:04:44 -07:00
Nate Mar
4a5af3928a restore previous formatting 2025-03-18 04:02:43 -07:00
Nate Mar
cfbefb643c Fix test and add comments 2025-03-18 03:46:53 -07:00
Nate Mar
da9ef84fa5 revert space_key change and add tests for arize integration 2025-03-18 01:40:10 -07:00
Krrish Dholakia
410fcdf058 test: update testing 2025-03-17 22:34:08 -07:00
Krish Dholakia
58d8d4ca3b Merge pull request #9326 from andjsmi/main
Modify completion handler for SageMaker to use payload from `prepared_request`
2025-03-17 22:16:43 -07:00
Krish Dholakia
8ee4eead7f Merge pull request #9333 from BerriAI/litellm_dev_03_17_2025_p2
fix(ollama/completions/transformation.py): pass prompt, untemplated o…
2025-03-17 21:48:30 -07:00
Nate Mar
0dfc21e80a Fix wrong import and use space_id instead of space_key for Arize integration 2025-03-17 20:37:28 -07:00
Krrish Dholakia
4d56992407 fix(ollama/completions/transformation.py): pass prompt, untemplated on /completions request
Fixes https://github.com/BerriAI/litellm/issues/6900
2025-03-17 18:35:44 -07:00
Krrish Dholakia
301375bf84 fix(streaming_handler.py): support logging complete streaming response on cache hit 2025-03-17 18:10:39 -07:00
Krrish Dholakia
ba6369e359 fix(streaming_handler.py): emit deep copy of completed chunk 2025-03-17 17:26:21 -07:00
Andrew Smith
631406003f Update test_sagemaker.py to match changed parameters 2025-03-18 11:09:52 +11:00