Commit graph

13391 commits

Ishaan Jaff
b3de3216a8 fix supports_response_schema bedrock/anthropic models
2025-02-07 19:03:08 -08:00
Krish Dholakia
b5850b6b65
Handle azure deepseek reasoning response (#8288) (#8366)
* Handle azure deepseek reasoning response (#8288)

* Handle deepseek reasoning response (see the sketch after this entry)

* Add helper method + unit test

* Fix: Follow infinity api url format (#8346)

* Follow infinity api url format

* Update test_infinity.py

* fix(infinity/transformation.py): fix linting error

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>
2025-02-07 17:45:51 -08:00
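
A minimal sketch of the idea behind the reasoning-response handling above: DeepSeek-style models emit their chain of thought in a `<think>...</think>` block that must be split out of the message content. The helper name and tag format here are assumptions for illustration, not LiteLLM's actual implementation.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper (not LiteLLM's actual code): split a DeepSeek-style
# "<think>...</think>" prefix out of the raw content so it can be surfaced
# separately (e.g. as reasoning_content) instead of leaking into `content`.
def split_reasoning_content(raw: str) -> Tuple[Optional[str], str]:
    match = re.match(r"\s*<think>(.*?)</think>\s*", raw, re.DOTALL)
    if match is None:
        return None, raw  # no reasoning block; pass content through unchanged
    return match.group(1).strip(), raw[match.end():]

reasoning, content = split_reasoning_content(
    "<think>User wants a one-line answer.</think>42"
)
print(reasoning)  # -> User wants a one-line answer.
print(content)    # -> 42
```
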
Krish Dholakia
f651d51f26
Litellm dev 02 07 2025 p2 (#8377)
* fix(caching_routes.py): mask redis password on `/cache/ping` route

* fix(caching_routes.py): fix linting error

* fix(caching_routes.py): fix linting error on caching routes

* fix: fix test - ignore mask_dict - has a breakpoint

* fix(azure.py): add timeout param + elapsed time in azure timeout error

* fix(http_handler.py): add elapsed time to http timeout request

makes it easier to debug how long the request took before failing (see the sketch after this entry)
2025-02-07 17:30:38 -08:00
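
A sketch of the elapsed-time pattern from the two timeout fixes above, assuming an httpx-based handler like LiteLLM's (function and variable names are illustrative):

```python
import time

import httpx

def post_with_timing(url: str, payload: dict, timeout: float = 10.0) -> httpx.Response:
    # Record wall-clock time so a timeout error can report how long the
    # request actually ran before failing.
    start = time.time()
    try:
        with httpx.Client(timeout=timeout) as client:
            return client.post(url, json=payload)
    except httpx.TimeoutException as exc:
        elapsed = time.time() - start
        raise httpx.TimeoutException(
            f"Request timed out after {elapsed:.2f}s (timeout set to {timeout}s)"
        ) from exc
```
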
Krish Dholakia
dfbbf0bde8
fix: dictionary changed size during iteration error (#8327) (#8341)
Co-authored-by: Joey Feldberg <joeyfeldberg@users.noreply.github.com>
Co-authored-by: Joey Feldberg <12495578+joeyfeldberg@users.noreply.github.com>
2025-02-07 16:20:28 -08:00
Krish Dholakia
5d170162d3
fix(nvidia_nim/embed.py): add 'dimensions' support (#8302)
* fix(nvidia_nim/embed.py): add 'dimensions' support

Fixes https://github.com/BerriAI/litellm/issues/8238

* fix(proxy_server.py): initialize router redis cache if set up on the proxy

Fixes https://github.com/BerriAI/litellm/issues/6602

* test: add unit testing for new helper function
2025-02-07 16:19:32 -08:00
Nikolaiev Dmytro
346d8a9132
Update deepseek API prices for 2025-02-08 (#8363) 2025-02-07 08:25:35 -08:00
Krrish Dholakia
c4cfd5eb1f build(ui): updates
2025-02-06 23:25:09 -08:00
Krrish Dholakia
790c6eb02a bump: version 1.60.6 → 1.60.7 2025-02-06 23:24:38 -08:00
Krish Dholakia
6b8b49451f
Fix azure max retries error (#8340)
* fix(azure.py): ensure max_retries=0 is respected

Fixes https://github.com/BerriAI/litellm/issues/6129

* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0

* test(test_azure_openai.py): add unit testing for azure_text/ route

* fix(azure.py): fix passing max retries on streaming

* fix(azure.py): fix azure max retries on async completion + streaming

* fix(completion/handler.py): fix azure text async completion + streaming

* test(test_azure_openai.py): ensure azure openai max retries always respected

* test(test_azure_o_series.py): add testing to ensure max retries always respected

* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)

* Update model_prices_and_context_window.json

added gemini providers for 2.0-flash and 2.0-flash lite

* Update model_prices_and_context_window.json

fixed URL

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Convert tool use arguments to string before counting tokens (#6989)

In at least some cases `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. To tokenize it properly it needs to be a string; when it is already a string the conversion is a no-op, which is also fine (see the sketch after this entry).

* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing

* build(model_prices_and_context_window.json): add gemini commercial rate limits

* fix(utils.py): fix linting error

* refactor(utils.py): refactor to maintain function size

---------

Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
2025-02-06 23:20:48 -08:00
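
The tool-use token-counting fix in this entry reduces to one normalization step: JSON-encode dict-valued `arguments` before tokenizing. A self-contained sketch (message shape per the OpenAI format; the helper name is illustrative):

```python
import json

def stringify_tool_call_arguments(messages: list) -> list:
    # Tokenizers expect strings; some SDKs hand back `function.arguments`
    # as a dict. Convert only when it is not already a string, so the
    # common case stays a no-op.
    for message in messages:
        for tool_call in message.get("tool_calls") or []:
            args = tool_call.get("function", {}).get("arguments")
            if args is not None and not isinstance(args, str):
                tool_call["function"]["arguments"] = json.dumps(args)
    return messages

msgs = [{"role": "assistant", "tool_calls": [
    {"function": {"name": "get_weather", "arguments": {"city": "SF"}}}]}]
print(stringify_tool_call_arguments(msgs))
```
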
Krish Dholakia
d720744656
Litellm dev 02 06 2025 p3 (#8343)
* feat(handle_jwt.py): initial commit to allow scope based model access

* feat(handle_jwt.py): allow model access based on token scopes

allow admin to control model access from the IDP (see the sketch after this entry)

* test(test_jwt.py): add unit testing for scope based model access

* docs(token_auth.md): add scope based model access to docs

* docs(token_auth.md): update docs

* docs(token_auth.md): update docs

* build: add gemini commercial rate limits

* fix: fix linting error
2025-02-06 23:15:33 -08:00
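
A sketch of the scope-based model access added in this entry, assuming the IDP issues a standard space-delimited `scope` claim (the scope-to-model mapping shape is hypothetical, not the documented config):

```python
def allowed_models_from_scopes(jwt_claims: dict, scope_model_map: dict) -> set:
    # The OAuth2 "scope" claim is a space-delimited string; each scope the
    # IDP grants unlocks the models the admin mapped to it.
    scopes = (jwt_claims.get("scope") or "").split()
    allowed: set = set()
    for scope in scopes:
        allowed.update(scope_model_map.get(scope, []))
    return allowed

scope_model_map = {"litellm.models.openai": ["gpt-4o", "gpt-4o-mini"]}
claims = {"scope": "litellm.models.openai email profile"}
print(allowed_models_from_scopes(claims, scope_model_map))
# -> {'gpt-4o', 'gpt-4o-mini'}
```
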
Krish Dholakia
f87ab251b0
UI Updates (#8345)
* fix(.globals.css): revert .md hard set

caused regression in invitation link display (and possibly other places)

* Fix keys not showing on refresh for internal users  (#8312)

* [Bug] UI: Newly created key does not display on the View Key Page (#8039)

- Fixed issue where all keys appeared blank for admin users.
- Implemented filtering of data via team settings to ensure all keys are displayed correctly.

* Fix:
- Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`.
- Ensured other teams still follow the original validation rules.

* - added some classes in global.css
- added text wrap in output of request, response and metadata in index.tsx
- fixed styles of table in table.tsx

* - added full payload when we open single log entry
- added Combined Info Card in index.tsx

* fix: keys not showing on refresh for internal user

* fixed user id passed as null when the key user is the current user (#8271)

* fix(user_dashboard.tsx): ensure non admin can't view other keys

---------

Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>
Co-authored-by: Jaswanth Karani <karani.jaswanth@gmail.com>
2025-02-06 22:41:20 -08:00
Ishaan Jaff
7739be340b fix assembly pass through cost tracking 2025-02-06 21:20:59 -08:00
Ishaan Jaff
778bbcdd9c fix test_get_model_info_gemini 2025-02-06 21:05:47 -08:00
Ishaan Jaff
7706ff1f1e ui new build 2025-02-06 18:31:21 -08:00
Ishaan Jaff
65c91cbbbc
(QA+UI) - e2e flow for adding assembly ai passthrough endpoints (#8337)
* add initial test for assembly ai

* start using PassthroughEndpointRouter

* migrate to llm passthrough endpoints

* add assembly ai as a known provider

* fix PassthroughEndpointRouter

* fix set_pass_through_credentials

* working EU request to assembly ai pass through endpoint

* add e2e test assembly

* test_assemblyai_routes_with_bad_api_key

* clean up pass through endpoint router

* e2e testing for assembly ai pass through

* test assembly ai e2e testing

* delete assembly ai models

* fix code quality

* ui working assembly ai api base flow

* fix install assembly ai

* update model call details with kwargs for pass through logging

* fix tracking assembly ai model in response

* _handle_assemblyai_passthrough_logging

* fix test_initialize_deployment_for_pass_through_unsupported_provider

* TestPassthroughEndpointRouter

* _get_assembly_transcript

* fix assembly ai pt logging tests

* fix assemblyai_proxy_route

* fix _get_assembly_region_from_url
2025-02-06 18:27:54 -08:00
Ishaan Jaff
d2fec8bf13 databricks/meta-llama-3.3-70b-instruct 2025-02-06 18:21:56 -08:00
Krish Dholakia
f031926b82
fix(utils.py): handle key error in msg validation (#8325)
* fix(utils.py): handle key error in msg validation

* Support running Aim Guard during LLM call (#7918)

* support running Aim Guard during LLM call

* Rename header

* adjust docs and fix type annotations

* fix(timeout.md): doc fix for openai example on dynamic timeouts

---------

Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
2025-02-06 18:13:46 -08:00
Anton Abilov
fac1d2ccef
Fixed meta llama 3.3 key for Databricks API (#8093)
See correct key reference here: https://docs.databricks.com/en/machine-learning/model-serving/foundation-model-overview.html#pay-per-token
2025-02-06 18:05:49 -08:00
Ishaan Jaff
b535c9bdc0
(Bug Fix - Langfuse) - fix for when model response has choices=[] (#8339) (see the sketch after this entry)
* refactor _get_langfuse_input_output_content

* test_langfuse_logging_completion_with_malformed_llm_response

* fix _get_langfuse_input_output_content

* fixes for langfuse linting

* unit testing for get chat/text content for langfuse

* fix _should_raise_content_policy_error
2025-02-06 18:02:26 -08:00
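
A sketch of the defensive extraction this Langfuse fix implies: never index `choices[0]` unconditionally, since a malformed response can arrive with `choices=[]`. The function name echoes the commit; the exact behavior shown is an assumption.

```python
from typing import Optional

def get_output_content(response: dict) -> Optional[str]:
    # A malformed LLM response may carry choices=[]; indexing choices[0]
    # directly would raise IndexError and break the logging callback.
    choices = response.get("choices") or []
    if not choices:
        return None  # log the raw response with no extracted content
    return choices[0].get("message", {}).get("content")

print(get_output_content({"choices": []}))                               # -> None
print(get_output_content({"choices": [{"message": {"content": "hi"}}]}))  # -> hi
```
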
Krish Dholakia
bcfa641b81
Add gemini-2.0-flash pricing + model info (#8303)
* add gemini-2.0-flash-001 (#8289)

* build(model_prices_and_context_window.json): add gemini-2.0-flash-001 to model cost map

Adds new gemini model with token based pricing to model cost map

---------

Co-authored-by: kushagro <kush@orby.ai>
2025-02-05 20:49:26 -08:00
Krish Dholakia
b4e5c0de69
Improve rpm check on keys (#8301)
* fix(parallel_request_limiter.py): initial commit that solves the rpm limit check on keys

Fixes https://github.com/BerriAI/litellm/issues/6938

* fix(parallel_request_limiter.py): simpler approach - just increment RPM in the pre-call hook instead of on success (see the sketch after this entry)

* fix(parallel_request_limiter.py): pass testing

* fix: fix linting error

* fix(parallel_request_limiter.py): fix parallel request check for keys
2025-02-05 20:23:08 -08:00
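
The "simpler approach" in this entry is worth spelling out: incrementing the RPM counter in the pre-call hook counts a request at admission time, so a burst of in-flight requests cannot all read a stale count and pass the check together. A toy in-memory sketch (the real limiter sits on LiteLLM's cache layer):

```python
import time

class SimpleRPMLimiter:
    def __init__(self, rpm_limit: int):
        self.rpm_limit = rpm_limit
        self.counts: dict = {}  # (api_key, minute window) -> admitted requests

    def pre_call_hook(self, api_key: str) -> None:
        # Increment on admission, not on success: counting only successes
        # lets concurrent requests slip past the limit before any finish.
        window = (api_key, int(time.time() // 60))
        current = self.counts.get(window, 0)
        if current >= self.rpm_limit:
            raise RuntimeError(f"RPM limit of {self.rpm_limit} reached")
        self.counts[window] = current + 1

limiter = SimpleRPMLimiter(rpm_limit=2)
limiter.pre_call_hook("sk-123")
limiter.pre_call_hook("sk-123")
# a third call in the same minute would raise RuntimeError
```
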
Krish Dholakia
443ae55904
Azure OpenAI improvements - o3 native streaming, improved tool call + response format handling (#8292)
* fix(convert_dict_to_response.py): only convert if the response is the response_format tool call passed in (see the sketch after this entry)

Fixes https://github.com/BerriAI/litellm/issues/8241

* fix(gpt_transformation.py): make sure response format / tools conversion doesn't remove previous tool calls

* refactor(gpt_transformation.py): refactor out json schema conversion to base config

keeps logic consistent across providers

* fix(o_series_transformation.py): support o3 mini native streaming

Fixes https://github.com/BerriAI/litellm/issues/8274

* fix(gpt_transformation.py): remove unused variables

* test: update test
2025-02-05 19:38:58 -08:00
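
A sketch of the conversion guard described in the first bullet: when `response_format` is emulated via an injected tool, only that synthetic tool call should be folded back into `content`, and any genuine tool calls must survive the round trip. The `json_tool_call` name is assumed for illustration:

```python
RESPONSE_FORMAT_TOOL_NAME = "json_tool_call"  # assumed name of the injected tool

def fold_response_format_tool_call(message: dict) -> dict:
    # Fold back into `content` only the tool call injected to emulate
    # response_format; the model's real tool calls pass through unchanged.
    tool_calls = message.get("tool_calls") or []
    for call in tool_calls:
        if call.get("function", {}).get("name") == RESPONSE_FORMAT_TOOL_NAME:
            message["content"] = call["function"].get("arguments")
            remaining = [c for c in tool_calls if c is not call]
            message["tool_calls"] = remaining or None
            break
    return message

msg = {"tool_calls": [{"function": {"name": "json_tool_call",
                                    "arguments": '{"answer": 42}'}}]}
print(fold_response_format_tool_call(msg))
```
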
Ishaan Jaff
03f738eff6 fix test_models_by_provider 2025-02-05 19:01:00 -08:00
Ishaan Jaff
818792228c
(Refactor) - migrate bedrock invoke to BaseLLMHTTPHandler class (#8290)
* initial transform for invoke

* invoke transform_response

* working - able to make request

* working get_complete_url

* working - invoke now runs on llm_http_handler

* fix unused imports

* track litellm overhead ms

* working stream request

* sign_request transform

* sign_request update

* use has_async_custom_stream_wrapper property

* use get_async_custom_stream_wrapper in base llm http handler

* fix make_call in invoke handler

* fix invoke with streaming get_async_custom_stream_wrapper

* working bedrock async streaming with invoke

* fix make call handler for bedrock

* test_all_model_configs

* fix test_bedrock_custom_prompt_template

* sync streaming for bedrock invoke

* fix _add_stream_param_to_request_body

* test_async_text_completion_bedrock

* fix transform_request

* fix get_supported_openai_params

* fix test supports tool choice

* fix test_supports_tool_choice

* add unit test coverage for bedrock invoke transform

* fix location of transformation files

* update import loc

* fix bedrock invoke unit tests

* fix import for max completion tokens
2025-02-05 18:58:55 -08:00
Ishaan Jaff
b76b380bc8 fix add back sambanova/Qwen2.5-72B-Instruct 2025-02-05 18:44:17 -08:00
Ishaan Jaff
ffd890e744
add assembly ai cost tracking (#8298) 2025-02-05 18:43:37 -08:00
Ishaan Jaff
6cef115bb0
(Security fix) - remove code block that inserts master key hash into DB (#8268)
* remove code block upserting master key hash to db

* run test to check if key upserted into db

* run ci/cd again

* litellm_proxy_security_tests

* litellm_proxy_security_tests

* run prisma entrypoint

* ci/cd run again

* fix test master key not in db
2025-02-05 17:25:42 -08:00
Krish Dholakia
8d3a942fbd
Litellm staging (#8270)
* fix(opik.py): cleanup

* docs(opik_integration.md): cleanup opik integration docs

* fix(redact_messages.py): fix redact messages check header logic

ensures a stringified bool value in the header is still asserted to true (see the sketch after this entry)

* feat(redact_messages.py): support `x-litellm-enable-message-redaction` request header

allows dynamic message redaction
2025-02-04 22:35:48 -08:00
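
A minimal sketch of the header check this entry fixes: HTTP header values are strings, so `"true"` has to be asserted truthy rather than compared against a Python bool (helper name illustrative):

```python
def header_flag_enabled(headers: dict,
                        name: str = "x-litellm-enable-message-redaction") -> bool:
    # Header values arrive as strings ("true", "True", "1"), never as
    # Python booleans, so a naive `value is True` check always fails.
    value = headers.get(name, "")
    return str(value).strip().lower() in ("true", "1")

print(header_flag_enabled({"x-litellm-enable-message-redaction": "True"}))  # True
print(header_flag_enabled({}))                                              # False
```
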
Krish Dholakia
3c813b3a87
Fix deepseek calling - refactor to use base_llm_http_handler (#8266)
* refactor(deepseek/): move deepseek to base llm http handler

Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457

* fix(gpt_transformation.py): support stream parsing for gpt-like calls

* test(test_deepseek_completion.py): add async streaming test

* fix(gpt_transformation.py): fix import

* fix(gpt_transformation.py): return full api base and content type
2025-02-04 22:30:00 -08:00
Ishaan Jaff
51b9a02615 run ci/cd again 2025-02-04 22:19:57 -08:00
Krish Dholakia
4e34fc3bf8
[BETA] Support OIDC role based access to proxy (#8260)
* feat(proxy/_types.py): add new jwt field params

allows users + services to auth into proxy

* feat(handle_jwt.py): allow team role proxy access

allows proxy admin to set allowed team roles

* fix(proxy/_types.py): add 'routes' to role based permissions

allow the proxy admin to easily restrict which routes a team can access (see the sketch after this entry)

* feat(handle_jwt.py): support more flexible role based route access

v2 on role based 'allowed_routes'

* test(test_jwt.py): add unit test for rbac for proxy routes

* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`

* docs(token_auth.md): add documentation on controlling model access via OIDC Roles

* test: increase time delay before retrying

* test: handle model overloaded for test
2025-02-04 21:59:39 -08:00
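
A sketch of the role-based route check from this entry: resolve the caller's team role from the JWT, then glob-match the requested path against that role's `allowed_routes` (the permission shape here is an assumption, not the documented config):

```python
import fnmatch

ROLE_PERMISSIONS = {
    "proxy_admin": {"allowed_routes": ["*"]},
    "team": {"allowed_routes": ["/chat/completions", "/embeddings", "/key/*"]},
}

def is_route_allowed(role: str, route: str) -> bool:
    # fnmatch gives cheap glob matching, so "/key/*" covers
    # /key/generate, /key/delete, and so on.
    patterns = ROLE_PERMISSIONS.get(role, {}).get("allowed_routes", [])
    return any(fnmatch.fnmatch(route, pattern) for pattern in patterns)

print(is_route_allowed("team", "/chat/completions"))  # -> True
print(is_route_allowed("team", "/user/delete"))       # -> False
```
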
Krrish Dholakia
7f06b88192 fix(internal_user_endpoints.py): fix try-except for team not in db 2025-02-04 21:57:43 -08:00
Ishaan Jaff
3a6349d871
(Feat) - Add support for structured output on bedrock/nova models + add util litellm.supports_tool_choice (#8264) (usage sketch after this entry)
* fix supports_tool_choice

* TestBedrockNovaJson

* use supports_tool_choice

* fix supports_tool_choice

* add supports_tool_choice param

* script to add fields to model cost map

* test_supports_tool_choice

* test_supports_tool_choice

* fix supports tool choice check

* test_supports_tool_choice_simple_tests

* fix supports_tool_choice check

* fix supports_tool_choice bedrock

* test_supports_tool_choice

* test_supports_tool_choice

* fix bedrock/eu-west-3/mistral.mistral-large-2402-v1:0

* ci/cd run again

* test_supports_tool_choice_simple_tests

* TestGoogleAIStudioGemini temp - remove to run ci/cd

* test_aaalangfuse_logging_metadata

* TestGoogleAIStudioGemini

* test_check_provider_match

* remove add param to map
2025-02-04 21:47:16 -08:00
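
Usage of the new util is along these lines (signature assumed from the PR title; check the release docs before relying on it):

```python
import litellm

# Branch on model capability before forwarding a tool_choice param;
# the model name here is illustrative.
model = "bedrock/us.amazon.nova-pro-v1:0"
if litellm.supports_tool_choice(model=model):
    print(f"{model} accepts tool_choice")
else:
    print(f"{model} does not accept tool_choice; drop the param")
```
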
Krrish Dholakia
c743475aba build: Squashed commit of the following:
commit 3e4e2cb20a
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Tue Feb 4 15:10:34 2025 -0800

    fix(proxy_server.py): fix redirect from `/sso/key/callback` to redirect on custom server path

    Fixes https://github.com/BerriAI/litellm/issues/5997
2025-02-04 21:45:33 -08:00
Ishaan Jaff
ab134b8871 ci/cd run again 2025-02-04 21:28:13 -08:00
Ishaan Jaff
b965d9bd9a
Fix passing top_k parameter for Bedrock Anthropic models (#8131) (#8269) (see the sketch after this entry)
* Fix Bedrock Anthropic topK bug

* Remove extra import

* Add unit test + make tests mocked

* Fix camel case

* Fix tests to remove exception handling

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
2025-02-04 21:16:21 -08:00
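
The camel-case bullet above is the heart of the fix: Bedrock's request body expects camelCase inference params, so an OpenAI-style `top_k` must be renamed rather than forwarded verbatim. A sketch of that mapping (key names assumed for illustration, not the exact transform):

```python
def map_openai_params_to_bedrock(params: dict) -> dict:
    # Bedrock inference configs use camelCase (topK, topP, maxTokens);
    # the bug class here is passing snake_case top_k through unchanged.
    key_map = {"top_k": "topK", "top_p": "topP", "max_tokens": "maxTokens"}
    return {key_map.get(key, key): value for key, value in params.items()}

print(map_openai_params_to_bedrock({"top_k": 40, "temperature": 0.2}))
# -> {'topK': 40, 'temperature': 0.2}
```
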
Ishaan Jaff
d367f42887 ui new build 2025-02-04 21:12:39 -08:00
Ishaan Jaff
7e1b79d446
(Bug fix) - Langfuse / Callback settings stored in DB (#8251)
* fix _decrypt_and_set_db_env_variables

* fix proxy config

* test callbacks in DB

* test langfuse callbacks in db

* test_e2e_langfuse_callbacks_in_db

* proxy_store_model_in_db_tests

* fix proxy_store_model_in_db_tests

* proxy_store_model_in_db_tests

* fix store_model_db_config.yaml

* fix check_langfuse_request

* fix test langfuse base url

* ci/cd run again
2025-02-04 21:09:37 -08:00
Ishaan Jaff
1d5370b9e6
(feat) - track org_id in SpendLogs (#8253)
* track org id in spend logs

* read org id from team table

* show user_api_key_org_id in spend logs

* test_spend_logs_payload

* test_spend_logs_with_org_id

* test_spend_logs_with_org_id
2025-02-04 21:08:05 -08:00
Ishaan Jaff
b59b26f797
add supports_tool_choice (#8265)
2025-02-04 19:45:53 -08:00
Krish Dholakia
8d4ad47ec3
fix(prometheus.py): fix setting key budget metrics (#8234)
* fix(prometheus.py): fix setting key budget metrics

ensures custom metadata works with key budget metric

this is a patch. root cause pr is written in a separate branch

* test: fix test
2025-02-04 19:15:50 -08:00
Steve Farthing
9724ee94df Feedback 2025-02-04 21:11:19 -05:00
Ishaan Jaff
f66029470f add supports_response_schema 2025-02-04 16:59:24 -08:00
Ishaan Jaff
ff48e574d5
fix loosen httpx restriction on pip (#8255) 2025-02-04 16:10:48 -08:00
Krish Dholakia
df93debbc7
Internal User Endpoint - vulnerability fix + response type fix (#8228)
* fix(key_management_endpoints.py): fix vulnerability where a user could update another user's keys (see the sketch after this entry)

Resolves https://github.com/BerriAI/litellm/issues/8031

* test(key_management_endpoints.py): return consistent 403 forbidden error when modifying key that doesn't belong to user

* fix(internal_user_endpoints.py): return model max budget in internal user create response

Fixes https://github.com/BerriAI/litellm/issues/7047

* test: fix test

* test: update test to handle gemini token counter change

* fix(factory.py): fix bedrock http:// handling

* docs: fix typo in lm_studio.md (#8222)

* test: fix testing

* test: fix test

---------

Co-authored-by: foreign-sub <51928805+foreign-sub@users.noreply.github.com>
2025-02-04 06:41:14 -08:00
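
A sketch of the ownership guard behind the vulnerability fix and the consistent-403 test above (FastAPI-style; function and argument names are illustrative):

```python
from fastapi import HTTPException

def assert_can_modify_key(requesting_user_id: str,
                          is_admin: bool,
                          key_owner_id: str) -> None:
    # Return the same 403 whether the key exists or merely belongs to
    # someone else, so the error does not leak key ownership.
    if is_admin or requesting_user_id == key_owner_id:
        return
    raise HTTPException(status_code=403,
                        detail="Forbidden: you may only modify your own keys")
```
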
Ishaan Jaff
8fd60a420d
(Feat) - New pass through add assembly ai passthrough endpoints (#8220)
* add assembly ai pass through request

* fix assembly pass through

* fix test_assemblyai_basic_transcribe

* fix assemblyai auth check

* test_assemblyai_transcribe_with_non_admin_key

* working assembly ai test

* working assembly ai proxy route

* use helper func to pass through logging

* clean up logging assembly ai

* test: update test to handle gemini token counter change

* fix(factory.py): fix bedrock http:// handling

* add unit testing for assembly pt handler

* docs assembly ai pass through endpoint

* fix proxy_pass_through_endpoint_tests

* fix standard_passthrough_logging_object

* fix ASSEMBLYAI_API_KEY

* test test_assemblyai_proxy_route_basic_post

* test_assemblyai_proxy_route_get_transcript

* fix is_assemblyai_route

* test_is_assemblyai_route

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-03 21:54:32 -08:00
Krrish Dholakia
5b08289d88 fix(factory.py): fix bedrock http:// handling 2025-02-03 18:15:14 -08:00
Krish Dholakia
c8494abdea
test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Ishaan Jaff
ec614be6c4 ui new build
2025-02-03 08:23:44 -08:00
Krish Dholakia
e7b81f84de
build: ui updates (#8206) 2025-02-03 07:26:58 -08:00