* feat: prioritize api_key over tenant_id for Azure AD token provider (#8318)
* fix: prioritize api_key over tenant_id for Azure AD token provider
* test: Add test for Azure AD token provider in router
* fix: fix linting error
---------
Co-authored-by: you-n-g <you-n-g@users.noreply.github.com>
* feat(batches/): fix batch cost calculation - ensure it's accurate
use the correct cost value - previously defaulted to non-batch cost
* feat(batch_utils.py): log batch models to spend logs + standard logging payload
makes it easy to understand how cost was calculated
* fix: fix stored payload for test
* test: fix test
* feat(key_management_endpoints.py): adding support for rotating master key
* feat(key_management_endpoints.py): support decryption-re-encryption of models in db, when master key rotated
* fix(user_api_key_auth.py): raise 'valid token is None' error earlier
enables easier debugging with api key hash in error message
* feat(key_management_endpoints.py): rotate any env vars
* fix(key_management_endpoints.py): uncomment check
* fix: fix linting error
* fix(team_endpoints.py): ensure 404 raised when team not found
* fix(key_management_endpoints.py): fix adding tags to key when metadata is empty
* fix(key_management_endpoints.py): refactor set metadata field to use common function across keys + teams
reduces scope for errors + easier testing
* fix: fix linting error
* fix(invoke_handler.py): fix converse streaming - return signature + ensure consistency with anthropic api response
* build(model_prices_and_context_window.json): fix anthropic api claude-3-7 max output tokens
with beta header this is 128k
Resolves https://github.com/BerriAI/litellm/issues/8964
* feat(handler.py): handle new anthropic 'thinking_delta' block on streaming
Fixes https://github.com/BerriAI/litellm/issues/8825
* fix(transformation.py): support a 'format' parameter for images
allow user to specify mime type
* fix: pass mimetype via 'format' param
* feat(gemini/chat/transformation.py): support 'format' param for gemini
* fix(factory.py): support 'format' param on sync bedrock converse calls
* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls
* refactor(factory.py): move to supporting 'format' param in base helper
ensures consistency in param support
* feat(gpt_transformation.py): filter out 'format' param
don't send invalid param to openai
* fix(gpt_transformation.py): fix translation
* fix: fix translation error
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* test: update test to enforce signature found
* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic
* fix: fix linting error
---------
Co-authored-by: Martin Krasser <krasserm@googlemail.com>
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'
* feat(batches/): ensure batches logs are written to db
makes batches response dict compatible
* fix(cost_calculator.py): handle batch response being a dictionary
* fix(batches/main.py): modify retrieve endpoints to use @client decorator
enables logging to work on retrieve call
* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible
* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type
create batch and retrieve batch share the same id
* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted
* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline
ensures cost is always logged to db
* fix: fix linting errors
* fix: fix linting error
* fix(common_utils.py): handle $id in response schema when calling vertex ai
Fixes issue where `$id` present in response_schema was not accepted by vertex ai
* test(test_vertex.py): add unit test to ensure $id stripped out of vertex schema
* fix(route_llm_request.py): move to using common router, even for client-side credentials
ensures fallbacks / cooldown logic still works
* test(test_route_llm_request.py): add unit test for route request
* feat(router.py): generate unique model id when clientside credential passed in
Prevents cooldowns for api key 1 from impacting api key 2
* test(test_router.py): update testing to ensure original litellm params not mutated
* fix(router.py): upsert clientside call into llm router model list
enables cooldown logic to work accurately
* fix: fix linting error
* test(test_router_utils.py): add direct test for new util on router
* fix(streaming_handler.py): fix deepseek reasoning content streaming
Fixes https://github.com/BerriAI/litellm/issues/8939
* test(test_streaming_handler.py): add unit test for streaming handler's 'is_chunk_non_empty' function
ensures 'reasoning_content' is handled correctly
* fix(proxy_server.py): fix setting router redis cache, if cache enabled on litellm_settings
enables configurations like namespace to just work
* fix(redis_cache.py): fix key for async increment, to use the set namespace
prevents collisions if redis instance shared across environments
* fix load tests on litellm release notes
* fix caching on main branch (#8858)
* fix(streaming_handler.py): fix is delta empty check to handle empty str
* fix(streaming_handler.py): fix delta chunk on final response
* [Bug]: Deepseek error on proxy after upgrading to 1.61.13-stable (#8860)
* fix deepseek error
* test_deepseek_provider_async_completion
* fix get_complete_url
* bump: version 1.61.17 → 1.61.18
* bump: version 1.61.18 → 1.61.19
* vertex ai anthropic thinking param support (#8853)
* fix(vertex_llm_base.py): handle credentials passed in as dictionary
* fix(router.py): support vertex credentials as json dict
* test(test_vertex.py): allows easier testing
mock anthropic thinking response for vertex ai
* test(vertex_ai_partner_models/): don't remove "@" from model
breaks anthropic cost calculation
* test: move testing
* fix: fix linting error
* fix: fix linting error
* fix(vertex_ai_partner_models/main.py): split @ for codestral model
* test: fix test
* fix: fix stripping "@" on mistral models
* fix: fix test
* test: fix test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
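The `$id` fix for vertex ai response schemas listed above can be illustrated with a minimal sketch. The helper name `strip_schema_key` is hypothetical and is not taken from common_utils.py; it only shows the general idea of recursively dropping a key before the schema is sent. Note that this naive version would also drop a property that is literally named `$id`.

```python
from typing import Any


def strip_schema_key(schema: Any, key: str = "$id") -> Any:
    """Return a copy of `schema` with every occurrence of `key` removed.

    Hypothetical helper for illustration; the input is left unmodified.
    """
    if isinstance(schema, dict):
        # Rebuild the dict without the unwanted key, recursing into values.
        return {k: strip_schema_key(v, key) for k, v in schema.items() if k != key}
    if isinstance(schema, list):
        # Lists appear under keywords such as anyOf / items.
        return [strip_schema_key(item, key) for item in schema]
    # Scalars (str, int, bool, None) are returned as-is.
    return schema


response_schema = {
    "$id": "https://example.com/person.schema.json",
    "type": "object",
    "properties": {"name": {"$id": "#name", "type": "string"}},
}
cleaned = strip_schema_key(response_schema)
```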
* ui fix leftnav, allow internal users to view their own logs
* pass user_id in uiSpendLogs call
* ui filter logs for internal user
* fix internal users page
* ui show correct message when store prompts is disabled
* fix internal user logs
* test_ui_view_spend_logs_with_user_id
* test spend management endpoint
* fix(create_user_button.tsx): allow admin to set models user has access to, on invite
Enables controlling model access on invite
* feat(auth_checks.py): enforce 'no-model-access' special model name on backend
prevent user from calling models if default key has no model access
* fix(chat_ui.tsx): allow user to input custom model
* fix(chat_ui.tsx): pull available models based on models key has access to
* style(create_user_button.tsx): move default model inside 'personal key creation' accordion
* fix(chat_ui.tsx): fix linting error
* test(test_auth_checks.py): add unit-test for special model name
* docs(internal_user_endpoints.py): update docstring
* fix test_moderations_bad_model
* Litellm dev 02 27 2025 p6 (#8891)
* fix(http_parsing_utils.py): orjson can throw errors on some emojis in text, default to json.loads
* fix(sagemaker/handler.py): support passing model id on async streaming
* fix(litellm_pre_call_utils.py): Fixes https://github.com/BerriAI/litellm/issues/7237
* Fix calling claude via invoke route + response_format support for claude on invoke route (#8908)
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route
move to using anthropic config as base
* fix(utils.py): expose anthropic config via providerconfigmanager
* fix(llm_http_handler.py): support json mode on async completion calls
* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke
* fix(anthropic/): handle `response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config
Prevents error when passing in `response_format: {"type": "text"}`
* test: fix test
* fix(utils.py): fix base invoke provider check
* fix(anthropic_claude3_transformation.py): don't pass 'stream' param
* fix: fix linting errors
* fix(converse_transformation.py): handle response_format type=text for converse
* converse_transformation: pass 'description' if set in response_format (#8907)
* test(test_bedrock_completion.py): e2e test ensuring tool description is passed in
* fix(converse_transformation.py): pass description, if set
* fix(transformation.py): Fixes https://github.com/BerriAI/litellm/issues/8767#issuecomment-2689887663
* Fix bedrock passing `response_format: {"type": "text"}` (#8900)
* fix(converse_transformation.py): ignore 'type: text' value in response_format
no-op for bedrock
* fix(converse_transformation.py): handle adding response format value to tools
* fix(base_invoke_transformation.py): fix 'get_bedrock_invoke_provider' to handle cross-region-inferencing models
* test(test_bedrock_completion.py): add unit testing for bedrock invoke provider logic
* test: update test
* fix(exception_mapping_utils.py): add context window exceeded error handling for databricks provider route
* fix(fireworks_ai/): support passing tools + response_format together
* fix: cleanup
* fix(base_invoke_transformation.py): fix imports
* (Feat) - Show Error Logs on LiteLLM UI (#8904)
* fix test_moderations_bad_model
* use async_post_call_failure_hook
* basic logging errors in DB
* show status on ui
* show status on ui
* ui show request / response side by side
* stash fixes
* working, track raw request
* track error info in metadata
* fix showing error / request / response logs
* show traceback on error viewer
* ui with traceback of error
* fix async_post_call_failure_hook
* fix(http_parsing_utils.py): orjson can throw errors on some emojis in text, default to json.loads
* test_get_error_information
* fix code quality
* rename proxy track cost callback test
* _should_store_errors_in_spend_logs
* feature flag error logs
* Revert "_should_store_errors_in_spend_logs"
This reverts commit 7f345df477.
* Revert "feature flag error logs"
This reverts commit 0e90c022bb.
* test_spend_logs_payload
* fix OTEL log_db_metrics
* fix import json
* fix ui linting error
* test_async_post_call_failure_hook
* test_chat_completion_bad_model_with_spend_logs
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* ui new build
* test_chat_completion_bad_model_with_spend_logs
* docs(release_cycle.md): document release cycle
* bump: version 1.62.0 → 1.62.1
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
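The orjson fallback mentioned in the http_parsing_utils.py fix can be sketched roughly as follows. The function name `parse_json_body` is hypothetical and this is not the code from the commit. An unpaired surrogate escape is used as the failing input because the stdlib parser accepts it and a strict parser rejects it; whether that is the exact case the fix targeted is an assumption.

```python
import json
from typing import Any

try:
    import orjson  # optional fast parser; the sketch still runs without it
except ImportError:
    orjson = None


def parse_json_body(raw: bytes) -> Any:
    """Parse a request body, preferring orjson and falling back to json.loads.

    Hypothetical helper for illustration only.
    """
    if orjson is not None:
        try:
            return orjson.loads(raw)
        except ValueError:
            # orjson.JSONDecodeError subclasses ValueError; retry with stdlib.
            pass
    return json.loads(raw)


# An unpaired surrogate escape: accepted by json.loads, rejected by strict parsers.
body = parse_json_body(b'{"text": "\\ud83d"}')
```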
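The `response_format: {"type": "text"}` handling described in the bedrock and anthropic fixes above is listed as a no-op. A minimal sketch of that idea, assuming the request params are a plain dict and using the hypothetical name `drop_text_response_format` (not a function from the repository):

```python
from typing import Any, Dict


def drop_text_response_format(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `params` without response_format when its type is 'text'.

    Hypothetical helper: plain text is the default output, so the param
    carries no information for the provider and can be omitted.
    """
    response_format = params.get("response_format")
    if isinstance(response_format, dict) and response_format.get("type") == "text":
        return {k: v for k, v in params.items() if k != "response_format"}
    # Any other response_format (e.g. json_object) is passed through untouched.
    return dict(params)


request = {"temperature": 0.2, "response_format": {"type": "text"}}
provider_params = drop_text_response_format(request)
```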