* feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header
allow tag based routing + spend tracking via request headers
* docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking
* docs(tag_routing.md): add to docs
* fix(utils.py): only pass str values for openai metadata param
* fix(utils.py): drop non-str values for metadata param to openai
preview-feature, otel span was being sent in
* ui - use common team dropdown component
* re-use team component
* rename org field on add model
* handle add model submit
* working view model_id and team_id on root models page
* cleaner
* show all fields
* working model info view
* working team info selector
* clean up team id
* new component for model dashboard
* ui show table with dropdown
* make public model names like email
* revert changes to litellm model name
* fix litellm model name
* ui fix public model
* fix mappings
* fix conditional text input
* fix message
* ui fix bulk add models
* _add_team_model_to_db
* move model mgmt helper funcs
* test_add_team_model_to_db
* ui - display model team model name
* fix add model tab
* fix remove redundant info tab on models page
* dont pass model mappings all the way through
* fix jarring model name when adding team models
* fix edit model button
* delete button on model info
* ui fix model dashboard
* fix DeploymentTypedDict
* _is_model_access_group_for_wildcard_route
* test _get_public_model_name
* ui fix viewing public model name
* fix linting error
* fix linting errors
* fix selectedModel logic
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a support azure param
Closes https://github.com/BerriAI/litellm/issues/8500
* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model
* style: cleanup invalid json trailing commma
* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template
enables passing complete tokenizer config of model to litellm
Allows calling deepseek on bedrock with the correct prompt template
* fix(utils.py): fix register_prompt_template for custom model names
* test(test_prompt_factory.py): fix test
* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model
* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls
enables proxy admin to set base model for ft bedrock deepseek model
* feat(bedrock/invoke): support deepseek_r1 route for bedrock
makes it easy to apply the right chat template to that call
* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work
* test(test_completion.py): add e2e mock test for bedrock deepseek
* docs(bedrock.md): document new deepseek_r1 route for bedrock
allows us to use the right config
* fix(exception_mapping_utils.py): catch read operation timeout
* fix(router.py): add more deployment timeout debug information for timeout errors
help understand why some calls in high-traffic don't respect their model-specific timeouts
* test(test_convert_dict_to_response.py): unit test ensuring empty str is not converted to None
Addresses https://github.com/BerriAI/litellm/issues/8507
* fix(convert_dict_to_response.py): handle empty message str - don't return back as 'None'
Fixes https://github.com/BerriAI/litellm/issues/8507
* test(test_completion.py): add e2e test
* fix(client_initialization_utils.py): handle custom llm provider set with valid value not from model name
* fix(handle_jwt.py): handle groups not existing in jwt token
if user not in group, this won't exist
* fix(handle_jwt.py): add new `enforce_team_based_model_access` flag to jwt auth
allows proxy admin to enforce user can only call model if team has access
* feat(navbar.tsx): expose new dropdown in navbar - allow org admin to create teams within org context
* fix(navbar.tsx): remove non-functional cogicon
* fix(proxy/utils.py): include user-org memberships in `/user/info` response
return orgs user is a member of and the user role within org
* feat(organization_endpoints.py): allow internal user to query `/organizations/list` and get all orgs they belong to
enables org admin to select org they belong to, to create teams
* fix(navbar.tsx): show change in ui when org switcher clicked
* feat(page.tsx): update user role based on org they're in
allows org admin to create teams in the org context
* feat(teams.tsx): working e2e flow for allowing org admin to add new teams
* style(navbar.tsx): clarify switching orgs on UI is in BETA
* fix(organization_endpoints.py): handle getting but not setting members
* test: fix test
* fix(client_initialization_utils.py): revert custom llm provider handling fix - causing unintended issues
* docs(token_auth.md): cleanup docs
* fix(parallel_request_limiter.py): add back parallel request information to max parallel request limiter
Resolves https://github.com/BerriAI/litellm/issues/8392
* test: mark flaky test to handle time based tracking issues
* feat(model_management_endpoints.py): expose new patch `/model/{model_id}/update` endpoint
Allows updating specific values of a model in db - makes it easy for admin to know this by calling it a PA
TCH
* feat(edit_model_modal.tsx): allow user to update llm provider + api key on the ui
* fix: fix linting error
* fix(utils.py): handle key error in msg validation
* Support running Aim Guard during LLM call (#7918)
* support running Aim Guard during LLM call
* Rename header
* adjust docs and fix type annotations
* fix(timeout.md): doc fix for openai example on dynamic timeouts
---------
Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
* initial transform for invoke
* invoke transform_response
* working - able to make request
* working get_complete_url
* working - invoke now runs on llm_http_handler
* fix unused imports
* track litellm overhead ms
* working stream request
* sign_request transform
* sign_request update
* use has_async_custom_stream_wrapper property
* use get_async_custom_stream_wrapper in base llm http handler
* fix make_call in invoke handler
* fix invoke with streaming get_async_custom_stream_wrapper
* working bedrock async streaming with invoke
* fix make call handler for bedrock
* test_all_model_configs
* fix test_bedrock_custom_prompt_template
* sync streaming for bedrock invoke
* fix _add_stream_param_to_request_body
* test_async_text_completion_bedrock
* fix transform_request
* fix get_supported_openai_params
* fix test supports tool choice
* fix test_supports_tool_choice
* add unit test coverage for bedrock invoke transform
* fix location of transformation files
* update import loc
* fix bedrock invoke unit tests
* fix import for max completion tokens
* test(base_llm_unit_tests.py): add test to ensure drop params is respected
* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility
* build: add cherry picked commits
* fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls
* test: add base test for `http:` url's
* fix(factory.py/get_image_details): follow redirects
allows http calls to work
* fix(codestral/): fix stream chunk parsing on last chunk of stream
* Azure ad token provider (#6917)
* Update azure.py
Added optional parameter azure ad token provider
* Added parameter to main.py
* Found token provider arg location
* Fixed embeddings
* Fixed ad token provider
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* fix: fix linting errors
* fix(main.py): leave out o1 route for azure ad token provider, for now
get v0 out for sync azure gpt route to begin with
* test: skip http:// test for fireworks ai
model does not support it
* refactor: cleanup dead code
* fix: revert http:// url passthrough for gemini
google ai studio raises errors
* test: fix test
---------
Co-authored-by: bahtman <anton@baht.dk>
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param
Closes https://github.com/BerriAI/litellm/issues/8182
* fix(main.py): ensure `reasoning_effort` is a mapped openai param
* refactor(azure/): rename o1_[x] files to o_series_[x]
* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort
* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests
* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers
Makes it easy to switch from openai to anthropic
* fix: fix linting errors
* fix(base_llm_unit_tests.py): fix test
* fix(main.py): add missing param
* Add O3-Mini for Azure and Remove Vision Support (#8161)
* Azure Released O3-mini at the same time as OAI, so i've added support here. Confirmed to work with Sweden Central.
* [FIX] replace cgi for python 3.13 with email.Message as suggested in PEP 594 (#8160)
* Update model_prices_and_context_window.json (#8120)
codestral2501 pricing on vertex_ai
* Fix/db view names (#8119)
* Fix to case sensitive DB Views name
* Fix to case sensitive DB View names
* Added quotes to check query as well
* Added quotes to create view query
* test: handle server error for flaky test
vertex ai has unstable endpoints
---------
Co-authored-by: Wanis Elabbar <70503629+elabbarw@users.noreply.github.com>
Co-authored-by: Honghua Dong <dhh1995@163.com>
Co-authored-by: superpoussin22 <vincent.nadal@orange.fr>
Co-authored-by: Miguel Armenta <37154380+ma-armenta@users.noreply.github.com>
* Litellm dev 01 29 2025 p4 (#8107)
* fix(key_management_endpoints.py): always get db team
Fixes https://github.com/BerriAI/litellm/issues/7983
* test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks
* test: fix test
* test: skip gemini thinking
* Litellm dev 01 29 2025 p3 (#8106)
* fix(__init__.py): reduces size of __init__.py and reduces scope for errors by using correct param
* refactor(__init__.py): refactor init by cleaning up redundant params
* refactor(__init__.py): move more constants into constants.py
cleanup root
* refactor(__init__.py): more cleanup
* feat(__init__.py): expose new 'disable_hf_tokenizer_download' param
enables hf model usage in offline env
* docs(config_settings.md): document new disable_hf_tokenizer_download param
* fix: fix linting error
* fix: fix unsafe comparison
* test: fix test
* docs(public_teams.md): add doc showing how to expose public teams for users to join
* docs: add beta disclaimer on public teams
* test: update tests
* feat(lowest_tpm_rpm_v2.py): fix redis cache check to use >= instead of >
makes it consistent
* test(test_custom_guardrails.py): add more unit testing on default on guardrails
ensure it runs if user sent guardrail list is empty
* docs(quick_start.md): clarify default on guardrails run even if user guardrails list contains other guardrails
* refactor(litellm_logging.py): refactor no-log to helper util
allows for more consistent behavior
* feat(litellm_logging.py): add event hook to verbose logs
* fix(litellm_logging.py): add unit testing to ensure `litellm.disable_no_log_param` is respected
* docs(logging.md): document how to disable 'no-log' param
* test: fix test to handle feb
* test: cleanup old bedrock model
* fix: fix router check
* fix handleSubmit
* update handleAddModelSubmit
* add jest testing for ui
* add step for running ui unit tests
* add validate json step to add model
* ui jest testing fixes
* update package lock
* ci/cd run again
* fix antd import
* run jest tests first
* fix antd install
* fix ui unit tests
* fix unit test ui
* feat(databricks/chat/transformation.py): add tools and 'tool_choice' param support
Closes https://github.com/BerriAI/litellm/issues/7788
* refactor: cleanup redundant file
* test: mark flaky test
* test: mark all parallel request tests as flaky
* docs: cleanup doc
* feat(bedrock/): initial commit adding bedrock/converse_like/<model> route support
allows routing to a converse like endpoint
Resolves https://github.com/BerriAI/litellm/issues/8085
* feat(bedrock/chat/converse_transformation.py): make converse config base config compatible
enables new 'converse_like' route
* feat(converse_transformation.py): enables using the proxy with converse like api endpoint
Resolves https://github.com/BerriAI/litellm/issues/8085
* refactor _add_callbacks_from_db_config
* fix check for _custom_logger_exists_in_litellm_callbacks
* move loc of test utils
* run ci/cd again
* test_add_custom_logger_callback_to_specific_event_with_duplicates_callbacks
* fix _custom_logger_class_exists_in_success_callbacks
* unit testing for test_add_callbacks_from_db_config
* test_custom_logger_exists_in_callbacks_individual_functions
* fix config.yml
* fix test test_stream_chunk_builder_openai_audio_output_usage - use direct dict comparison
* ui 1 - show correct msg on no logs
* fix dup country col
* backend - allow filtering by team_id and api_key
* fix ui_view_spend_logs
* ui update query params
* working team id and key hash filters
* fix filter ref - don't hold on them as they are
* fix _model_custom_llm_provider_matches_wildcard_pattern
* fix test test_stream_chunk_builder_openai_audio_output_usage - use direct dict comparison
* docs(reliability.md): add doc on disabling fallbacks per request
* feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param
Allows setting dynamic model timeouts from vercel's AI sdk
* test(test_proxy_server.py): add simple unit test for reading request timeout
* test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read
* feat(main.py): support passing metadata to openai in preview
Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371
* fix(main.py): fix passing openai metadata
* docs(request_headers.md): document new request headers
* build: Merge branch 'main' into litellm_dev_01_27_2025_p3
* test: loosen test
* test(test_completion_cost.py): add unit testing to ensure all bedrock models with region name have cost tracked
* feat: initial script to get bedrock pricing from amazon api
ensures bedrock pricing is accurate
* build(model_prices_and_context_window.json): correct bedrock model prices based on api check
ensures accurate bedrock pricing
* ci(config.yml): add bedrock pricing check to ci/cd
ensures litellm always maintains up-to-date pricing for bedrock models
* ci(config.yml): add beautiful soup to ci/cd
* test: bump groq model
* test: fix test
* refactor(factory.py): refactor async bedrock message transformation to use async get request for image url conversion
improve latency of bedrock call
* test(test_bedrock_completion.py): add unit testing to ensure async image url get called for async bedrock call
* refactor(factory.py): refactor bedrock translation to use BedrockImageProcessor
reduces duplicate code
* fix(factory.py): fix bug not allowing pdf's to be processed
* fix(factory.py): fix bedrock converse document understanding with image url
* docs(bedrock.md): clarify all bedrock document types are supported
* refactor: cleanup redundant test + unused imports
* perf: improve perf with reusable clients
* test: fix test
* fix(utils.py): handle failed hf tokenizer request during calls
prevents proxy from failing due to bad hf tokenizer calls
* fix(utils.py): convert failure callback str to custom logger class
Fixes https://github.com/BerriAI/litellm/issues/8013
* test(test_utils.py): fix test - avoid adding mlflow dep on ci/cd
* fix: add missing env vars to test
* test: cleanup redundant test
* feat(main.py): use asyncio.sleep for mock_Timeout=true on async request
adds unit testing to ensure proxy does not fail if specific Openai requests hang (e.g. recent o1 outage)
* fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming
Fixes https://github.com/BerriAI/litellm/issues/7942
* Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming"
This reverts commit 7a052a64e3.
* fix(deepseek-r-1): return reasoning_content as a top-level param
ensures compatibility with existing tools that use it
* fix: fix linting error
* fix(utils.py): initial commit fixing custom cost tracking
refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly
* fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None
some providers support features like vision across all models
* fix(utils.py): refactor to use _supports_factory
* test: update testing
* fix: fix linting errors
* test: fix testing
* fix(base_utils.py): supported nested json schema passed in for anthropic calls
* refactor(base_utils.py): refactor ref parsing to prevent infinite loop
* test(test_openai_endpoints.py): refactor anthropic test to use bedrock
* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls
Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757