* feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header
allow tag-based routing + spend tracking via request headers
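A minimal client-side sketch (the proxy URL/key and the comma-separated tag format are assumptions):

```python
import openai

# Placeholder proxy URL and key
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # Tags the proxy reads for routing + spend tracking
    extra_headers={"x-litellm-tags": "teamA,prod"},
)
print(response)
```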
* docs(request_headers.md): document new `x-litellm-tags` header for tag-based routing and spend tracking
* docs(tag_routing.md): add to docs
* fix(utils.py): only pass str values for openai metadata param
* fix(utils.py): drop non-str values for metadata param to openai
metadata is a preview feature; an otel span object was being sent in as a value
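The fix is essentially a filter over the metadata dict before it is forwarded; a rough sketch (the helper name is hypothetical):

```python
def _filter_openai_metadata(metadata: dict) -> dict:
    # OpenAI's preview `metadata` param only accepts string values, so
    # drop anything else (e.g. an otel span object) before sending.
    return {k: v for k, v in metadata.items() if isinstance(v, str)}
```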
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a supported azure param
Closes https://github.com/BerriAI/litellm/issues/8500
* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model
* style: cleanup invalid json trailing comma
* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template
enables passing complete tokenizer config of model to litellm
Allows calling deepseek on bedrock with the correct prompt template
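A hedged sketch of the call, assuming the config mirrors the model's Hugging Face tokenizer_config.json (model name is a placeholder, chat_template truncated):

```python
import litellm

litellm.register_prompt_template(
    model="bedrock/invoke/my-deepseek-ft-model",  # placeholder model name
    tokenizer_config={
        "bos_token": "<｜begin▁of▁sentence｜>",
        "eos_token": "<｜end▁of▁sentence｜>",
        "chat_template": "{% for message in messages %}...{% endfor %}",
    },
)
```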
* fix(utils.py): fix register_prompt_template for custom model names
* test(test_prompt_factory.py): fix test
* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model
* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls
enables proxy admin to set base model for ft bedrock deepseek model
* feat(bedrock/invoke): support deepseek_r1 route for bedrock
makes it easy to apply the right chat template to that call
* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work
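Illustrative usage of the route, assuming an imported-model ARN (all values are placeholders):

```python
from litellm import completion

# The deepseek_r1 route applies the stored R1 chat template before
# hitting Bedrock's invoke API.
response = completion(
    model="bedrock/deepseek_r1/arn:aws:bedrock:us-east-1:000000000000:imported-model/abc123",
    messages=[{"role": "user", "content": "Hello"}],
)
```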
* test(test_completion.py): add e2e mock test for bedrock deepseek
* docs(bedrock.md): document new deepseek_r1 route for bedrock
allows us to use the right config
* fix(exception_mapping_utils.py): catch read operation timeout
* fix(client_initialization_utils.py): handle custom llm provider set to a valid value that isn't derived from the model name
* fix(handle_jwt.py): handle groups not existing in jwt token
if the user is not in any group, this claim won't exist in the token
* fix(handle_jwt.py): add new `enforce_team_based_model_access` flag to jwt auth
allows the proxy admin to enforce that a user can only call a model if their team has access
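The enforcement check is roughly this shape (illustrative only; the `models` attribute and helper name are hypothetical):

```python
def team_has_model_access(requested_model: str, user_teams: list) -> bool:
    # With enforce_team_based_model_access on, a JWT user may only call
    # a model if a team they belong to lists it (or has a wildcard).
    for team in user_teams:
        if "*" in team.models or requested_model in team.models:
            return True
    return False
```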
* feat(navbar.tsx): expose new dropdown in navbar - allow org admin to create teams within org context
* fix(navbar.tsx): remove non-functional cogicon
* fix(proxy/utils.py): include user-org memberships in `/user/info` response
return the orgs the user is a member of and the user's role within each org
* feat(organization_endpoints.py): allow internal user to query `/organizations/list` and get all orgs they belong to
enables org admins to select an org they belong to when creating teams
* fix(navbar.tsx): show change in ui when org switcher clicked
* feat(page.tsx): update user role based on org they're in
allows org admin to create teams in the org context
* feat(teams.tsx): working e2e flow for allowing org admin to add new teams
* style(navbar.tsx): clarify switching orgs on UI is in BETA
* fix(organization_endpoints.py): handle getting but not setting members
* test: fix test
* fix(client_initialization_utils.py): revert custom llm provider handling fix - causing unintended issues
* docs(token_auth.md): cleanup docs
* feat(handle_jwt.py): initial commit to allow scope based model access
* feat(handle_jwt.py): allow model access based on token scopes
allow admin to control model access from IDP
* test(test_jwt.py): add unit testing for scope based model access
* docs(token_auth.md): add scope based model access to docs
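Conceptually the check maps token scopes to callable models; a rough sketch (scope names and the mapping are hypothetical):

```python
# Hypothetical scope -> model mapping configured by the proxy admin
SCOPE_TO_MODELS = {
    "litellm.api.gpt_4o": ["gpt-4o"],
    "litellm.api.all": ["*"],
}

def model_allowed_by_scopes(scope_claim: str, requested_model: str) -> bool:
    # OAuth2 `scope` claims are space-delimited
    for scope in scope_claim.split(" "):
        allowed = SCOPE_TO_MODELS.get(scope, [])
        if "*" in allowed or requested_model in allowed:
            return True
    return False
```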
* docs(token_auth.md): update docs
* docs(token_auth.md): update docs
* build: add gemini commercial rate limits
* fix: fix linting error
* fix(utils.py): handle KeyError in msg validation
* Support running Aim Guard during LLM call (#7918)
* support running Aim Guard during LLM call
* Rename header
* adjust docs and fix type annotations
* fix(timeout.md): doc fix for openai example on dynamic timeouts
---------
Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
* Added a guide for users who want to use LiteLLM with AI/ML API.
* Minor changes
* Minor changes
* Fix sidebars.js
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* feat(proxy/_types.py): add new jwt field params
allows users + services to auth into proxy
* feat(handle_jwt.py): allow team role proxy access
allows proxy admin to set allowed team roles
* fix(proxy/_types.py): add 'routes' to role based permissions
allow proxy admin to easily restrict what routes a team can access
* feat(handle_jwt.py): support more flexible role based route access
v2 on role based 'allowed_routes'
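The v2 check reduces to a role -> routes lookup; a minimal sketch (the permission table is hypothetical):

```python
# Hypothetical role -> allowed-routes table set by the proxy admin
ROLE_PERMISSIONS = {
    "proxy_admin": ["*"],
    "team": ["/chat/completions", "/embeddings"],
}

def route_allowed(role: str, route: str) -> bool:
    allowed = ROLE_PERMISSIONS.get(role, [])
    return "*" in allowed or route in allowed
```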
* test(test_jwt.py): add unit test for rbac for proxy routes
* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`
* docs(token_auth.md): add documentation on controlling model access via OIDC Roles
* test: increase time delay before retrying
* test: handle model overloaded for test
* add assembly ai pass through request
* fix assembly pass through
* fix test_assemblyai_basic_transcribe
* fix assemblyai auth check
* test_assemblyai_transcribe_with_non_admin_key
* working assembly ai test
* working assembly ai proxy route
* use helper func for pass-through logging
* clean up logging assembly ai
* test: update test to handle gemini token counter change
* fix(factory.py): fix bedrock http:// handling
* add unit testing for assembly ai pass-through handler
* docs assembly ai pass through endpoint
* fix proxy_pass_through_endpoint_tests
* fix standard_passthrough_logging_object
* fix ASSEMBLYAI_API_KEY
* test test_assemblyai_proxy_route_basic_post
* test_assemblyai_proxy_route_get_transcript
* fix is_assemblyai_route
* test_is_assemblyai_route
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* test(base_llm_unit_tests.py): add test to ensure drop params is respected
* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility
* build: add cherry picked commits
* Refresh VoyageAI models and prices and context
* Refresh VoyageAI models and prices and context
* Refresh VoyageAI models and prices and context
* Updating the available VoyageAI models in the docs
* fix: support azure o3 model family for fake streaming workaround (#8162)
* fix: support azure o3 model family for fake streaming workaround
* refactor: rename helper to is_o_series_model for clarity
* update function calling parameters for o3 models (#8178)
* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3
ensures max_tokens is correctly translated for o3
* feat(openai/): refactor o1 files to be 'o_series' files
expands naming to cover o3
* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets
* test(test_azure_o_series.py): assert stream faked for azure o3 mini
Resolves https://github.com/BerriAI/litellm/pull/8162
* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing
* docs(azure.md): update doc with `o_series/` model name
---------
Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
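A hedged example of the new `azure/o_series/` route (deployment name and endpoint are placeholders):

```python
import os
from litellm import completion

response = completion(
    model="azure/o_series/my-o3-mini-deployment",  # placeholder deployment
    api_base="https://my-endpoint.openai.azure.com",  # placeholder endpoint
    api_key=os.environ["AZURE_API_KEY"],
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,  # faked for o-series models without native streaming
)
for chunk in response:
    print(chunk)
```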
* docs(token_auth.md): clarify title
* refactor(handle_jwt.py): add jwt auth manager + refactor to handle groups
allows a user to call a model if they belong to a group with model access
* refactor(handle_jwt.py): refactor to first check if service call then check user call
* feat(handle_jwt.py): new `enforce_team_access` param
only allows user to call model if a team they belong to has model access
allows controlling user model access by team
* fix(handle_jwt.py): fix error string, remove unnecessary param
* docs(token_auth.md): add controlling model access for jwt tokens via teams to docs
* test: fix tests post refactor
* fix: fix linting errors
* fix: fix linting error
* test: fix import error
* add support for using llama spec with bedrock
* fix get_bedrock_invoke_provider
* add support for using bedrock provider in mappings
* working request
* test_bedrock_custom_deepseek
* test_bedrock_custom_deepseek
* fix _get_model_id_for_llama_like_model
* test_bedrock_custom_deepseek
* doc DeepSeek-R1-Distill-Llama-70B
* test_bedrock_custom_deepseek
* Litellm dev 01 29 2025 p4 (#8107)
* fix(key_management_endpoints.py): always get db team
Fixes https://github.com/BerriAI/litellm/issues/7983
* test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks
* test: fix test
* test: skip gemini thinking
* Litellm dev 01 29 2025 p3 (#8106)
* fix(__init__.py): reduces size of __init__.py and scope for errors by using the correct param
* refactor(__init__.py): refactor init by cleaning up redundant params
* refactor(__init__.py): move more constants into constants.py
cleanup root
* refactor(__init__.py): more cleanup
* feat(__init__.py): expose new 'disable_hf_tokenizer_download' param
enables hf model usage in offline env
* docs(config_settings.md): document new disable_hf_tokenizer_download param
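Usage is a single module-level flag (sketch):

```python
import litellm

# For offline environments: never download tokenizers from Hugging Face;
# LiteLLM falls back to its default tokenizer instead.
litellm.disable_hf_tokenizer_download = True
```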
* fix: fix linting error
* fix: fix unsafe comparison
* test: fix test
* docs(public_teams.md): add doc showing how to expose public teams for users to join
* docs: add beta disclaimer on public teams
* test: update tests
* feat(lowest_tpm_rpm_v2.py): fix redis cache check to use >= instead of >
makes it consistent
* test(test_custom_guardrails.py): add more unit testing on default-on guardrails
ensure they run even if the user-sent guardrail list is empty
* docs(quick_start.md): clarify that default-on guardrails run even if the user's guardrail list contains other guardrails
* refactor(litellm_logging.py): refactor no-log to helper util
allows for more consistent behavior
* feat(litellm_logging.py): add event hook to verbose logs
* fix(litellm_logging.py): add unit testing to ensure `litellm.disable_no_log_param` is respected
* docs(logging.md): document how to disable 'no-log' param
* test: fix test to handle feb
* test: cleanup old bedrock model
* fix: fix router check
* docs: cleanup doc
* feat(bedrock/): initial commit adding bedrock/converse_like/<model> route support
allows routing to a converse-like endpoint
Resolves https://github.com/BerriAI/litellm/issues/8085
* feat(bedrock/chat/converse_transformation.py): make converse config base config compatible
enables new 'converse_like' route
* feat(converse_transformation.py): enables using the proxy with a converse-like api endpoint
Resolves https://github.com/BerriAI/litellm/issues/8085
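Illustrative call, with a placeholder endpoint that speaks the Converse API shape:

```python
from litellm import completion

response = completion(
    model="bedrock/converse_like/some-model",  # placeholder model name
    api_base="https://my-converse-compatible-endpoint/converse",  # placeholder
    api_key="sk-placeholder",
    messages=[{"role": "user", "content": "Hello"}],
)
```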
* docs(reliability.md): add doc on disabling fallbacks per request
* feat(litellm_pre_call_utils.py): support reading request timeout from request headers - new `x-litellm-timeout` param
Allows setting dynamic model timeouts from vercel's AI sdk
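A minimal client-side sketch (proxy URL/key are placeholders; the header value is assumed to be seconds):

```python
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    # Per-request timeout the proxy reads from the header
    extra_headers={"x-litellm-timeout": "30"},
)
```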
* test(test_proxy_server.py): add simple unit test for reading request timeout
* test(test_fallbacks.py): add e2e test to confirm timeout passed in request headers is correctly read
* feat(main.py): support passing metadata to openai in preview
Resolves https://github.com/BerriAI/litellm/issues/6022#issuecomment-2616119371
* fix(main.py): fix passing openai metadata
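A hedged usage sketch (assumes the behavior is gated behind LiteLLM's preview-features flag):

```python
import litellm
from litellm import completion

litellm.enable_preview_features = True  # assumed gate for this preview behavior

response = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    metadata={"user_id": "123"},  # string values only (see fix above)
)
```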
* docs(request_headers.md): document new request headers
* build: Merge branch 'main' into litellm_dev_01_27_2025_p3
* test: loosen test
* refactor(factory.py): refactor async bedrock message transformation to use async get request for image url conversion
improve latency of bedrock call
* test(test_bedrock_completion.py): add unit testing to ensure async image url get called for async bedrock call
* refactor(factory.py): refactor bedrock translation to use BedrockImageProcessor
reduces duplicate code
* fix(factory.py): fix bug not allowing pdf's to be processed
* fix(factory.py): fix bedrock converse document understanding with image url
* docs(bedrock.md): clarify all bedrock document types are supported
* refactor: cleanup redundant test + unused imports
* perf: improve perf with reusable clients
* test: fix test
* feat(handle_jwt.py): initial commit adding custom RBAC support on jwt auth
allows admin to define user role field and allowed roles which map to 'internal_user' on litellm
* fix(auth_checks.py): ensure user allowed to access model, when calling via personal keys
Fixes https://github.com/BerriAI/litellm/issues/8029
* feat(handle_jwt.py): support role based access with model permission control on proxy
Allows admin to just grant users roles on IDP (e.g. Azure AD/Keycloak) and user can immediately start calling models
* docs(rbac): add docs on rbac for model access control
make it clear how admin can use roles to control model access on proxy
* fix: fix linting errors
* test(test_user_api_key_auth.py): add unit testing to ensure rbac role is correctly enforced
* test(test_user_api_key_auth.py): add more testing
* test(test_users.py): add unit testing to ensure user model access is always checked for new keys
Resolves https://github.com/BerriAI/litellm/issues/8029
* test: fix unit test
* fix(dot_notation_indexing.py): fix typing to work with python 3.8