* test_openai_assistants_e2e_operations
* test openai assistants pass through
* fix GET request on pass through handler
* _make_non_streaming_http_request
* _is_assistants_api_request
* test_openai_assistants_e2e_operations
* test_openai_assistants_e2e_operations
* openai_proxy_route
* docs openai pass through
* docs openai pass through
* docs openai pass through
* test pass through handler
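The OpenAI pass-through flow above can be exercised roughly like this (a hedged sketch: the proxy URL, the `/openai` mount path, and the key are illustrative assumptions, not confirmed by these commits):

```python
# Minimal sketch of hitting the OpenAI Assistants pass-through via the proxy.
import requests

PROXY_BASE = "http://localhost:4000"                   # hypothetical proxy address
headers = {"Authorization": "Bearer sk-litellm-key"}   # hypothetical virtual key

# Create an assistant; the handler forwards the request to the upstream OpenAI API
resp = requests.post(
    f"{PROXY_BASE}/openai/v1/assistants",
    headers=headers,
    json={"model": "gpt-4o", "name": "demo-assistant"},
)
print(resp.json())

# GET requests are forwarded too (see the GET fix above)
resp = requests.get(f"{PROXY_BASE}/openai/v1/assistants", headers=headers)
print(resp.json())
```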
* Potential fix for code scanning alert no. 2240: Incomplete URL substring sanitization
---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* feat(bedrock/rerank): infer model region if model given as arn
* test: add unit testing to ensure bedrock region name inferred from arn on rerank
* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result
Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137
* test(test_bedrock_completion.py): add testing for bedrock cohere rerank
* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking
* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map
* fix(cost_calculator.py): use bedrock/common_utils.py
get base model from a model given as an ARN -> handles rerank models
* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing
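A hedged sketch of the bedrock rerank flow these commits describe; the ARN, query, and documents are placeholders, and the region is meant to be inferred from the ARN:

```python
# Illustrative only -- model ARN and inputs are made up.
import litellm

response = litellm.rerank(
    model="bedrock/arn:aws:bedrock:us-west-2::foundation-model/amazon.rerank-v1:0",
    query="What is the capital of France?",
    documents=["Paris is the capital of France.", "Berlin is in Germany."],
    top_n=1,
)
# Search units returned by bedrock feed the new rerank cost tracking
print(response.results)
```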
* feat(bedrock/rerank): migrate bedrock config to basererank config
* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"
This reverts commit 84fae1f167.
* test: add testing to ensure large doc / queries are correctly counted
* Revert "test: add testing to ensure large doc / queries are correctly counted"
This reverts commit 4337f1657e.
* fix(migrate-jina-ai-to-rerank-config): enables cost tracking
* refactor(jina_ai/): finish migrating jina ai to base rerank config
enables cost tracking
* fix(jina_ai/rerank): e2e jina ai rerank cost tracking
* fix: cleanup dead code
* fix: fix python3.8 compatibility error
* test: fix test
* test: add e2e testing for azure ai rerank
* fix: fix linting error
* test: mark cohere as flaky
* feat(litellm_pre_call_utils.py): support `x-litellm-tags` request header
allow tag based routing + spend tracking via request headers
* docs(request_headers.md): document new `x-litellm-tags` for tag based routing and spend tracking
* docs(tag_routing.md): add to docs
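A minimal sketch of tag-based routing via the new header; the proxy URL, key, and tag values are illustrative assumptions:

```python
# Send `x-litellm-tags` on every request for tag-based routing + spend tracking.
import openai

client = openai.OpenAI(
    base_url="http://localhost:4000",                    # hypothetical LiteLLM proxy
    api_key="sk-litellm-key",                            # hypothetical virtual key
    default_headers={"x-litellm-tags": "teamA,prod"},    # made-up tag values
)
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
```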
* fix(utils.py): only pass str values for openai metadata param
* fix(utils.py): drop non-str values for metadata param to openai
preview feature; an OTEL span was being sent in
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a supported azure param
Closes https://github.com/BerriAI/litellm/issues/8500
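A hedged sketch of passing the newly supported `prediction` param through to Azure; the deployment name and content are placeholders:

```python
# Predicted-outputs style request (OpenAI `prediction` shape), routed to Azure.
import litellm

code = "def add(a, b):\n    return a + b\n"
resp = litellm.completion(
    model="azure/my-gpt-4o-deployment",                  # hypothetical deployment
    messages=[{"role": "user", "content": f"Rename add to sum_two:\n{code}"}],
    prediction={"type": "content", "content": code},
)
```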
* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model
* style: cleanup invalid json trailing comma
* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template
enables passing complete tokenizer config of model to litellm
Allows calling deepseek on bedrock with the correct prompt template
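A minimal sketch, assuming `tokenizer_config` is accepted as a keyword and mirrors the model's Hugging Face tokenizer_config.json (model name and chat_template below are placeholders, and the template is truncated):

```python
# Register a complete tokenizer config so litellm applies the right prompt template.
import litellm

litellm.register_prompt_template(
    model="bedrock/my-imported-deepseek",  # hypothetical custom model name
    tokenizer_config={
        "bos_token": "<s>",
        "eos_token": "</s>",
        "chat_template": "{% for message in messages %}...{% endfor %}",  # truncated
    },
)
```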
* fix(utils.py): fix register_prompt_template for custom model names
* test(test_prompt_factory.py): fix test
* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model
* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls
enables proxy admin to set the base model for a fine-tuned bedrock deepseek model
* feat(bedrock/invoke): support deepseek_r1 route for bedrock
makes it easy to apply the right chat template to that call
* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work
* test(test_completion.py): add e2e mock test for bedrock deepseek
* docs(bedrock.md): document new deepseek_r1 route for bedrock
allows us to use the right config
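Calling the new route could look like this (a hedged sketch: the imported-model ARN is a placeholder, and the exact model-string format is an assumption based on the commits above):

```python
# The deepseek_r1 invoke route applies the stored R1 chat template automatically.
import litellm

resp = litellm.completion(
    model="bedrock/deepseek_r1/arn:aws:bedrock:us-east-1:000000000000:imported-model/example",
    messages=[{"role": "user", "content": "Explain quicksort."}],
)
```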
* fix(exception_mapping_utils.py): catch read operation timeout
* fix(client_initialization_utils.py): handle custom llm provider set with valid value not from model name
* fix(handle_jwt.py): handle groups not existing in jwt token
if the user is not in a group, this claim won't exist
* fix(handle_jwt.py): add new `enforce_team_based_model_access` flag to jwt auth
allows proxy admin to enforce user can only call model if team has access
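Conceptually, `enforce_team_based_model_access` enforces a check along these lines (a simplified sketch, not the actual handle_jwt.py code; the team shape is illustrative):

```python
# Simplified sketch only -- not LiteLLM's implementation.
# A user may call a model only if one of their teams grants access to it.
def team_grants_model(user_teams, requested_model):
    return any(requested_model in team.get("models", []) for team in user_teams)

teams = [{"team_id": "t1", "models": ["gpt-4o"]}]
assert team_grants_model(teams, "gpt-4o")
assert not team_grants_model(teams, "claude-3-5-sonnet")
```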
* feat(navbar.tsx): expose new dropdown in navbar - allow org admin to create teams within org context
* fix(navbar.tsx): remove non-functional CogIcon
* fix(proxy/utils.py): include user-org memberships in `/user/info` response
return orgs user is a member of and the user role within org
* feat(organization_endpoints.py): allow internal user to query `/organizations/list` and get all orgs they belong to
enables org admin to select org they belong to, to create teams
* fix(navbar.tsx): show change in ui when org switcher clicked
* feat(page.tsx): update user role based on org they're in
allows org admin to create teams in the org context
* feat(teams.tsx): working e2e flow for allowing org admin to add new teams
* style(navbar.tsx): clarify switching orgs on UI is in BETA
* fix(organization_endpoints.py): handle getting but not setting members
* test: fix test
* fix(client_initialization_utils.py): revert custom llm provider handling fix - causing unintended issues
* docs(token_auth.md): cleanup docs
* feat(handle_jwt.py): initial commit to allow scope based model access
* feat(handle_jwt.py): allow model access based on token scopes
allow admin to control model access from IDP
* test(test_jwt.py): add unit testing for scope based model access
* docs(token_auth.md): add scope based model access to docs
* docs(token_auth.md): update docs
* docs(token_auth.md): update docs
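Roughly, scope-based model access maps token scopes to models; a conceptual sketch only (not the actual handle_jwt.py logic, and the scope naming convention is made up):

```python
# Conceptual sketch; the scope-to-model mapping format is an assumption.
def models_from_scopes(scope_claim, scope_model_map):
    allowed = set()
    for scope in scope_claim.split():
        allowed.update(scope_model_map.get(scope, []))
    return allowed

scope_map = {"litellm.api.gpt_4o": ["gpt-4o"]}
print(models_from_scopes("openid profile litellm.api.gpt_4o", scope_map))  # {'gpt-4o'}
```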
* build: add gemini commercial rate limits
* fix: fix linting error
* fix(utils.py): handle key error in msg validation
* Support running Aim Guard during LLM call (#7918)
* support running Aim Guard during LLM call
* Rename header
* adjust docs and fix type annotations
* fix(timeout.md): doc fix for openai example on dynamic timeouts
---------
Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
* Added a guide for users who want to use LiteLLM with the AI/ML API provider.
* Minor changes
* Minor changes
* Fix sidebars.js
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* feat(proxy/_types.py): add new jwt field params
allows users + services to auth into proxy
* feat(handle_jwt.py): allow team role proxy access
allows proxy admin to set allowed team roles
* fix(proxy/_types.py): add 'routes' to role based permissions
allow proxy admin to restrict what routes a team can access easily
* feat(handle_jwt.py): support more flexible role based route access
v2 on role based 'allowed_routes'
* test(test_jwt.py): add unit test for rbac for proxy routes
* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`
* docs(token_auth.md): add documentation on controlling model access via OIDC Roles
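The role-based route access can be pictured like this (a conceptual sketch under an assumed config shape, not the proxy's actual code):

```python
# Conceptual sketch; the role-to-routes mapping is an illustrative assumption.
import fnmatch

ROLE_ROUTES = {"proxy_admin": ["*"], "team": ["/chat/completions", "/key/*"]}

def route_allowed(role, route):
    return any(fnmatch.fnmatch(route, pat) for pat in ROLE_ROUTES.get(role, []))

assert route_allowed("team", "/key/generate")
assert not route_allowed("team", "/user/delete")
```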
* test: increase time delay before retrying
* test: handle model overloaded for test
* add assembly ai pass through request
* fix assembly pass through
* fix test_assemblyai_basic_transcribe
* fix assemblyai auth check
* test_assemblyai_transcribe_with_non_admin_key
* working assembly ai test
* working assembly ai proxy route
* use helper func to pass through logging
* clean up logging assembly ai
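The AssemblyAI pass-through can be exercised roughly as follows (a hedged sketch: the proxy URL, the `/assemblyai` mount path, and the key are illustrative assumptions):

```python
# Submit a transcription job through the proxy, then fetch the transcript.
import requests

PROXY_BASE = "http://localhost:4000"                   # hypothetical proxy address
headers = {"Authorization": "Bearer sk-litellm-key"}   # hypothetical virtual key

resp = requests.post(
    f"{PROXY_BASE}/assemblyai/v2/transcript",
    headers=headers,
    json={"audio_url": "https://example.com/audio.mp3"},
)
transcript_id = resp.json().get("id")

# The GET path exercised by test_assemblyai_proxy_route_get_transcript
resp = requests.get(f"{PROXY_BASE}/assemblyai/v2/transcript/{transcript_id}", headers=headers)
```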
* test: update test to handle gemini token counter change
* fix(factory.py): fix bedrock http:// handling
* add unit testing for assembly pt handler
* docs assembly ai pass through endpoint
* fix proxy_pass_through_endpoint_tests
* fix standard_passthrough_logging_object
* fix ASSEMBLYAI_API_KEY
* test test_assemblyai_proxy_route_basic_post
* test_assemblyai_proxy_route_get_transcript
* fix is_assemblyai_route
* test_is_assemblyai_route
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* test(base_llm_unit_tests.py): add test to ensure drop params is respected
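A short sketch of the behavior under test: with `drop_params=True`, OpenAI params the provider doesn't support are dropped instead of raising (model and param choice here are illustrative):

```python
# Unsupported params are silently dropped rather than erroring.
import litellm

resp = litellm.completion(
    model="ollama/llama3",                   # hypothetical provider/model
    messages=[{"role": "user", "content": "hi"}],
    parallel_tool_calls=True,                # not supported by every provider
    drop_params=True,
)
```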
* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility
* build: add cherry picked commits
* Refresh VoyageAI models and prices and context
* Refresh VoyageAI models and prices and context
* Refresh VoyageAI models and prices and context
* Updating the available VoyageAI models in the docs
* fix: support azure o3 model family for fake streaming workaround (#8162)
* fix: support azure o3 model family for fake streaming workaround
* refactor: rename helper to is_o_series_model for clarity
* update function calling parameters for o3 models (#8178)
* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3
ensures max_tokens is correctly translated for o3
* feat(openai/): refactor o1 files to be 'o_series' files
expands naming to cover o3
* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets
* test(test_azure_o_series.py): assert stream faked for azure o3 mini
Resolves https://github.com/BerriAI/litellm/pull/8162
* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing
* docs(azure.md): update doc with `o_series/` model name
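A hedged sketch of the o-series behavior above: `stream=True` against an Azure o3 deployment still yields chunks, even though streaming is faked under the hood (the deployment name is a placeholder):

```python
# Streaming is faked for Azure o-series models per the fix above.
import litellm

for chunk in litellm.completion(
    model="azure/o3-mini",                   # hypothetical deployment name
    messages=[{"role": "user", "content": "hello"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```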
---------
Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
* docs(token_auth.md): clarify title
* refactor(handle_jwt.py): add jwt auth manager + refactor to handle groups
allows user to call model if user belongs to group with model access
* refactor(handle_jwt.py): refactor to first check if service call then check user call
* feat(handle_jwt.py): new `enforce_team_access` param
only allows user to call model if a team they belong to has model access
allows controlling user model access by team
* fix(handle_jwt.py): fix error string, remove unnecessary param
* docs(token_auth.md): add controlling model access for jwt tokens via teams to docs
* test: fix tests post refactor
* fix: fix linting errors
* fix: fix linting error
* test: fix import error
* add support for using llama spec with bedrock
* fix get_bedrock_invoke_provider
* add support for using bedrock provider in mappings
* working request
* test_bedrock_custom_deepseek
* test_bedrock_custom_deepseek
* fix _get_model_id_for_llama_like_model
* test_bedrock_custom_deepseek
* doc DeepSeek-R1-Distill-Llama-70B
* test_bedrock_custom_deepseek
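Calling an imported DeepSeek distill model via the llama spec described above might look like this (a hedged sketch: the ARN is a placeholder and the `bedrock/llama/...` model-string format is an assumption from these commits):

```python
# Route an imported DeepSeek-R1-Distill-Llama model through the bedrock llama spec.
import litellm

resp = litellm.completion(
    model="bedrock/llama/arn:aws:bedrock:us-east-1:000000000000:imported-model/example",
    messages=[{"role": "user", "content": "hi"}],
)
```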