* fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix
ensures the wildcard route `vertex_ai/gemini-*` returns only known `vertex_ai/gemini-` models
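A minimal sketch of the prefix filtering this describes (the real helper lives in model_checks.py and may differ in shape):

```python
def get_known_models_from_wildcard(wildcard_model: str, known_models: list[str]) -> list[str]:
    # "vertex_ai/gemini-*" -> prefix "vertex_ai/gemini-"
    prefix = wildcard_model.rstrip("*")
    return [m for m in known_models if m.startswith(prefix)]
```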
* test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper
* test(test_models.py): add e2e testing for `/model_group/info` endpoint
* feat(prometheus.py): support tracking total requests by user_email on prometheus
adds initial support for tracking total requests by user_email
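Illustrative sketch of the tracking, assuming a standard prometheus_client counter; metric and label names here are assumptions, not litellm's exact ones:

```python
from typing import Optional

from prometheus_client import Counter

proxy_total_requests = Counter(
    "litellm_proxy_total_requests",
    "Total requests made to the proxy",
    labelnames=["user_email", "status_code"],
)

def track_request(user_email: Optional[str], status_code: int) -> None:
    # always record a label value, even when the email is unknown
    proxy_total_requests.labels(
        user_email=user_email or "None",
        status_code=str(status_code),
    ).inc()
```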
* test(test_prometheus.py): add testing to ensure user email is always tracked
* test: update testing for new prometheus metric
* test(test_prometheus_unit_tests.py): add user email to total proxy metric
* test: update tests
* test: fix spend tests
* test: fix test
* fix(pagerduty.py): fix linting error
* update team info endpoint
* clean up model alias
* fix model alias
* fix model alias card
* clean up naming on docs
* fix model alias card
* fix _model_in_team_aliases
* team alias - fix litellm.model_alias_map
* fix _update_model_if_team_alias_exists
* fix test_aview_spend_per_user
* Test model alias functionality with teams:
* complete e2e test
* test_update_model_if_team_alias_exists
* fix key_model_access_denied
* test_can_key_call_model_with_aliases
* fix test_aview_spend_per_user
* fix(litellm_logging.py): support saving applied guardrails in logging object
allows the list of applied guardrails to be logged for the proxy admin's visibility
* feat(spend_tracking_utils.py): log applied guardrails to spend logs
makes it easy for the admin to know which guardrails were applied to a request
* ci(config.yml): uninstall posthog from ci/cd
* test: fix tests
* test: update test
* Fixed issue #8246 (#8250)
* Fixed issue #8246
* Added unit tests for discard() and for remove_callback_from_list_by_object()
* fix(openai.py): support dynamic passing of organization param to openai
handles scenario where client-side org id is passed to openai
---------
Co-authored-by: Erez Hadad <erezh@il.ibm.com>
* fix(client_initialization_utils.py): handle custom llm provider set with valid value not from model name
* fix(handle_jwt.py): handle groups not existing in jwt token
if the user is not in any group, this claim won't exist in the token
* fix(handle_jwt.py): add new `enforce_team_based_model_access` flag to jwt auth
allows the proxy admin to enforce that a user can only call a model if their team has access
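A hedged sketch of the enforcement, with names assumed rather than taken from handle_jwt.py:

```python
from fastapi import HTTPException

def check_team_model_access(
    model: str,
    team_models: list[str],
    enforce_team_based_model_access: bool,
) -> None:
    # only enforce when the proxy admin enables the flag
    if not enforce_team_based_model_access:
        return
    if team_models and model not in team_models:
        raise HTTPException(
            status_code=403,
            detail=f"Team does not have access to model={model}",
        )
```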
* feat(navbar.tsx): expose new dropdown in navbar - allow org admin to create teams within org context
* fix(navbar.tsx): remove non-functional cogicon
* fix(proxy/utils.py): include user-org memberships in `/user/info` response
returns the orgs the user is a member of and the user's role within each org
* feat(organization_endpoints.py): allow internal user to query `/organizations/list` and get all orgs they belong to
enables an org admin to select an org they belong to when creating teams
* fix(navbar.tsx): show change in ui when org switcher clicked
* feat(page.tsx): update user role based on org they're in
allows org admin to create teams in the org context
* feat(teams.tsx): working e2e flow for allowing org admin to add new teams
* style(navbar.tsx): clarify switching orgs on UI is in BETA
* fix(organization_endpoints.py): handle getting but not setting members
* test: fix test
* fix(client_initialization_utils.py): revert custom llm provider handling fix - causing unintended issues
* docs(token_auth.md): cleanup docs
* fix(parallel_request_limiter.py): add back parallel request information to max parallel request limiter
Resolves https://github.com/BerriAI/litellm/issues/8392
* test: mark flaky test to handle time based tracking issues
* feat(model_management_endpoints.py): expose new patch `/model/{model_id}/update` endpoint
Allows updating specific values of a model in the db - calling it a PATCH makes the partial-update semantics clear to the admin
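Illustrative client call against the new endpoint; the payload fields and auth token are assumptions:

```python
import requests

resp = requests.patch(
    "http://localhost:4000/model/my-model-id/update",
    headers={"Authorization": "Bearer sk-1234"},
    # only the fields being changed are sent - the point of a PATCH
    json={"litellm_params": {"api_key": "new-api-key"}},
)
resp.raise_for_status()
```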
* feat(edit_model_modal.tsx): allow user to update llm provider + api key on the ui
* fix: fix linting error
* add back streaming for base o3 (#8361)
* test(base_llm_unit_tests.py): add base test for o-series models - ensure streaming always works
* fix(base_llm_unit_tests.py): fix test for o series models
* refactor: move test
---------
Co-authored-by: Matteo Boschini <12133566+mbosc@users.noreply.github.com>
* fix(caching_routes.py): mask redis password on `/cache/ping` route
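A minimal sketch of masking a password inside a redis URL before it is returned from `/cache/ping`; not the exact implementation:

```python
import re

def mask_redis_url(url: str) -> str:
    # redis://user:secret@host:6379 -> redis://user:********@host:6379
    return re.sub(r"(//[^:/@]*:)([^@]+)(@)", r"\1********\3", url)
```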
* fix(caching_routes.py): fix linting error
* fix(caching_routes.py): fix linting error on caching routes
* fix: fix test - ignore mask_dict - has a breakpoint
* fix(azure.py): add timeout param + elapsed time in azure timeout error
* fix(http_handler.py): add elapsed time to http timeout request
makes it easier to debug how long the request took before failing
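Sketch of attaching elapsed time to a timeout error (the exact wording and wrapper differ in http_handler.py):

```python
import time

import httpx

async def post_with_elapsed(client: httpx.AsyncClient, url: str, json: dict):
    start = time.time()
    try:
        return await client.post(url, json=json)
    except httpx.TimeoutException as e:
        elapsed = time.time() - start
        # surface how long we waited before the timeout fired
        raise httpx.TimeoutException(
            f"Request timed out after {elapsed:.2f}s"
        ) from e
```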
* fix(azure.py): ensure max_retries=0 is respected
Fixes https://github.com/BerriAI/litellm/issues/6129
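A hedged usage sketch of what the fix guarantees - `max_retries=0` should be forwarded to the underlying SDK so no client-side retries happen (exact plumbing may differ):

```python
import litellm

response = litellm.completion(
    model="azure/gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    max_retries=0,  # with the fix, respected instead of falling back to a default
)
```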
* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0
* test(test_azure_openai.py): add unit testing for azure_text/ route
* fix(azure.py): fix passing max retries on streaming
* fix(azure.py): fix azure max retries on async completion + streaming
* fix(completion/handler.py): fix azure text async completion + streaming
* test(test_azure_openai.py): ensure azure openai max retries always respected
* test(test_azure_o_series.py): add testing to ensure max retries always respected
* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)
* Update model_prices_and_context_window.json
added gemini providers for 2.0-flash and 2.0-flash lite
* Update model_prices_and_context_window.json
fixed URL
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Convert tool use arguments to string before counting tokens (#6989)
In at least some cases, `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. To tokenize it properly it needs to be a string; if it is already a string, the conversion is a no-op, which is also fine.
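A minimal sketch of the conversion described above (helper name assumed):

```python
import json

def ensure_arguments_are_string(tool_call: dict) -> dict:
    args = tool_call["function"]["arguments"]
    if isinstance(args, dict):
        # dict -> JSON string so the tokenizer sees text; no-op if already a string
        tool_call["function"]["arguments"] = json.dumps(args)
    return tool_call
```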
* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing
* build(model_prices_and_context_window.json): add gemini commercial rate limits
* fix(utils.py): fix linting error
* refactor(utils.py): refactor to maintain function size
---------
Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
* feat(handle_jwt.py): initial commit to allow scope based model access
* feat(handle_jwt.py): allow model access based on token scopes
allows the admin to control model access from the IdP
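A hedged sketch of mapping JWT scopes to callable models; the claim name and mapping format are assumptions, not litellm's exact config:

```python
SCOPE_TO_MODELS = {
    "litellm.gpt-4": ["gpt-4"],
    "litellm.claude": ["claude-3-5-sonnet"],
}

def models_allowed_by_scopes(jwt_claims: dict) -> set[str]:
    # OAuth-style space-delimited scope string, e.g. "litellm.gpt-4 litellm.claude"
    scopes = jwt_claims.get("scope", "").split(" ")
    allowed: set[str] = set()
    for scope in scopes:
        allowed.update(SCOPE_TO_MODELS.get(scope, []))
    return allowed
```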
* test(test_jwt.py): add unit testing for scope based model access
* docs(token_auth.md): add scope based model access to docs
* docs(token_auth.md): update docs
* docs(token_auth.md): update docs
* build: add gemini commercial rate limits
* fix: fix linting error
* add initial test for assembly ai
* start using PassthroughEndpointRouter
* migrate to llm passthrough endpoints
* add assembly ai as a known provider
* fix PassthroughEndpointRouter
* fix set_pass_through_credentials
* working EU request to assembly ai pass through endpoint
* add e2e test assembly
* test_assemblyai_routes_with_bad_api_key
* clean up pass through endpoint router
* e2e testing for assembly ai pass through
* test assembly ai e2e testing
* delete assembly ai models
* fix code quality
* ui working assembly ai api base flow
* fix install assembly ai
* update model call details with kwargs for pass through logging
* fix tracking assembly ai model in response
* _handle_assemblyai_passthrough_logging
* fix test_initialize_deployment_for_pass_through_unsupported_provider
* TestPassthroughEndpointRouter
* _get_assembly_transcript
* fix assembly ai pt logging tests
* fix assemblyai_proxy_route
* fix _get_assembly_region_from_url
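A speculative sketch of the region detection used to route EU vs US AssemblyAI passthrough traffic; the hostname check is an assumption:

```python
def get_assembly_region_from_url(url: str) -> str:
    # EU requests go through AssemblyAI's EU endpoint, everything else defaults to US
    return "eu" if "eu.assemblyai.com" in url else "us"
```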
* fix(utils.py): handle key error in msg validation
* Support running Aim Guard during LLM call (#7918)
* support running Aim Guard during LLM call
* Rename header
* adjust docs and fix type annotations
* fix(timeout.md): doc fix for openai example on dynamic timeouts
---------
Co-authored-by: Tomer Bin <117278227+hxtomer@users.noreply.github.com>
* refactor _get_langfuse_input_output_content
* test_langfuse_logging_completion_with_malformed_llm_response
* fix _get_langfuse_input_output_content
* fixes for langfuse linting
* unit testing for get chat/text content for langfuse
* fix _should_raise_content_policy_error
* fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in
Fixes https://github.com/BerriAI/litellm/issues/8241
* fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls
* refactor(gpt_transformation.py): refactor out json schema conversion to base config
keeps logic consistent across providers
* fix(o_series_transformation.py): support o3 mini native streaming
Fixes https://github.com/BerriAI/litellm/issues/8274
* fix(gpt_transformation.py): remove unused variables
* test: update test
* initial transform for invoke
* invoke transform_response
* working - able to make request
* working get_complete_url
* working - invoke now runs on llm_http_handler
* fix unused imports
* track litellm overhead ms
* working stream request
* sign_request transform
* sign_request update
* use has_async_custom_stream_wrapper property
* use get_async_custom_stream_wrapper in base llm http handler
* fix make_call in invoke handler
* fix invoke with streaming get_async_custom_stream_wrapper
* working bedrock async streaming with invoke
* fix make call handler for bedrock
* test_all_model_configs
* fix test_bedrock_custom_prompt_template
* sync streaming for bedrock invoke
* fix _add_stream_param_to_request_body
* test_async_text_completion_bedrock
* fix transform_request
* fix get_supported_openai_params
* fix test supports tool choice
* fix test_supports_tool_choice
* add unit test coverage for bedrock invoke transform
* fix location of transformation files
* update import loc
* fix bedrock invoke unit tests
* fix import for max completion tokens
* remove code block upserting master key hash to db
* run test to check if key upserted into db
* run ci/cd again
* litellm_proxy_security_tests
* litellm_proxy_security_tests
* run prisma entrypoint
* ci/cd run again
* fix test master key not in db
* refactor(deepseek/): move deepseek to base llm http handler
Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457
* fix(gpt_transformation.py): support stream parsing for gpt-like calls
* test(test_deepseek_completion.py): add async streaming test
* fix(gpt_transformation.py): fix import
* fix(gpt_transformation.py): return full api base and content type
* feat(proxy/_types.py): add new jwt field params
allows users + services to auth into proxy
* feat(handle_jwt.py): allow team role proxy access
allows proxy admin to set allowed team roles
* fix(proxy/_types.py): add 'routes' to role based permissions
allows the proxy admin to easily restrict which routes a team can access
* feat(handle_jwt.py): support more flexible role based route access
v2 on role based 'allowed_routes'
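Illustrative shape of the role-based route check; the mapping and wildcard matching are assumptions, not the exact handle_jwt.py logic:

```python
import fnmatch

ROLE_ALLOWED_ROUTES = {
    "proxy_admin": ["*"],
    "team": ["/chat/completions", "/embeddings", "/key/*"],
}

def role_can_access_route(role: str, route: str) -> bool:
    patterns = ROLE_ALLOWED_ROUTES.get(role, [])
    return any(fnmatch.fnmatch(route, p) for p in patterns)
```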
* test(test_jwt.py): add unit test for rbac for proxy routes
* feat(handle_jwt.py): ensure cost tracking always works for any jwt request with `enforce_rbac=True`
* docs(token_auth.md): add documentation on controlling model access via OIDC Roles
* test: increase time delay before retrying
* test: handle model overloaded for test
* fix _decrypt_and_set_db_env_variables
* fix proxy config
* test callbacks in DB
* test langfuse callbacks in db
* test_e2e_langfuse_callbacks_in_db
* proxy_store_model_in_db_tests
* fix proxy_store_model_in_db_tests
* proxy_store_model_in_db_tests
* fix store_model_db_config.yaml
* fix check_langfuse_request
* fix test langfuse base url
* ci/cd run again
* track org id in spend logs
* read org id from team table
* show user_api_key_org_id in spend logs
* test_spend_logs_payload
* test_spend_logs_with_org_id
* test_spend_logs_with_org_id
* fix(prometheus.py): fix setting key budget metrics
ensures custom metadata works with the key budget metric
this is a patch - the root cause PR is on a separate branch
* test: fix test
* fix(key_management_endpoints.py): fix vulnerability where a user could update another user's keys
Resolves https://github.com/BerriAI/litellm/issues/8031
* test(key_management_endpoints.py): return consistent 403 forbidden error when modifying key that doesn't belong to user
* fix(internal_user_endpoints.py): return model max budget in internal user create response
Fixes https://github.com/BerriAI/litellm/issues/7047
* test: fix test
* test: update test to handle gemini token counter change
* fix(factory.py): fix bedrock http:// handling
* docs: fix typo in lm_studio.md (#8222)
* test: fix testing
* test: fix test
---------
Co-authored-by: foreign-sub <51928805+foreign-sub@users.noreply.github.com>
* add assembly ai pass through request
* fix assembly pass through
* fix test_assemblyai_basic_transcribe
* fix assemblyai auth check
* test_assemblyai_transcribe_with_non_admin_key
* working assembly ai test
* working assembly ai proxy route
* use helper func to pass through logging
* clean up logging assembly ai
* test: update test to handle gemini token counter change
* fix(factory.py): fix bedrock http:// handling
* add unit testing for assembly pt handler
* docs assembly ai pass through endpoint
* fix proxy_pass_through_endpoint_tests
* fix standard_passthrough_logging_object
* fix ASSEMBLYAI_API_KEY
* test test_assemblyai_proxy_route_basic_post
* test_assemblyai_proxy_route_get_transcript
* fix is_assemblyai_route
* test_is_assemblyai_route
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* test(base_llm_unit_tests.py): add test to ensure drop params is respected
* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility
* build: add cherry picked commits
* fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls
* test: add base test for `http:` url's
* fix(factory.py/get_image_details): follow redirects
allows http calls to work
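Sketch of the redirect-following fetch this enables, using httpx (client usage assumed, not the exact get_image_details code):

```python
import httpx

def get_image_bytes(url: str) -> bytes:
    # follow redirects so plain http:// image URLs that 301/302 still resolve
    resp = httpx.get(url, follow_redirects=True)
    resp.raise_for_status()
    return resp.content
```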
* fix(codestral/): fix stream chunk parsing on last chunk of stream
* Azure ad token provider (#6917)
* Update azure.py
Added optional parameter azure ad token provider
* Added parameter to main.py
* Found token provider arg location
* Fixed embeddings
* Fixed ad token provider
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* fix: fix linting errors
* fix(main.py): leave out o1 route for azure ad token provider, for now
get v0 out for sync azure gpt route to begin with
* test: skip http:// test for fireworks ai
model does not support it
* refactor: cleanup dead code
* fix: revert http:// url passthrough for gemini
google ai studio raises errors
* test: fix test
---------
Co-authored-by: bahtman <anton@baht.dk>
* fix(ui_sso.py): use common `get_user_object` logic across jwt + ui sso auth
Allows finding users by their email, and attaching the sso user id to the user if found
* Improve Team Management flow on UI (#8204)
* build(teams.tsx): refactor teams page to make it easier to add members to a team
makes a row in the table clickable -> allows the user to add members to the team they intended
* build(teams.tsx): make it clear user should click on team id to view team details
simplifies team management by putting team details on separate page
* build(team_info.tsx): separately show user id and user email
make it easy for user to understand the information they're seeing
* build(team_info.tsx): add back in 'add member' button
* build(team_info.tsx): working team member update on team_info.tsx
* build(team_info.tsx): enable team member delete on ui
allow user to delete accidental adds
* build(internal_user_endpoints.py): expose new endpoint for ui to allow filtering on user table
allows proxy admin to quickly find user they're looking for
* feat(team_endpoints.py): expose new team filter endpoint for ui
allows proxy admin to easily find team they're looking for
* feat(user_search_modal.tsx): allow admin to filter on users when adding new user to teams
* test: mark flaky test
* test: mark flaky test
* fix(exception_mapping_utils.py): fix anthropic text route error
* fix(ui_sso.py): handle situation when user not in db