* build(schema.prisma): add new `sso_user_id` to LiteLLM_UserTable
easier way to store sso id for existing user
Allows existing user added to team, to login via SSO
* test(test_auth_checks.py): add unit testing for fuzzy user object get
* fix(handle_jwt.py): fix merge conflicts
* docs(token_auth.md): clarify title
* refactor(handle_jwt.py): add jwt auth manager + refactor to handle groups
allows user to call model if user belongs to group with model access
* refactor(handle_jwt.py): refactor to first check if service call then check user call
* feat(handle_jwt.py): new `enforce_team_access` param
only allows user to call model if a team they belong to has model access
allows controlling user model access by team
* fix(handle_jwt.py): fix error string, remove unecessary param
* docs(token_auth.md): add controlling model access for jwt tokens via teams to docs
* test: fix tests post refactor
* fix: fix linting errors
* fix: fix linting error
* test: fix import error
* feat(handle_jwt.py): initial commit adding custom RBAC support on jwt auth
allows admin to define user role field and allowed roles which map to 'internal_user' on litellm
* fix(auth_checks.py): ensure user allowed to access model, when calling via personal keys
Fixes https://github.com/BerriAI/litellm/issues/8029
* feat(handle_jwt.py): support role based access with model permission control on proxy
Allows admin to just grant users roles on IDP (e.g. Azure AD/Keycloak) and user can immediately start calling models
* docs(rbac): add docs on rbac for model access control
make it clear how admin can use roles to control model access on proxy
* fix: fix linting errors
* test(test_user_api_key_auth.py): add unit testing to ensure rbac role is correctly enforced
* test(test_user_api_key_auth.py): add more testing
* test(test_users.py): add unit testing to ensure user model access is always checked for new keys
Resolves https://github.com/BerriAI/litellm/issues/8029
* test: fix unit test
* fix(dot_notation_indexing.py): fix typing to work with python 3.8
* fix message.error
* fix add return_wildcard_routes
* ui edit modelAvailableCall
* fetchAvailableModelsForTeamOrKey
* ui set all models for a team
* ui define common helpers
* edit create key button
* fix viewing model display names
* fix editing team models
* update gitignore
* add jest testing for ui
* Revert "add jest testing for ui"
This reverts commit 98f9a3ebfd.
* fix(http_handler.py): support passing ssl verify dynamically and using the correct httpx client based on passed ssl verify param
Fixes https://github.com/BerriAI/litellm/issues/6499
* feat(llm_http_handler.py): support passing `ssl_verify=False` dynamically in call args
Closes https://github.com/BerriAI/litellm/issues/6499
* fix(proxy/utils.py): prevent bad logs from breaking all cost tracking + reset list regardless of success/failure
prevents malformed logs from causing all spend tracking to break since they're constantly retried
* test(test_proxy_utils.py): add test to ensure bad log is dropped
* test(test_proxy_utils.py): ensure in-memory spend logs reset after bad log error
* test(test_user_api_key_auth.py): add unit test to ensure end user id as str works
* fix(auth_utils.py): ensure extracted end user id is always a str
prevents db cost tracking errors
* test(test_auth_utils.py): ensure get end user id from request body always returns a string
* test: update tests
* test: skip bedrock test- behaviour now supported
* test: fix testing
* refactor(spend_tracking_utils.py): reduce size of get_logging_payload
* test: fix test
* bump: version 1.59.4 → 1.59.5
* Revert "bump: version 1.59.4 → 1.59.5"
This reverts commit 1182b46b2e.
* fix(utils.py): fix spend logs retry logic
* fix(spend_tracking_utils.py): fix get tags
* fix(spend_tracking_utils.py): fix end user id spend tracking on pass-through endpoints
* fix(user_dashboard.tsx): fix spend calculation when team selected
sum all team keys, not user keys
* docs(admin_ui_sso.md): fix docs tabbing
* feat(user_api_key_auth.py): introduce new 'enforce_rbac' param on jwt auth
allows proxy admin to prevent any unmapped yet authenticated jwt tokens from calling proxy
Fixes https://github.com/BerriAI/litellm/issues/6793
* test: more unit testing + refactoring
* fix: fix returning id when obj not found in db
* fix(user_api_key_auth.py): add end user id tracking from jwt auth
* docs(token_auth.md): add doc on rbac with JWTs
* fix: fix unused params
* test: remove old test
* fix(user_api_key_auth.py): handle clientside fallback model when item in list is dictionary
* fix(auth_checks.py): help user find invalid model names during dev
Ensure fallbacks work in prod
* fix(user_api_key_auth.py): fix linting check
* fix: cleanup unused variables
* fix: fix import
* fix(auth_checks.py): fix auth check
* test: initial test to enforce all functions in user_api_key_auth.py have direct testing
* test(test_user_api_key_auth.py): add is_allowed_route unit test
* test(test_user_api_key_auth.py): add more tests
* test(test_user_api_key_auth.py): add complete testing coverage for all functions in `user_api_key_auth.py`
* test(test_db_schema_changes.py): add a unit test to ensure all db schema changes are backwards compatible
gives user an easy rollback path
* test: fix schema compatibility test filepath
* test: fix test
* feat(pass_through_endpoints.py): fix anthropic end user cost tracking
* fix(anthropic/chat/transformation.py): use returned provider model for anthropic
handles anthropic `-latest` tag in request body throwing cost calculation errors
ensures we can be accurate in our model cost tracking
* feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing
* test: update test to use assumption that user_api_key_dict can get anthropic user id
* test: fix test
* fix: fix test
* fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block
can't guarantee user api key dict always has end user id - too many code paths
* fix(user_api_key_auth.py): this allows end user id from request body to always be read and set in auth object
* fix(auth_check.py): fix linting error
* test: fix auth check
* fix(auth_utils.py): fix get end user id to handle metadata = None
* fix(gpt_transformation.py): fix response_format translation check for 4o models
Fixes https://github.com/BerriAI/litellm/issues/7616
* feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields
Allow proxy admin to grant temporary budget increases to keys
* fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together
* feat(user_api_key_auth.py): initial working temp budget increase logic
ensures key budget exceeded error checks for temp budget in key metadata
* feat(proxy_server.py): return the key max budget and key spend in the response headers
Allows clientside user to know their remaining limits
* test: add unit testing for new proxy utils
Ensures new key budget is correctly handled
* docs(temporary_budget_increase.md): add doc on temporary budget increase
* fix(utils.py): remove 3.5 from response_format check for now
not all azure 3.5 models support response_format
* fix(user_api_key_auth.py): return valid user api key auth object on all paths
* feat(ui_sso.py): support reading team ids from sso token
* feat(ui_sso.py): working upsert sso user teams membership in litellm - if team exists
Adds user to relevant teams, if user is part of teams and team exists on litellm
* fix(ui_sso.py): safely handle add team member task
* build(ui/): support setting team id when creating team on UI
* build(ui/): teams.tsx
allow setting team id on ui
* build(circle_ci/requirements.txt): add fastapi-sso to ci/cd testing
* fix: fix linting errors
* feat(main.py): mock_response() - support 'litellm.ContextWindowExceededError' in mock response
enabled quicker router/fallback/proxy debug on context window errors
* feat(exception_mapping_utils.py): extract special litellm errors from error str if calling `litellm_proxy/` as provider
Closes https://github.com/BerriAI/litellm/issues/7259
* fix(user_api_key_auth.py): specify 'Received Proxy Server Request' is span kind server
Closes https://github.com/BerriAI/litellm/issues/7298
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key
allows user to create rate limit tiers and associate those to keys
* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set
allows rate limit tiers to be easily applied to keys
* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers
make feature discoverable
* feat(key_management_endpoints.py): return litellm_budget_table value in key generate
make it easy for user to know associated budget on key creation
* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`
* docs(key_management_endpoints.py): document budget_id usage
* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it
* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs
* fix(customer_endpoints.py): use new pydantic obj name
* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm
* Litellm dev 12 26 2024 p2 (#7432)
* (Feat) Add logging for `POST v1/fine_tuning/jobs` (#7426)
* init commit ft jobs logging
* add ft logging
* add logging for FineTuningJob
* simple FT Job create test
* (docs) - show all supported Azure OpenAI endpoints in overview (#7428)
* azure batches
* update doc
* docs azure endpoints
* docs endpoints on azure
* docs azure batches api
* docs azure batches api
* fix(key_management_endpoints.py): fix key update to actually work
* test(test_key_management.py): add e2e test asserting ui key update call works
* fix: proxy/_types - fix linting erros
* test: update test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix: test
* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers
* fix: fix linting errors
* test: fix test
* fix: remove unused import
* test: update test
* docs(customer_endpoints.py): document new model_max_budget param
* test: specify unique key alias
* docs(budget_management_endpoints.py): document new model_max_budget param
* test: fix test
* test: fix tests
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix(proxy_server.py): enforce team id based model add only works if enterprise user
* fix(auth_checks.py): enforce common_checks can only be imported by user_api_key_auth.py
* fix(auth_checks.py): insert not premium user error message on failed common checks run
* ui fix - allow searching model list + fix bug on filtering
* qa fix - use correct provider name for azure_text
* ui wrap content onto next line
* ui fix - allow selecting current UI session when logging in
* ui session budgets
* ui show provider models on wildcard models
* test provider name appears in model list
* ui fix auto scroll on chat ui tab
* fix(proxy_track_cost_callback.py): log to db if only end user param given
* fix: allows for jwt-auth based end user id spend tracking to work
* fix(utils.py): fix 'get_end_user_id_for_cost_tracking' to use 'user_api_key_end_user_id'
more stable - works with jwt-auth based end user tracking as well
* test(test_jwt.py): add e2e unit test to confirm end user cost tracking works for spend logs
* test: update test to use end_user api key hash param
* fix(langfuse.py): support end user cost tracking via jwt auth + langfuse
logs end user to langfuse if decoded from jwt token
* fix: fix linting errors
* test: fix test
* test: fix test
* fix: fix end user id extraction
* fix: run test earlier
* fix(proxy_server.py): pass model access groups to get_key/get_team models
allows end user to see actual models they have access to, instead of default models
* fix(auth_checks.py): fix linting errors
* fix: fix linting errors
* fix(azure/): support passing headers to azure openai endpoints
Fixes https://github.com/BerriAI/litellm/issues/6217
* fix(utils.py): move default tokenizer to just openai
hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls
* fix(router.py): fix pattern matching router - add generic "*" to it as well
Fixes issue where generic "*" model access group wouldn't show up
* fix(pattern_match_deployments.py): match to more specific pattern
match to more specific pattern
allows setting generic wildcard model access group and excluding specific models more easily
* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty
don't delete all router models b/c of empty list
Fixes https://github.com/BerriAI/litellm/issues/7196
* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api
* fix(fireworks_ai/): support passing response_format + tool call in same message
Addresses https://github.com/BerriAI/litellm/issues/7135
* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"
This reverts commit 6a30dc6929.
* test: fix test
* fix(replicate/): fix replicate default retry/polling logic
* test: add unit testing for router pattern matching
* test: update test to use default oai tokenizer
* test: mark flaky test
* test: skip flaky test
* catch DB_CONNECTION_ERROR_TYPES
* fix DB retry mechanism for SpendLog updates
* use DB_CONNECTION_ERROR_TYPES in auth checks
* fix exp back off for writing SpendLogs
* use _raise_failed_update_spend_exception to ensure errors print as NON blocking
* test_update_spend_logs_multiple_batches_with_failure
* fix(edit_budget_modal.tsx): call `/budget/update` endpoint instead of `/budget/new`
allows updating existing budget on ui
* fix(user_api_key_auth.py): support cost tracking for end user via jwt field
* fix(presidio.py): support pii masking on sync logging callbacks
enables masking before logging to langfuse
* feat(utils.py): support retry policy logic inside '.completion()'
Fixes https://github.com/BerriAI/litellm/issues/6623
* fix(utils.py): support retry by retry policy on async logic as well
* fix(handle_jwt.py): set leeway default leeway value
* test: fix test to handle jwt audience claim
* get_api_key_from_custom_header
* add test_get_api_key_from_custom_header
* fix testing use 1 file for test user api key auth
* fix test user api key auth
* test_custom_api_key_header_name
* fix(key_management_endpoints.py): override metadata field value on update
allow user to override tags
* feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric
allow disabling end user cost tracking on prometheus - fixes cardinality issue
* fix(litellm_pre_call_utils.py): add key/team level enforced params
Fixes https://github.com/BerriAI/litellm/issues/6652
* fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update
* docs(enterprise.md): add docs on enforcing required params for llm requests
* Add support of Galadriel API (#7005)
* fix(router.py): robust retry after handling
set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment
* test(test_router.py): fix test
* feat(bedrock/): add support for 'nova' models
also adds explicit 'converse/' route for simpler routing
* fix: fix 'supports_pdf_input'
return if model supports pdf input on get_model_info
* feat(converse_transformation.py): support bedrock pdf input
* docs(document_understanding.md): add document understanding to docs
* fix(litellm_pre_call_utils.py): fix linting error
* fix(init.py): fix passing of bedrock converse models
* feat(bedrock/converse): support 'response_format={"type": "json_object"}'
* fix(converse_handler.py): fix linting error
* fix(base_llm_unit_tests.py): fix test
* fix: fix test
* test: fix test
* test: fix test
* test: remove duplicate test
---------
Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>
* fix get_standard_logging_object_payload
* fix async_post_call_failure_hook
* fix post_call_failure_hook
* fix change
* fix _is_proxy_only_error
* fix async_post_call_failure_hook
* fix getting request body
* remove redundant code
* use a well named original function name for auth errors
* fix logging auth fails on DD
* fix using request body
* use helper for _handle_logging_proxy_only_error
* fix(factory.py): ensure tool call converts image url
Fixes https://github.com/BerriAI/litellm/issues/6953
* fix(transformation.py): support mp4 + pdf url's for vertex ai
Fixes https://github.com/BerriAI/litellm/issues/6936
* fix(http_handler.py): mask gemini api key in error logs
Fixes https://github.com/BerriAI/litellm/issues/6963
* docs(prometheus.md): update prometheus FAQs
* feat(auth_checks.py): ensure specific model access > wildcard model access
if wildcard model is in access group, but specific model is not - deny access
* fix(auth_checks.py): handle auth checks for team based model access groups
handles scenario where model access group used for wildcard models
* fix(internal_user_endpoints.py): support adding guardrails on `/user/update`
Fixes https://github.com/BerriAI/litellm/issues/6942
* fix(key_management_endpoints.py): fix prepare_metadata_fields helper
* fix: fix tests
* build(requirements.txt): bump openai dep version
fixes proxies argument
* test: fix tests
* fix(http_handler.py): fix error message masking
* fix(bedrock_guardrails.py): pass in prepped data
* test: fix test
* test: fix nvidia nim test
* fix(http_handler.py): return original response headers
* fix: revert maskedhttpstatuserror
* test: update tests
* test: cleanup test
* fix(key_management_endpoints.py): fix metadata field update logic
* fix(key_management_endpoints.py): maintain initial order of guardrails in key update
* fix(key_management_endpoints.py): handle prepare metadata
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting errors
* fix: fix key management errors
* fix(key_management_endpoints.py): update metadata
* test: update test
* refactor: add more debug statements
* test: skip flaky test
* test: fix test
* fix: fix test
* fix: fix update metadata logic
* fix: fix test
* ci(config.yml): change db url for e2e ui testing
* fix(key_management_endpoints.py): fix user-membership check when creating team key
* docs: add deprecation notice on original `/v1/messages` endpoint + add better swagger tags on pass-through endpoints
* fix(gemini/): fix image_url handling for gemini
Fixes https://github.com/BerriAI/litellm/issues/6897
* fix(teams.tsx): fix member add when role is 'user'
* fix(team_endpoints.py): /team/member_add
fix adding several new members to team
* test(test_vertex.py): remove redundant test
* test(test_proxy_server.py): fix team member add tests
* feat - allow using gemini js SDK with LiteLLM
* add auth for gemini_proxy_route
* basic local test for js
* test cost tagging gemini js requests
* add js sdk test for gemini with litellm
* add docs on gemini JS SDK
* run node.js tests
* fix google ai studio tests
* fix vertex js spend test