* fix(user_dashboard.tsx): fix spend calculation when team selected
sum all team keys, not user keys
* docs(admin_ui_sso.md): fix docs tabbing
* feat(user_api_key_auth.py): introduce new 'enforce_rbac' param on jwt auth
allows proxy admin to block jwt tokens that authenticate successfully but aren't mapped to any role from calling the proxy
Fixes https://github.com/BerriAI/litellm/issues/6793
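A minimal sketch of what the gate does; the names here (`check_rbac`, `role_mappings`) are illustrative, not the actual user_api_key_auth.py internals:

```python
# Hypothetical sketch of the enforce_rbac gate; names are illustrative,
# not litellm's actual implementation.
def check_rbac(jwt_claims: dict, enforce_rbac: bool, role_mappings: dict) -> str:
    """Map a verified JWT to a proxy role; reject unmapped tokens if enforce_rbac is on."""
    mapped_role = role_mappings.get(jwt_claims.get("scope"))
    if mapped_role is None:
        if enforce_rbac:
            # token authenticated fine, but maps to no known proxy role -> block it
            raise PermissionError("enforce_rbac=true: jwt token has no mapped role")
        return "default"  # previous behaviour: unmapped-but-authenticated tokens got through
    return mapped_role
```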
* test: more unit testing + refactoring
* fix: fix returning id when obj not found in db
* fix(user_api_key_auth.py): add end user id tracking from jwt auth
* docs(token_auth.md): add doc on rbac with JWTs
* fix: fix unused params
* test: remove old test
* feat(health_check.py): set an upper bound (timeout) on the api call made during health checks
prevents a bad model's health check from hanging and causing pod restarts (see the sketch below)
* fix(health_check.py): cleanup task once completed
* fix(constants.py): bump default health check timeout to 1min
* docs(health.md): add 'health_check_timeout' to health docs on litellm
* build(proxy_server_config.yaml): add bad model to health check
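Roughly the shape of the timeout logic, as a sketch (the helper name is illustrative; the constant lives in constants.py per the commit above):

```python
import asyncio

HEALTH_CHECK_TIMEOUT = 60  # seconds; default bumped to 1 minute

async def run_health_check_with_timeout(check_coro) -> dict:
    task = asyncio.create_task(check_coro)
    try:
        return await asyncio.wait_for(task, timeout=HEALTH_CHECK_TIMEOUT)
    except asyncio.TimeoutError:
        return {"status": "unhealthy", "error": f"timed out after {HEALTH_CHECK_TIMEOUT}s"}
    finally:
        if not task.done():
            task.cancel()  # cleanup so a hung provider call can't accumulate tasks
```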
* fix(gpt_transformation.py): fix response_format translation check for 4o models
Fixes https://github.com/BerriAI/litellm/issues/7616
* feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields
Allow proxy admin to grant temporary budget increases to keys
* fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together
* feat(user_api_key_auth.py): initial working temp budget increase logic
ensures the 'key budget exceeded' check accounts for any temp budget stored in key metadata
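Illustratively, the budget check becomes something like the following (the additive-bump semantics are an assumption from the commit messages; the real check lives in user_api_key_auth.py):

```python
from datetime import datetime, timezone

def effective_max_budget(key_max_budget: float, key_metadata: dict) -> float:
    """Assumed semantics: an additive bump that lapses at temp_budget_expiry."""
    temp_increase = key_metadata.get("temp_budget_increase")
    temp_expiry = key_metadata.get("temp_budget_expiry")  # assumed ISO-8601 with tz offset
    if temp_increase is not None and temp_expiry is not None:
        if datetime.now(timezone.utc) < datetime.fromisoformat(temp_expiry):
            return key_max_budget + temp_increase
    return key_max_budget  # bump expired (or never set) -> back to the base budget
```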
* feat(proxy_server.py): return the key max budget and key spend in the response headers
Allows the client-side user to know their remaining limits
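From the client side that looks roughly like this (the header names are an assumption from the commit message; check the headers your proxy actually returns):

```python
import httpx

resp = httpx.post(
    "http://localhost:4000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-my-key"},
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]},
)
# assumed header names based on the commit message
print(resp.headers.get("x-litellm-key-max-budget"))
print(resp.headers.get("x-litellm-key-spend"))
```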
* test: add unit testing for new proxy utils
Ensures new key budget is correctly handled
* docs(temporary_budget_increase.md): add doc on temporary budget increase
* fix(utils.py): remove 3.5 from response_format check for now
not all azure 3.5 models support response_format
* fix(user_api_key_auth.py): return valid user api key auth object on all paths
* build(model_prices_and_context_window.json): add azure o1 pricing
Closes https://github.com/BerriAI/litellm/issues/7712
* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)
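The change is plausibly of this shape (an illustrative before/after, not the exact diff from #7713):

```python
import re

stop_sequence = "\n\n"

# before: a regex round-trip just to test for all-whitespace
is_whitespace = re.match(r"^\s+$", stop_sequence) is not None

# after: the built-in string method does the same check
is_whitespace = stop_sequence.isspace()
```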
* Allows overriding keep_alive time in ollama (#7079)
* Allows overriding keep_alive time in ollama
* Also adds to ollama_chat
* Adds some info on the docs about this parameter
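Usage, per the PR (supported for both ollama and ollama_chat; model name is a placeholder):

```python
import litellm

response = litellm.completion(
    model="ollama_chat/llama3",
    messages=[{"role": "user", "content": "hello"}],
    keep_alive="5m",  # forwarded to ollama; keeps the model loaded for 5 minutes
)
print(response.choices[0].message.content)
```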
* fix: together ai warning (#7688)
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
* fix(proxy_server.py): handle config containing thread locked objects when using get_config_state
* fix(proxy_server.py): add exception to debug
* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1
---------
Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
* fix(model_hub.tsx): clarify cost in model hub is per 1m tokens
* docs: test blog
* docs: improve release note docs
* docs(docs/): new stable release doc
* docs(docs/): specify date in all posts
* docs(docs/): add git diff to stable release docs
* feat(langfuse.py): log the used prompt when prompt management used
* test: fix test
* docs(self_serve.md): add doc on restricting personal key creation on ui
* feat(s3.py): support s3 logging with team alias prefixes (if available)
New preview feature
* fix(main.py): remove old if block - simplify to just await if coroutine returned
fixes lm_studio async embedding error
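The simplified pattern, as a sketch:

```python
import asyncio

async def call_handler(handler, *args, **kwargs):
    result = handler(*args, **kwargs)
    if asyncio.iscoroutine(result):
        result = await result  # async providers (e.g. lm_studio embeddings) return a coroutine
    return result
```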
* fix(langfuse.py): handle get prompt check
* test(test_basic_python_version.py): assert all optional dependencies are marked as extras on poetry
Fixes https://github.com/BerriAI/litellm/issues/7677
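The assertion is roughly the following (a sketch using tomli, which the ci/cd commit below adds):

```python
import tomli

with open("pyproject.toml", "rb") as f:
    pyproject = tomli.load(f)

poetry = pyproject["tool"]["poetry"]
optional_deps = {
    name
    for name, spec in poetry["dependencies"].items()
    if isinstance(spec, dict) and spec.get("optional")
}
deps_in_extras = {dep for deps in poetry.get("extras", {}).values() for dep in deps}
# every optional dependency must be installable via some extra
assert optional_deps <= deps_in_extras, optional_deps - deps_in_extras
```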
* docs(secret.md): clarify 'read_and_write' secret manager usage on aws
* docs(secret.md): fix doc
* build(ui/teams.tsx): add edit/delete button for updating user / team membership on ui
allows updating user role to admin on ui
* build(ui/teams.tsx): display edit member component on ui, when edit button on member clicked
* feat(team_endpoints.py): support updating team member role to admin via api endpoints
allows a team member to be promoted to admin after being added
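Called roughly like this (endpoint and payload shape are assumptions from the commit message; team_endpoints.py is the source of truth):

```python
import httpx

httpx.post(
    "http://localhost:4000/team/member_update",
    headers={"Authorization": "Bearer sk-admin-key"},
    json={"team_id": "team-1234", "user_id": "user-5678", "role": "admin"},
)
```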
* build(ui/user_dashboard.tsx): if team admin - show all team keys
Fixes https://github.com/BerriAI/litellm/issues/7650
* test(config.yml): add tomli to ci/cd
* test: don't call python_basic_testing in local testing (covered by python 3.13 testing)
* test(test_get_model_info.py): add unit test confirming router deployment updates global 'get_model_info'
* fix(get_supported_openai_params.py): fix custom llm provider 'get_supported_openai_params'
Fixes https://github.com/BerriAI/litellm/issues/7668
* docs(azure.md): clarify how azure ad token refresh on proxy works
Closes https://github.com/BerriAI/litellm/issues/7665
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url
* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic
deduplicates code
* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages
* docs(prompt_management.md): update prompt management to be in beta
given feedback, this still needs revision (e.g. passing in the user message rather than ignoring it)
* refactor(prompt_management_base.py): introduce base class for prompt management
allows consistent behaviour across prompt management integrations
* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base
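Illustrative shape of the base class (method names are assumptions, not the real prompt_management_base.py API):

```python
from abc import ABC, abstractmethod

class PromptManagementBase(ABC):
    @abstractmethod
    def get_template_messages(self, prompt_id: str, prompt_variables: dict) -> list:
        """Resolve the stored prompt template into chat messages."""

    def compile_messages(
        self, prompt_id: str, prompt_variables: dict, client_messages: list
    ) -> list:
        # shared behaviour across integrations: template first, then the caller's messages
        return self.get_template_messages(prompt_id, prompt_variables) + client_messages
```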
* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set
allows tracking what prompt was used for what purpose
* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse
allows logging prompt id / prompt variables to langfuse
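End to end, that looks something like this (prompt and model names are placeholders):

```python
import litellm

litellm.success_callback = ["langfuse"]

response = litellm.completion(
    model="gpt-4o",
    prompt_id="my-langfuse-prompt",           # resolved via langfuse prompt management
    prompt_variables={"topic": "databases"},  # now logged in the standard logging payload
    messages=[{"role": "user", "content": "give me one example"}],
)
```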
* test: fix test
* fix(router.py): cleanup unused imports
* fix: fix linting error
* fix: fix trace param typing
* fix: fix linting errors
* fix: fix code qa check
* fix(main.py): fix lm_studio/ embedding routing
adds the mapping + updates docs with example
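For example (the model name is a placeholder; LM Studio's local server defaults to port 1234):

```python
import litellm

response = litellm.embedding(
    model="lm_studio/text-embedding-nomic-embed-text-v1.5",
    input=["hello world"],
    api_base="http://localhost:1234/v1",
)
```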
* docs(self_serve.md): update doc to show how to auto-add sso users to teams
* fix(streaming_handler.py): simplify async iterator check, to just check if streaming response is an async iterable
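i.e. the check reduces to something like:

```python
from collections.abc import AsyncIterable

def is_async_stream(response) -> bool:
    return isinstance(response, AsyncIterable)
```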
* fix: get build from pip working
* test: add tests for proxy_build_from_pip_tests
* docs: clean up deployment docs
* docs: general cleanup
* docs: build from pip
* fix: CD for docker/build_from_pip
* docs(friendliai.md): update FriendliAI documentation and model details
* docs(friendliai.md): remove unused imports for cleaner documentation
* feat: add support for parallel function calling, system messages, and response schema in model configuration
* feat(router.py): support request prioritization for text completion calls
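Usage sketch (priority semantics follow litellm's scheduler, where lower numbers are scheduled first):

```python
import asyncio
from litellm import Router

router = Router(
    model_list=[{
        "model_name": "gpt-3.5-turbo-instruct",
        "litellm_params": {"model": "gpt-3.5-turbo-instruct"},
    }]
)

async def main():
    response = await router.atext_completion(
        model="gpt-3.5-turbo-instruct",
        prompt="Say hi",
        priority=0,  # lower = scheduled sooner
    )
    print(response)

asyncio.run(main())
```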
* fix(internal_user_endpoints.py): fix sql query to return all keys, including null team id keys on `/user/info`
Fixes https://github.com/BerriAI/litellm/issues/7485
* fix: fix linting errors
* fix: fix linting error
* test(test_router_helper_utils.py): add direct test for '_schedule_factory'
Fixes code qa test
* test(test_utils.py): initial test for valid models
Addresses https://github.com/BerriAI/litellm/issues/7525
* fix: test
* feat(fireworks_ai/transformation.py): support retrieving valid models from fireworks ai endpoint
* refactor(fireworks_ai/): support checking model info on `/v1/models` route
* docs(set_keys.md): update docs to clarify check llm provider api usage
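For example, assuming the `check_provider_endpoint` flag covered in set_keys.md:

```python
import litellm

# queries provider endpoints (e.g. fireworks_ai's /v1/models) instead of
# relying only on the static model list
models = litellm.get_valid_models(check_provider_endpoint=True)
print(models)
```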
* fix(watsonx/common_utils.py): support 'WATSONX_ZENAPIKEY' for iam auth
* fix(watsonx): read in watsonx token from env var
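Usage sketch (env var names come from the commit messages; the model name is a placeholder):

```python
import os
import litellm

os.environ["WATSONX_ZENAPIKEY"] = "..."  # zen api key, read in for iam auth
os.environ["WATSONX_URL"] = "https://us-south.ml.cloud.ibm.com"

response = litellm.completion(
    model="watsonx/ibm/granite-13b-chat-v2",
    messages=[{"role": "user", "content": "hello"}],
)
```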
* fix: fix linting errors
* fix(utils.py): fix provider config check
* style: cleanup unused imports