* refactor get model info for team models
* allow adding a model to a team when creating team specific model
* ui update selected Team on Team Dropdown
* test_team_model_association
* testing for team specific models
* test_get_team_specific_model
* test: skip on internal server error
* remove model alias card on teams page
* linting fix _get_team_specific_model
* fix DeploymentTypedDict
* fix linting error
* fix code quality
* fix model info checks
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* ui - use common team dropdown component
* re-use team component
* rename org field on add model
* handle add model submit
* working view model_id and team_id on root models page
* cleaner
* show all fields
* working model info view
* working team info selector
* clean up team id
* new component for model dashboard
* ui show table with dropdown
* make public model names like email
* revert changes to litellm model name
* fix litellm model name
* ui fix public model
* fix mappings
* fix conditional text input
* fix message
* ui fix bulk add models
* _add_team_model_to_db
* move model mgmt helper funcs
* test_add_team_model_to_db
* ui - display model team model name
* fix add model tab
* fix remove redundant info tab on models page
* don't pass model mappings all the way through
* fix jarring model name when adding team models
* fix edit model button
* delete button on model info
* ui fix model dashboard
* fix DeploymentTypedDict
* _is_model_access_group_for_wildcard_route
* test _get_public_model_name
* ui fix viewing public model name
* fix linting error
* fix linting errors
* fix selectedModel logic
* use class ResetBudgetJob
* refactor reset budget job
* update reset_budget job
* refactor reset budget job
* fix LiteLLM_UserTable
* refactor reset budget job
* add telemetry for reset budget job
* dd - log service success/failure on DD
* add detailed reset budget reset info on DD
* initialize_scheduled_background_jobs
* refactor reset budget job
* trigger service failure hook when fails to reset a budget for team, key, user
* fix resetBudgetJob
* unit testing for ResetBudgetJob
* test_duration_in_seconds_basic
* testing for triggering service logging
* fix logs on test teams fail
* remove unused imports
* fix import duration in s
* duration_in_seconds
* fix(main.py): fix key leak error when unknown provider given
don't return passed in args if unknown route on embedding
* fix(main.py): remove instances of {args} being passed in exception
prevent potential key leaks
* test(code_coverage/prevent_key_leaks_in_codebase.py): ban usage of {args} in codebase
* fix: fix linting errors
* fix: remove unused variable
* fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix
ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models
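A minimal sketch of the prefix-filtering idea behind `get_known_models_from_wildcard` (the helper named in the test commit below); the signature and the model list here are illustrative, not litellm's actual internals.

```python
# Hypothetical sketch of prefix filtering for a wildcard route such as
# "vertex_ai/gemini-*": only models sharing the given prefix are returned.
from typing import List


def get_known_models_from_wildcard(wildcard_model: str, known_models: List[str]) -> List[str]:
    # Strip the trailing "*" to get the literal prefix, e.g. "vertex_ai/gemini-"
    prefix = wildcard_model.rstrip("*")
    return [m for m in known_models if m.startswith(prefix)]


if __name__ == "__main__":
    models = ["vertex_ai/gemini-1.5-pro", "vertex_ai/gemini-1.5-flash", "vertex_ai/claude-3"]
    print(get_known_models_from_wildcard("vertex_ai/gemini-*", models))
    # -> only the gemini- prefixed models
```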
* test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper
* test(test_models.py): add e2e testing for `/model_group/info` endpoint
* feat(prometheus.py): support tracking total requests by user_email on prometheus
adds initial support for tracking total requests by user_email
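A hedged sketch of what tracking total requests by `user_email` could look like with `prometheus_client`; the metric and label names are illustrative, not necessarily the ones litellm registers.

```python
# Illustrative sketch only; metric/label names are assumptions.
from prometheus_client import Counter

proxy_total_requests = Counter(
    "litellm_proxy_total_requests",
    "Total requests made to the proxy",
    labelnames=["user_email", "status_code"],
)


def track_request(user_email: str, status_code: int) -> None:
    # Fall back to a sentinel so the user_email label is always populated.
    proxy_total_requests.labels(
        user_email=user_email or "None", status_code=str(status_code)
    ).inc()
```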
* test(test_prometheus.py): add testing to ensure user email is always tracked
* test: update testing for new prometheus metric
* test(test_prometheus_unit_tests.py): add user email to total proxy metric
* test: update tests
* test: fix spend tests
* test: fix test
* fix(pagerduty.py): fix linting error
* fix(litellm_logging.py): support saving applied guardrails in logging object
allows list of applied guardrails to be logged for proxy admin's knowledge
* feat(spend_tracking_utils.py): log applied guardrails to spend logs
makes it easy for admin to know what guardrails were applied on a request
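A rough sketch, under assumed field names, of attaching the applied guardrail list to the metadata written to spend logs:

```python
# Hedged sketch: field and helper names here are illustrative, not litellm's.
from typing import Any, Dict, List


def add_applied_guardrails_to_spend_log(
    spend_log_metadata: Dict[str, Any], applied_guardrails: List[str]
) -> Dict[str, Any]:
    # Store the guardrail names so an admin can see what ran on the request.
    spend_log_metadata["applied_guardrails"] = applied_guardrails
    return spend_log_metadata


example = add_applied_guardrails_to_spend_log({}, ["presidio-pii", "custom-post-call"])
print(example)
```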
* ci(config.yml): uninstall posthog from ci/cd
* test: fix tests
* test: update test
* add initial test for assembly ai
* start using PassthroughEndpointRouter
* migrate to llm passthrough endpoints
* add assembly ai as a known provider
* fix PassthroughEndpointRouter
* fix set_pass_through_credentials
* working EU request to assembly ai pass through endpoint
* add e2e test assembly
* test_assemblyai_routes_with_bad_api_key
* clean up pass through endpoint router
* e2e testing for assembly ai pass through
* test assembly ai e2e testing
* delete assembly ai models
* fix code quality
* ui working assembly ai api base flow
* fix install assembly ai
* update model call details with kwargs for pass through logging
* fix tracking assembly ai model in response
* _handle_assemblyai_passthrough_logging
* fix test_initialize_deployment_for_pass_through_unsupported_provider
* TestPassthroughEndpointRouter
* _get_assembly_transcript
* fix assembly ai pt logging tests
* fix assemblyai_proxy_route
* fix _get_assembly_region_from_url
* test(base_llm_unit_tests.py): add test to ensure drop params is respected
* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility
* build: add cherry picked commits
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param
Closes https://github.com/BerriAI/litellm/issues/8182
* fix(main.py): ensure `reasoning_effort` is a mapped openai param
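A hedged usage sketch of passing `reasoning_effort` through litellm for an o-series model; the model name and effort level are examples.

```python
# Usage sketch for the 'reasoning_effort' param on an o-series model.
import litellm

response = litellm.completion(
    model="o1",  # example model; adjust to a deployment you actually have
    messages=[{"role": "user", "content": "Summarize the CAP theorem in one sentence."}],
    reasoning_effort="low",  # mapped through as an OpenAI o-series param per the commit above
)
print(response.choices[0].message.content)
```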
* refactor(azure/): rename o1_[x] files to o_series_[x]
* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort
* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests
* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers
Makes it easy to switch from openai to anthropic
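An illustrative sketch of the 'developer' → 'system' role translation described above; the helper name is hypothetical.

```python
# Translate OpenAI's 'developer' role to 'system' for providers that
# don't recognize it (helper name is hypothetical).
from typing import Dict, List


def map_developer_role_to_system(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    return [
        {**m, "role": "system"} if m.get("role") == "developer" else m
        for m in messages
    ]


msgs = [{"role": "developer", "content": "Be terse."}, {"role": "user", "content": "Hi"}]
print(map_developer_role_to_system(msgs))
```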
* fix: fix linting errors
* fix(base_llm_unit_tests.py): fix test
* fix(main.py): add missing param
* Litellm dev 01 29 2025 p4 (#8107)
* fix(key_management_endpoints.py): always get db team
Fixes https://github.com/BerriAI/litellm/issues/7983
* test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks
* test: fix test
* test: skip gemini thinking
* Litellm dev 01 29 2025 p3 (#8106)
* fix(__init__.py): reduces size of __init__.py and reduces scope for errors by using correct param
* refactor(__init__.py): refactor init by cleaning up redundant params
* refactor(__init__.py): move more constants into constants.py
cleanup root
* refactor(__init__.py): more cleanup
* feat(__init__.py): expose new 'disable_hf_tokenizer_download' param
enables hf model usage in offline env
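A hedged sketch of how the new flag might be used, assuming it is set as a module-level litellm setting (per the `__init__.py` commit above):

```python
# Assumption: the flag is a module-level setting exposed from litellm/__init__.py.
import litellm

litellm.disable_hf_tokenizer_download = True  # skip HF hub downloads in offline environments

# Token counting then falls back to a bundled/default tokenizer instead of
# fetching a model-specific tokenizer from the Hugging Face hub.
print(litellm.token_counter(model="gpt-3.5-turbo", text="hello offline world"))
```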
* docs(config_settings.md): document new disable_hf_tokenizer_download param
* fix: fix linting error
* fix: fix unsafe comparison
* test: fix test
* docs(public_teams.md): add doc showing how to expose public teams for users to join
* docs: add beta disclaimer on public teams
* test: update tests
* refactor(factory.py): refactor async bedrock message transformation to use async get request for image url conversion
improve latency of bedrock call
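A minimal sketch of the non-blocking image-URL-to-base64 conversion this refactor is about, using httpx; the function name is illustrative.

```python
# Fetch an image URL without blocking the event loop and base64-encode it
# for a Bedrock message payload (names are illustrative).
import asyncio
import base64

import httpx


async def async_image_url_to_b64(image_url: str) -> str:
    async with httpx.AsyncClient() as client:
        resp = await client.get(image_url)
        resp.raise_for_status()
        return base64.b64encode(resp.content).decode("utf-8")


if __name__ == "__main__":
    b64 = asyncio.run(async_image_url_to_b64("https://httpbin.org/image/png"))
    print(len(b64))
```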
* test(test_bedrock_completion.py): add unit testing to ensure async image url get called for async bedrock call
* refactor(factory.py): refactor bedrock translation to use BedrockImageProcessor
reduces duplicate code
* fix(factory.py): fix bug not allowing pdf's to be processed
* fix(factory.py): fix bedrock converse document understanding with image url
* docs(bedrock.md): clarify all bedrock document types are supported
* refactor: cleanup redundant test + unused imports
* perf: improve perf with reusable clients
* test: fix test
* feat(main.py): use asyncio.sleep for mock_timeout=true on async request
adds unit testing to ensure proxy does not fail if specific OpenAI requests hang (e.g. recent o1 outage)
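A small sketch of the non-blocking mock-timeout idea, independent of litellm internals:

```python
# The async path awaits asyncio.sleep instead of calling time.sleep, so other
# coroutines on the proxy keep running while this request "hangs".
import asyncio


async def mock_hanging_request(timeout: float) -> str:
    await asyncio.sleep(timeout)  # yields control; does not block the event loop
    return f"timed out after {timeout}s (mock)"


async def main() -> None:
    hanging = asyncio.create_task(mock_hanging_request(0.2))
    # This second coroutine finishes first, proving the loop wasn't blocked.
    print(await asyncio.sleep(0.05, result="other request served"))
    print(await hanging)


asyncio.run(main())
```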
* fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming
Fixes https://github.com/BerriAI/litellm/issues/7942
* Revert "fix(streaming_handler.py): fix deepseek r1 return reasoning content on streaming"
This reverts commit 7a052a64e3.
* fix(deepseek-r-1): return reasoning_content as a top-level param
ensures compatibility with existing tools that use it
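A hedged usage sketch of reading the top-level `reasoning_content` field; the model name is an example and the attribute access is guarded.

```python
# Read the top-level reasoning_content returned for a DeepSeek reasoning model.
import litellm

resp = litellm.completion(
    model="deepseek/deepseek-reasoner",  # example deployment
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
)
message = resp.choices[0].message
print(getattr(message, "reasoning_content", None))  # chain-of-thought text, per the commit above
print(message.content)                               # final answer
```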
* fix: fix linting error
* fix(bedrock/converse_handler.py): fix bedrock region name on async calls
* fix(utils.py): fix split model handling
Fixes bedrock cost calculation when region name is given
* feat(_health_endpoints.py): support health checking datadog integration
Closes https://github.com/BerriAI/litellm/issues/7921
* feat(router.py): add retry headers to response
makes it easy to add testing to ensure model-specific retries are respected
* fix(add_retry_headers.py): clarify attempted retries vs. max retries
* test(test_fallbacks.py): add test for checking if max retries set for model is respected
* test(test_fallbacks.py): assert values for attempted retries and max retries are as expected
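A hedged sketch of asserting the new retry headers on a proxy response; the exact header names below are assumptions based on the "attempted retries vs. max retries" wording, not confirmed by the commits.

```python
# Assumed header names: "x-litellm-attempted-retries" / "x-litellm-max-retries".
import httpx

resp = httpx.post(
    "http://localhost:4000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},  # example proxy key
    json={"model": "bad-model", "messages": [{"role": "user", "content": "hi"}]},
)
attempted = resp.headers.get("x-litellm-attempted-retries")
maximum = resp.headers.get("x-litellm-max-retries")
print(f"attempted={attempted} max={maximum}")
```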
* fix(utils.py): return timeout in litellm proxy response headers
* test(test_fallbacks.py): add test to assert model specific timeout used on timeout error
* test: add bad model with timeout to proxy
* fix: fix linting error
* fix(router.py): fix get model list from model alias
* test: loosen test restriction - account for other events on proxy
* feat(main.py): add new 'provider_specific_header' param
allows passing extra header for specific provider
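A hedged usage sketch of `provider_specific_header`; the dict shape (provider name plus `extra_headers`) is an assumption.

```python
# Forward an extra header only to one provider; dict shape shown is an assumption.
import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "hello"}],
    provider_specific_header={
        "custom_llm_provider": "anthropic",
        "extra_headers": {"anthropic-beta": "max-tokens-3-5-sonnet-2024-07-15"},
    },
)
print(response.choices[0].message.content)
```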
* fix(litellm_pre_call_utils.py): add unit test for pre call utils
* test(test_bedrock_completion.py): skip test now that bedrock supports this
* fix(types/utils.py): support returning 'reasoning_content' for deepseek models
Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218
* fix(convert_dict_to_response.py): return deepseek response in provider_specific_field
allows for separating openai vs. non-openai params in model response
* fix(utils.py): support 'provider_specific_field' in delta chunk as well
allows deepseek reasoning content chunk to be returned to user from stream as well
Fixes https://github.com/BerriAI/litellm/issues/7877#issuecomment-2603813218
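A hedged sketch of reading reasoning content chunk by chunk from a stream, per the delta support described above; attribute access is guarded since the exact delta shape may differ.

```python
# Read DeepSeek reasoning content from streamed chunks.
import litellm

stream = litellm.completion(
    model="deepseek/deepseek-reasoner",  # example deployment
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print("[reasoning]", reasoning, end="")
    elif delta.content:
        print(delta.content, end="")
```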
* fix(watsonx/chat/handler.py): fix passing space id to watsonx on chat route
* fix(watsonx/): fix watsonx_text/ route with space id
* fix(watsonx/): qa item - also adds better unit testing for watsonx embedding calls
* fix(utils.py): rename to '..fields'
* fix: fix linting errors
* fix(utils.py): fix typing - don't show provider-specific field if none or empty - prevents default response from being non-oai compatible
* fix: cleanup unused imports
* docs(deepseek.md): add docs for deepseek reasoning model
* fix(utils.py): don't pass 'anthropic-beta' header to vertex - will cause request to fail
* fix(utils.py): add flag to allow user to disable filtering invalid headers
ensure user can control behaviour
* style(utils.py): cleanup message
* test(test_utils.py): add unit test to cover invalid header filtering
* fix(proxy_server.py): fix custom openapi schema generation
* fix(utils.py): pass extra headers if set
* fix(main.py): fix image variation to use 'client' param
* refactor: initial commit for using separate sync vs. async transformation routes for bedrock
ensures no blocking calls e.g. when converting image url to b64
* perf(converse_transformation.py): make bedrock converse transformation async
asyncify's the bedrock message transformation - useful for handling image urls for bedrock
* fix(converse_handler.py): fix logging for async streaming
* style: cleanup unused imports
* feat(main.py): initial commit for `/image/variations` endpoint support
* refactor(base_llm/): introduce new base llm base config for image variation endpoints
* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler
* fix: test
* feat(openai/): working openai `/image/variation` endpoint calls via sdk
* feat(topaz/): topaz sync image variation call support
Addresses https://github.com/BerriAI/litellm/issues/7593
* fix(topaz/transformation.py): fix linting errors
* fix(openai/image_variations/handler.py): fix passing json data
* fix(main.py): image_variation/
support async image variation route - `aimage_variation`
* fix(test_get_model_info.py): fix test
* fix: cleanup unused imports
* feat(openai/): add async `/image/variations` endpoint support
* feat(topaz/): support async `/image/variations` calls
* fix: test
* fix(utils.py): fix get_model_info_helper for no model info w/ provider config
handles situation where model info is not known but provider config exists
* test(test_router_fallbacks.py): mark flaky test
* fix: fix unused imports
* test: bump otel load test perf threshold - accounts for current load tests hitting same server
* feat(langfuse.py): log the used prompt when prompt management used
* test: fix test
* docs(self_serve.md): add doc on restricting personal key creation on ui
* feat(s3.py): support s3 logging with team alias prefixes (if available)
New preview feature
* fix(main.py): remove old if block - simplify to just await if coroutine returned
fixes lm_studio async embedding error
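A minimal sketch of the "just await if a coroutine was returned" simplification mentioned above; helper names are illustrative.

```python
# One uniform check instead of provider-specific elif branches.
import asyncio
import inspect
from typing import Any


async def maybe_await(result: Any) -> Any:
    if inspect.iscoroutine(result):
        return await result
    return result


async def _demo() -> None:
    async def async_embedding():
        return {"data": [0.1, 0.2]}

    print(await maybe_await(async_embedding()))      # coroutine -> awaited
    print(await maybe_await({"data": [0.3, 0.4]}))   # plain value -> returned as-is


asyncio.run(_demo())
```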
* fix(langfuse.py): handle get prompt check
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url
* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic
deduplicates code
* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages
* docs(prompt_management.md): update prompt management to be in beta
given feedback - this still needs to be revised (e.g. passing in the user message, not ignoring it)
* refactor(prompt_management_base.py): introduce base class for prompt management
allows consistent behaviour across prompt management integrations
* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base
* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set
allows tracking what prompt was used for what purpose
* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse
allows logging prompt id / prompt variables to langfuse
* test: fix test
* fix(router.py): cleanup unused imports
* fix: fix linting error
* fix: fix trace param typing
* fix: fix linting errors
* fix: fix code qa check
* fix(streaming_chunk_builder_utils.py): add test for groq tool calling + streaming + combine chunks
Addresses https://github.com/BerriAI/litellm/issues/7621
* fix(streaming_utils.py): fix modelresponseiterator for openai like chunk parser
ensures chunk parser uses the correct tool call id when translating the chunk
Fixes https://github.com/BerriAI/litellm/issues/7621
* build(model_hub.tsx): display cost pricing on model hub
* build(model_hub.tsx): show cost per token pricing + complete model information
* fix(types/utils.py): fix usage object handling
* fix(types/utils.py): support langfuse + humanloop routes on llm router
* fix(main.py): remove acompletion elif block
just await if coroutine returned
* refactor(prometheus.py): refactor to remove `_tag` metrics and incorporate in regular metrics
* fix(prometheus.py): handle label values not set in enum values
* feat(prometheus.py): working e2e custom metadata labels
* docs(prometheus.md): update docs to clarify how custom metrics would work
* test(test_prometheus_unit_tests.py): fix test
* test: add unit testing
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model
* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests
skip as o1 on azure doesn't support tool calling yet
* fix: initial commit of azure o1 handler using openai caller
simplifies calling + allows fake streaming logic already implemented for openai to just work
* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models
azure does not currently support streaming for o1
* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info
enables users to toggle this on once azure supports o1 streaming, without needing to bump versions
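A hedged sketch of toggling this via `model_info` on a Router deployment; the deployment details are placeholders.

```python
# Toggle fake streaming for an Azure o1 deployment via 'supports_native_streaming'
# in model_info, per the commit above. api_base/api_key values are placeholders.
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "azure-o1",
            "litellm_params": {
                "model": "azure/o1-preview",
                "api_key": "os.environ/AZURE_API_KEY",
                "api_base": "https://example-resource.azure.openai.com/",
            },
            # Flip to True once the Azure deployment actually streams natively.
            "model_info": {"supports_native_streaming": False},
        }
    ]
)
```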
* style(router.py): remove 'give feedback/get help' messaging when router is used
Prevents noisy messaging
Closes https://github.com/BerriAI/litellm/issues/5942
* fix(types/utils.py): handle none logprobs
Fixes https://github.com/BerriAI/litellm/issues/328
* fix(exception_mapping_utils.py): fix error str unbound error
* refactor(azure_ai/): move to openai_like chat completion handler
allows for easy swapping of api base URLs (e.g. ai.services.com)
Fixes https://github.com/BerriAI/litellm/issues/7275
* refactor(azure_ai/): move to base llm http handler
* fix(azure_ai/): handle differing api endpoints
* fix(azure_ai/): make sure all unit tests are passing
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting error
* fix: fix linting errors
* fix(azure_ai/transformation.py): handle extra body param
* fix(azure_ai/transformation.py): fix max retries param handling
* fix: fix test
* test(test_azure_o1.py): fix test
* fix(llm_http_handler.py): support handling azure ai unprocessable entity error
* fix(llm_http_handler.py): handle sync invalid param error for azure ai
* fix(azure_ai/): streaming support with base_llm_http_handler
* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai
* fix: fix linting errors
* fix(llm_http_handler.py): fix linting error
* fix(azure_ai/): handle cohere tool call invalid index param error