* test: fix import for test
* fix: fix bad error string
* docs: cleanup files docs
* fix(files/main.py): cleanup error string
* style: initial commit with a provider/config pattern for files api
google ai studio files api onboarding
* fix: test
* feat(gemini/files/transformation.py): support gemini files api response transformation
* fix(gemini/files/transformation.py): return file id as gemini uri
allows id to be passed in to chat completion request, just like openai
* feat(llm_http_handler.py): support async route for files api on llm_http_handler
* fix: fix linting errors
* fix: fix model info check
* fix: fix ruff errors
* fix: fix linting errors
* Revert "fix: fix linting errors"
This reverts commit 926a5a527f.
* fix: fix linting errors
* test: fix test
* test: fix tests
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* fix(anthropic/chat/transformation.py): Don't set tool choice on response_format conversion when thinking is enabled
Not allowed by Anthropic
Fixes https://github.com/BerriAI/litellm/issues/8901
* refactor: move test to base anthropic chat tests
ensures consistent behaviour across vertex/anthropic/bedrock
* fix(anthropic/chat/transformation.py): if thinking token is specified and max tokens is not - ensure max token to anthropic is higher than thinking tokens
* feat(converse_transformation.py): correctly handle thinking + response format on Bedrock Converse
Fixes https://github.com/BerriAI/litellm/issues/8901
* fix(converse_transformation.py): correctly handle adding max tokens
* test: handle service unavailable error
* fix: initial commit for adding provider model discovery to gemini
* feat(gemini/): add model discovery for gemini/ route
* docs(set_keys.md): update docs to show you can check available gemini models as well
* feat(anthropic/): add model discovery for anthropic api key
* feat(xai/): add model discovery for XAI
enables checking what models an xai key can call
* ci: bump ci config yml
* fix(topaz/common_utils.py): fix linting error
* fix: fix linting error for python38
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route
move to using anthropic config as base
* fix(utils.py): expose anthropic config via providerconfigmanager
* fix(llm_http_handler.py): support json mode on async completion calls
* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke
* fix(anthropic/): handle 'response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config
Prevents error when passing in 'response_format: {"type": "text"}
* test: fix test
* fix(utils.py): fix base invoke provider check
* fix(anthropic_claude3_transformation.py): don't pass 'stream' param
* fix: fix linting errors
* fix(converse_transformation.py): handle response_format type=text for converse
* feat(bedrock/rerank): infer model region if model given as arn
* test: add unit testing to ensure bedrock region name inferred from arn on rerank
* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result
Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137
* test(test_bedrock_completion.py): add testing for bedrock cohere rerank
* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking
* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map
* fix(cost_calculator.py): bedrock/common_utils.py
get base model from model w/ arn -> handles rerank model
* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing
* feat(bedrock/rerank): migrate bedrock config to basererank config
* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"
This reverts commit 84fae1f167.
* test: add testing to ensure large doc / queries are correctly counted
* Revert "test: add testing to ensure large doc / queries are correctly counted"
This reverts commit 4337f1657e.
* fix(migrate-jina-ai-to-rerank-config): enables cost tracking
* refactor(jina_ai/): finish migrating jina ai to base rerank config
enables cost tracking
* fix(jina_ai/rerank): e2e jina ai rerank cost tracking
* fix: cleanup dead code
* fix: fix python3.8 compatibility error
* test: fix test
* test: add e2e testing for azure ai rerank
* fix: fix linting error
* test: mark cohere as flaky
* fix(team_endpoints.py): cleanup user <-> team association on team delete
Fixes issue where user table still listed team membership post delete
* test(test_team.py): update e2e test - ensure user/team membership is deleted on team delete
* fix(base_invoke_transformation.py): fix deepseek r1 transformation
remove deepseek name from model url
* test(test_completion.py): assert model route not in url
* feat(base_invoke_transformation.py): infer region name from model arn
prevent errors due to different region name in env var vs. model arn, respect if explicitly set in call though
* test: fix test
* test: skip on internal server error
* fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in
Fixes https://github.com/BerriAI/litellm/issues/8241
* fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls
* refactor(gpt_transformation.py): refactor out json schema converstion to base config
keeps logic consistent across providers
* fix(o_series_transformation.py): support o3 mini native streaming
Fixes https://github.com/BerriAI/litellm/issues/8274
* fix(gpt_transformation.py): remove unused variables
* test: update test
* initial transform for invoke
* invoke transform_response
* working - able to make request
* working get_complete_url
* working - invoke now runs on llm_http_handler
* fix unused imports
* track litellm overhead ms
* working stream request
* sign_request transform
* sign_request update
* use has_async_custom_stream_wrapper property
* use get_async_custom_stream_wrapper in base llm http handler
* fix make_call in invoke handler
* fix invoke with streaming get_async_custom_stream_wrapper
* working bedrock async streaming with invoke
* fix make call handler for bedrock
* test_all_model_configs
* fix test_bedrock_custom_prompt_template
* sync streaming for bedrock invoke
* fix _add_stream_param_to_request_body
* test_async_text_completion_bedrock
* fix transform_request
* fix get_supported_openai_params
* fix test supports tool choice
* fix test_supports_tool_choice
* add unit test coverage for bedrock invoke transform
* fix location of transformation files
* update import loc
* fix bedrock invoke unit tests
* fix import for max completion tokens
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param
Closes https://github.com/BerriAI/litellm/issues/8182
* fix(main.py): ensure `reasoning_effort` is a mapped openai param
* refactor(azure/): rename o1_[x] files to o_series_[x]
* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort
* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests
* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers
Makes it easy to switch from openai to anthropic
* fix: fix linting errors
* fix(base_llm_unit_tests.py): fix test
* fix(main.py): add missing param
* fix(utils.py): initial commit fixing custom cost tracking
refactors out provider specific model info from `get_model_info` - this was causing custom costs to be registered incorrectly
* fix(utils.py): cleanup `_supports_factory` to check provider info, if model info is None
some providers support features like vision across all models
* fix(utils.py): refactor to use _supports_factory
* test: update testing
* fix: fix linting errors
* test: fix testing
* fix(base_utils.py): supported nested json schema passed in for anthropic calls
* refactor(base_utils.py): refactor ref parsing to prevent infinite loop
* test(test_openai_endpoints.py): refactor anthropic test to use bedrock
* fix(langfuse_prompt_management.py): add unit test for sync langfuse calls
Resolves https://github.com/BerriAI/litellm/issues/7938#issuecomment-2613293757
* build: ensure all regional bedrock models have same supported values as base bedrock model
prevents drift
* test(base_llm_unit_tests.py): add testing for nested pydantic objects
* fix(test_utils.py): add test_get_potential_model_names
* fix(anthropic/chat/transformation.py): support nested pydantic objects
Fixes https://github.com/BerriAI/litellm/issues/7755
* feat(main.py): initial commit for `/image/variations` endpoint support
* refactor(base_llm/): introduce new base llm base config for image variation endpoints
* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler
* fix: test
* feat(openai/): working openai `/image/variation` endpoint calls via sdk
* feat(topaz/): topaz sync image variation call support
Addresses https://github.com/BerriAI/litellm/issues/7593
'
* fix(topaz/transformation.py): fix linting errors
* fix(openai/image_variations/handler.py): fix passing json data
* fix(main.py): image_variation/
support async image variation route - `aimage_variation`
* fix(test_get_model_info.py): fix test
* fix: cleanup unused imports
* feat(openai/): add async `/image/variations` endpoint support
* feat(topaz/): support async `/image/variations` calls
* fix: test
* fix(utils.py): fix get_model_info_helper for no model info w/ provider config
handles situation where model info is not known but provider config exists
* test(test_router_fallbacks.py): mark flaky test
* fix: fix unused imports
* test: bump otel load test perf threshold - accounts for current load tests hitting same server
* test(test_utils.py): initial test for valid models
Addresses https://github.com/BerriAI/litellm/issues/7525
* fix: test
* feat(fireworks_ai/transformation.py): support retrieving valid models from fireworks ai endpoint
* refactor(fireworks_ai/): support checking model info on `/v1/models` route
* docs(set_keys.md): update docs to clarify check llm provider api usage
* fix(watsonx/common_utils.py): support 'WATSONX_ZENAPIKEY' for iam auth
* fix(watsonx): read in watsonx token from env var
* fix: fix linting errors
* fix(utils.py): fix provider config check
* style: cleanup unused imports
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model
* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests
skip as o1 on azure doesn't support tool calling yet
* fix: initial commit of azure o1 handler using openai caller
simplifies calling + allows fake streaming logic alr. implemented for openai to just work
* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models
azure does not currently support streaming for o1
* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info
enables user to toggle on when azure allows o1 streaming without needing to bump versions
* style(router.py): remove 'give feedback/get help' messaging when router is used
Prevents noisy messaging
Closes https://github.com/BerriAI/litellm/issues/5942
* fix(types/utils.py): handle none logprobs
Fixes https://github.com/BerriAI/litellm/issues/328
* fix(exception_mapping_utils.py): fix error str unbound error
* refactor(azure_ai/): move to openai_like chat completion handler
allows for easy swapping of api base url's (e.g. ai.services.com)
Fixes https://github.com/BerriAI/litellm/issues/7275
* refactor(azure_ai/): move to base llm http handler
* fix(azure_ai/): handle differing api endpoints
* fix(azure_ai/): make sure all unit tests are passing
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting error
* fix: fix linting errors
* fix(azure_ai/transformation.py): handle extra body param
* fix(azure_ai/transformation.py): fix max retries param handling
* fix: fix test
* test(test_azure_o1.py): fix test
* fix(llm_http_handler.py): support handling azure ai unprocessable entity error
* fix(llm_http_handler.py): handle sync invalid param error for azure ai
* fix(azure_ai/): streaming support with base_llm_http_handler
* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai
* fix: fix linting errors
* fix(llm_http_handler.py): fix linting error
* fix(azure_ai/): handle cohere tool call invalid index param error
* docs(sidebar.js): docs for support model access groups for wildcard routes
* feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route
* refactor(docs/): make control model access a root-level doc in proxy sidebar
easier to discover how to control model access on litellm
* docs: more cleanup
* feat(fireworks_ai/): add document inlining support
Enables user to call non-vision models with images/pdfs/etc.
* test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util
* docs(docs/): add document inlining details to fireworks ai docs
* feat(fireworks_ai/): allow user to dynamically disable auto add transform inline
allows client-side disabling of this feature for proxy users
* feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models
now true as fireworks ai supports document inlining
* test: fix tests
* fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route
* test: add new test image embedding to base llm unit tests
Addresses https://github.com/BerriAI/litellm/issues/6515
* fix(bedrock/embed/multimodal-embeddings): strip data prefix from image urls for bedrock multimodal embeddings
Fix https://github.com/BerriAI/litellm/issues/6515
* feat: initial commit for fireworks ai audio transcription support
Relevant issue: https://github.com/BerriAI/litellm/issues/7134
* test: initial fireworks ai test
* feat(fireworks_ai/): implemented fireworks ai audio transcription config
* fix(utils.py): register fireworks ai audio transcription config, in config manager
* fix(utils.py): add fireworks ai param translation to 'get_optional_params_transcription'
* refactor(fireworks_ai/): define text completion route with model name handling
moves model name handling to specific fireworks routes, as required by their api
* refactor(fireworks_ai/chat): define transform_Request - allows fixing model if accounts/ is missing
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting errors
* fix(handler.py): fix linting errors
* fix(main.py): fix tgai text completion route
* refactor(together_ai/completion): refactors together ai text completion route to just use provider transform request
* refactor: move test_fine_tuning_api out of local_testing
reduces local testing ci/cd time