* initial commit for azure responses api support
* update get complete url
* fixes for responses API
* working azure responses API
* working responses API
* test suite for responses API
* azure responses API test suite
* fix test with complete url
* fix test refactor
* test fix metadata checks
* fix code quality check
* feat(llm_passthrough_endpoints.py): support mistral passthrough
Closes https://github.com/BerriAI/litellm/issues/9051
* feat(llm_passthrough_endpoints.py): initial commit for adding vllm passthrough route
* feat(vllm/common_utils.py): add new vllm model info route
make it possible to use vllm passthrough route via factory function
* fix(llm_passthrough_endpoints.py): add all methods to vllm passthrough route
* fix: fix linting error
* fix: fix linting error
* fix: fix ruff check
* fix(proxy/_types.py): add new passthrough routes
* docs(config_settings.md): add mistral env vars to docs
* fix(openai.py): ensure openai file object shows up on logs
* fix(managed_files.py): return unified file id as b64 str
allows retrieve file id to work as expected
* fix(managed_files.py): apply decoded file id transformation
* fix: add unit test for file id + decode logic
* fix: initial commit for litellm_proxy support with CRUD Endpoints
* fix(managed_files.py): support retrieve file operation
* fix(managed_files.py): support for DELETE endpoint for files
* fix(managed_files.py): retrieve file content support
supports retrieve file content api from openai
* fix: fix linting error
* test: update tests
* fix: fix linting error
* fix(files/main.py): pass litellm params to azure route
* test: fix test
* test: fix import for test
* fix: fix bad error string
* docs: cleanup files docs
* fix(files/main.py): cleanup error string
* style: initial commit with a provider/config pattern for files api
google ai studio files api onboarding
* fix: test
* feat(gemini/files/transformation.py): support gemini files api response transformation
* fix(gemini/files/transformation.py): return file id as gemini uri
allows id to be passed in to chat completion request, just like openai
* feat(llm_http_handler.py): support async route for files api on llm_http_handler
* fix: fix linting errors
* fix: fix model info check
* fix: fix ruff errors
* fix: fix linting errors
* Revert "fix: fix linting errors"
This reverts commit 926a5a527f.
* fix: fix linting errors
* test: fix test
* test: fix tests
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object
enables accurate cost tracking
* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it
Google has moved away from this for gemini-2.0 models
* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough
* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token
enables vertex ai cost tracking to work with audio tokens
* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set
* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token
more consistent behaviour across providers
* test: add unit test for gemini audio token cost calculation
* ci: bump ci config
* test: fix test
* fix(transformation.py): support a 'format' parameter for image's
allow user to specify mime type
* fix: pass mimetype via 'format' param
* feat(gemini/chat/transformation.py): support 'format' param for gemini
* fix(factory.py): support 'format' param on sync bedrock converse calls
* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls
* refactor(factory.py): move to supporting 'format' param in base helper
ensures consistency in param support
* feat(gpt_transformation.py): filter out 'format' param
don't send invalid param to openai
* fix(gpt_transformation.py): fix translation
* fix: fix translation error
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'
* feat(batches/): ensure batches logs are written to db
makes batches response dict compatible
* fix(cost_calculator.py): handle batch response being a dictionary
* fix(batches/main.py): modify retrieve endpoints to use @client decorator
enables logging to work on retrieve call
* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible
* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type
create batch and retrieve batch share the same id
* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted
* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline
ensures cost is always logged to db
* fix: fix linting errors
* fix: fix linting error
* fix(o_series_transformation.py): fix optional param check for o-series models
o3-mini and o-1 do not support parallel tool calling
* fix(utils.py): support 'drop_params' for 'thinking' param across models
allows switching to older claude versions (or non-anthropic models) and param to be safely dropped
* fix: fix passing thinking param in optional params
allows dropping thinking_param where not applicable
* test: update old model
* fix(utils.py): fix linting errors
* fix(main.py): add param to acompletion
* Fixed issue #8246 (#8250)
* Fixed issue #8246
* Added unit tests for discard() and for remove_callback_from_list_by_object()
* fix(openai.py): support dynamic passing of organization param to openai
handles scenario where client-side org id is passed to openai
---------
Co-authored-by: Erez Hadad <erezh@il.ibm.com>