* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing
* build(model_prices_and_context_window.json): add gemini reasoning token pricing
* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini
allows accurate cost calc
* fix(utils.py): add reasoning token cost calc to generic cost calc
ensures gemini-2.5-flash cost calculation is accurate
* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'
* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests
allow controlling thinking effort for gemini-2.5-flash models
* test: update unit testing
* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response
* test: update model name
* fix: fix ruff check
* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object
* fix(vertex_and_google_ai_studio_gemini.py): fix translation
* initial commit for azure responses api support
* update get complete url
* fixes for responses API
* working azure responses API
* working responses API
* test suite for responses API
* azure responses API test suite
* fix test with complete url
* fix test refactor
* test fix metadata checks
* fix code quality check
* feat(utils.py): support global flag for 'check_provider_endpoints'
enables setting this for `/models` on proxy
* feat(utils.py): add caching to 'get_valid_models'
Prevents checking endpoint repeatedly
* fix(utils.py): ensure mutations don't impact cached results
* test(test_utils.py): add unit test to confirm cache invalidation logic
* feat(utils.py): get_valid_models - support passing litellm params dynamically
Allows for checking endpoints based on received credentials
* test: update test
* feat(model_checks.py): pass router credentials to get_valid_models - ensures it checks correct credentials
* refactor(utils.py): refactor for simpler functions
* fix: fix linting errors
* fix(utils.py): fix test
* fix(utils.py): set valid providers to custom_llm_provider, if given
* test: update test
* fix: fix ruff check error
* feat(llm_passthrough_endpoints.py): support mistral passthrough
Closes https://github.com/BerriAI/litellm/issues/9051
* feat(llm_passthrough_endpoints.py): initial commit for adding vllm passthrough route
* feat(vllm/common_utils.py): add new vllm model info route
make it possible to use vllm passthrough route via factory function
* fix(llm_passthrough_endpoints.py): add all methods to vllm passthrough route
* fix: fix linting error
* fix: fix linting error
* fix: fix ruff check
* fix(proxy/_types.py): add new passthrough routes
* docs(config_settings.md): add mistral env vars to docs
* fix(cost_calculator.py): handle custom pricing at deployment level for router
* test: add unit tests
* fix(router.py): show custom pricing on UI
check correct model str
* fix: fix linting error
* docs(custom_pricing.md): clarify custom pricing for proxy
Fixes https://github.com/BerriAI/litellm/issues/8573#issuecomment-2790420740
* test: update code qa test
* fix: cleanup traceback
* fix: handle litellm param custom pricing
* test: update test
* fix(cost_calculator.py): add router model id to list of potential model names
* fix(cost_calculator.py): fix router model id check
* fix: router.py - maintain older model registry approach
* fix: fix ruff check
* fix(router.py): router get deployment info
add custom values to mapped dict
* test: update test
* fix(utils.py): update only if value is non-null
* test: add unit test
* fix(litellm_proxy/chat/transformation.py): support 'thinking' param
Fixes https://github.com/BerriAI/litellm/issues/9380
* feat(azure/gpt_transformation.py): add azure audio model support
Closes https://github.com/BerriAI/litellm/issues/6305
* fix(utils.py): use provider_config in common functions
* fix(utils.py): add missing provider configs to get_chat_provider_config
* test: fix test
* fix: fix path
* feat(utils.py): make bedrock invoke nova config baseconfig compatible
* fix: fix linting errors
* fix(azure_ai/transformation.py): remove buggy optional param filtering for azure ai
Removes incorrect check for tool choice support when calling azure ai - it prevented calling models with response_format unless they were on the litellm model cost map
* fix(amazon_cohere_transformation.py): fix bedrock invoke cohere transformation to inherit from coherechatconfig
* test: fix azure ai tool choice mapping
* fix: fix model cost map to add 'supports_tool_choice' to cohere models
* fix(get_supported_openai_params.py): check if custom llm provider in llm providers
* fix(get_supported_openai_params.py): fix llm provider in list check
* fix: fix ruff check errors
* fix: support defs when calling bedrock nova
* fix(factory.py): fix test
* test: move test to just checking async
* fix(transformation.py): handle function call with no schema
* fix(utils.py): handle pydantic base model in message tool calls
Fix https://github.com/BerriAI/litellm/issues/9321
* fix(vertex_and_google_ai_studio.py): handle tools=[]
Fixes https://github.com/BerriAI/litellm/issues/9080
* test: remove max token restriction
* test: fix basic test
* fix(get_supported_openai_params.py): fix check
* fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0
* fix: fix test
* fix: parse out empty dictionary on dbrx streaming + tool calls
* fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param
* fix: fix ruff check
* fix: handle no strict in function
* fix: revert bedrock change - handle in separate PR
* test: fix import for test
* fix: fix bad error string
* docs: cleanup files docs
* fix(files/main.py): cleanup error string
* style: initial commit with a provider/config pattern for files api
google ai studio files api onboarding
* fix: test
* feat(gemini/files/transformation.py): support gemini files api response transformation
* fix(gemini/files/transformation.py): return file id as gemini uri
allows id to be passed in to chat completion request, just like openai
* feat(llm_http_handler.py): support async route for files api on llm_http_handler
* fix: fix linting errors
* fix: fix model info check
* fix: fix ruff errors
* fix: fix linting errors
* Revert "fix: fix linting errors"
This reverts commit 926a5a527f.
* fix: fix linting errors
* test: fix test
* test: fix tests
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* fix: initial commit for adding provider model discovery to gemini
* feat(gemini/): add model discovery for gemini/ route
* docs(set_keys.md): update docs to show you can check available gemini models as well
* feat(anthropic/): add model discovery for anthropic api key
* feat(xai/): add model discovery for XAI
enables checking what models an xai key can call
* ci: bump ci config yml
* fix(topaz/common_utils.py): fix linting error
* fix: fix linting error for python38
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
* feat(batches/): fix batch cost calculation - ensure it's accurate
use the correct cost value - previously defaulted to non-batch cost
* feat(batch_utils.py): log batch models to spend logs + standard logging payload
makes it easy to understand how cost was calculated
* fix: fix stored payload for test
* test: fix test
* feat: initial commit - enable dev to see translated request
* feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm
* feat(transform_request.tsx): allow user to see their transformed request
* refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body
easier to render each individually on UI vs. extracting from combined string
* feat: transform_request.tsx
working e2e raw request viewing
* fix(litellm_logging.py): fix transform viewing for bedrock models
* fix(litellm_logging.py): don't return sensitive headers in raw request headers
prevent accidental leak
* feat(transform_request.tsx): style improvements
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route
move to using anthropic config as base
* fix(utils.py): expose anthropic config via providerconfigmanager
* fix(llm_http_handler.py): support json mode on async completion calls
* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke
* fix(anthropic/): handle 'response_format: {"type": "text"}' + migrate amazon claude 3 invoke config to inherit from anthropic config
Prevents error when passing in 'response_format: {"type": "text"}'
* test: fix test
* fix(utils.py): fix base invoke provider check
* fix(anthropic_claude3_transformation.py): don't pass 'stream' param
* fix: fix linting errors
* fix(converse_transformation.py): handle response_format type=text for converse
* fix(o_series_transformation.py): fix optional param check for o-series models
o3-mini and o1 do not support parallel tool calling
* fix(utils.py): support 'drop_params' for 'thinking' param across models
allows switching to older claude versions (or non-anthropic models) with the param safely dropped
* fix: fix passing thinking param in optional params
allows dropping the 'thinking' param where not applicable
* test: update old model
* fix(utils.py): fix linting errors
* fix(main.py): add param to acompletion