* Add date picker to usage tab + Add reasoning_content token tracking across all providers on streaming (#9722)
* feat(new_usage.tsx): add date picker for new usage tab
allow user to look back on their usage data
* feat(anthropic/chat/transformation.py): report reasoning tokens in completion token details
allows usage tracking on how many reasoning tokens are actually being used
* feat(streaming_chunk_builder.py): return reasoning_tokens in anthropic/openai streaming response
allows tracking reasoning_token usage across providers
* Fix update team metadata + fix bulk adding models on Ui (#9721)
* fix(handle_add_model_submit.tsx): fix bulk adding models
* fix(team_info.tsx): fix team metadata update
Fixes https://github.com/BerriAI/litellm/issues/9689
* (v0) Unified file id - allow calling multiple providers with same file id (#9718)
* feat(files_endpoints.py): initial commit adding 'target_model_names' support
allow developer to specify all the models they want to call with the file
* feat(files_endpoints.py): return unified files endpoint
* test(test_files_endpoints.py): add validation test - if invalid purpose submitted
* feat: more updates
* feat: initial working commit of unified file id translation
* fix: additional fixes
* fix(router.py): remove model replace logic in jsonl on acreate_file
enables file upload to work for chat completion requests as well
* fix(files_endpoints.py): remove whitespace around model name
* fix(azure/handler.py): return acreate_file with correct response type
* fix: fix linting errors
* test: fix mock test to run on github actions
* fix: fix ruff errors
* fix: fix file too large error
* fix(utils.py): remove redundant var
* test: modify test to work on github actions
* test: update tests
* test: more debug logs to understand ci/cd issue
* test: fix test for respx
* test: skip mock respx test
fails on ci/cd - not clear why
* fix: fix ruff check
* fix: fix test
* fix(model_connection_test.tsx): fix linting error
* test: update unit tests
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
* test(tests): add unit testing for litellm_proxy integration
* fix(cost_calculator.py): fix tracking cost in sdk when calling proxy
* fix(main.py): respect litellm.api_base on `vertex_ai/` and `gemini/` routes
* fix(main.py): consistently support custom api base across gemini + vertexai on embedding + completion
* feat(vertex_ai/): test
* fix: fix linting error
* test: set api base as None before starting loadtest
* fix(o_series_transformation.py): fix optional param check for o-series models
o3-mini and o-1 do not support parallel tool calling
* fix(utils.py): support 'drop_params' for 'thinking' param across models
allows switching to older claude versions (or non-anthropic models) and param to be safely dropped
* fix: fix passing thinking param in optional params
allows dropping thinking_param where not applicable
* test: update old model
* fix(utils.py): fix linting errors
* fix(main.py): add param to acompletion
* fix(main.py): fix key leak error when unknown provider given
don't return passed in args if unknown route on embedding
* fix(main.py): remove instances of {args} being passed in exception
prevent potential key leaks
* test(code_coverage/prevent_key_leaks_in_codebase.py): ban usage of {args} in codebase
* fix: fix linting errors
* fix: remove unused variable
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a support azure param
Closes https://github.com/BerriAI/litellm/issues/8500
* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model
* style: cleanup invalid json trailing commma
* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template
enables passing complete tokenizer config of model to litellm
Allows calling deepseek on bedrock with the correct prompt template
* fix(utils.py): fix register_prompt_template for custom model names
* test(test_prompt_factory.py): fix test
* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model
* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls
enables proxy admin to set base model for ft bedrock deepseek model
* feat(bedrock/invoke): support deepseek_r1 route for bedrock
makes it easy to apply the right chat template to that call
* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work
* test(test_completion.py): add e2e mock test for bedrock deepseek
* docs(bedrock.md): document new deepseek_r1 route for bedrock
allows us to use the right config
* fix(exception_mapping_utils.py): catch read operation timeout
* fix(azure.py): ensure max_retries=0 is respected
Fixes https://github.com/BerriAI/litellm/issues/6129
* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0
* test(test_azure_openai.py): add unit testing for azure_text/ route
* fix(azure.py): fix passing max retries on streaming
* fix(azure.py): fix azure max retries on async completion + streaming
* fix(completion/handler.py): fix azure text async completion + streaming
* test(test_azure_openai.py): ensure azure openai max retries always respected
* test(test_azure_o_series.py): add testing to ensure max retries always respected
* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)
* Update model_prices_and_context_window.json
added gemini providers for 2.0-flash and 2.0-flash light
* Update model_prices_and_context_window.json
fixed URL
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* Convert tool use arguments to string before counting tokens (#6989)
In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine.
* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing
* build(model_prices_and_context_window.json): add gemini commercial rate limits
* fix(utils.py): fix linting error
* refactor(utils.py): refactor to maintain function size
---------
Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
* initial transform for invoke
* invoke transform_response
* working - able to make request
* working get_complete_url
* working - invoke now runs on llm_http_handler
* fix unused imports
* track litellm overhead ms
* working stream request
* sign_request transform
* sign_request update
* use has_async_custom_stream_wrapper property
* use get_async_custom_stream_wrapper in base llm http handler
* fix make_call in invoke handler
* fix invoke with streaming get_async_custom_stream_wrapper
* working bedrock async streaming with invoke
* fix make call handler for bedrock
* test_all_model_configs
* fix test_bedrock_custom_prompt_template
* sync streaming for bedrock invoke
* fix _add_stream_param_to_request_body
* test_async_text_completion_bedrock
* fix transform_request
* fix get_supported_openai_params
* fix test supports tool choice
* fix test_supports_tool_choice
* add unit test coverage for bedrock invoke transform
* fix location of transformation files
* update import loc
* fix bedrock invoke unit tests
* fix import for max completion tokens
* refactor(deepseek/): move deepseek to base llm http handler
Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457
* fix(gpt_transformation.py): support stream parsing for gpt-like calls
* test(test_deepseek_completion.py): add async streaming test
* fix(gpt_transformation.py): fix import
* fix(gpt_transformation.py): return full api base and content type