* test: move test to just checking async
* fix(transformation.py): handle function call with no schema
* fix(utils.py): handle pydantic base model in message tool calls
Fix https://github.com/BerriAI/litellm/issues/9321
* fix(vertex_and_google_ai_studio.py): handle tools=[]
Fixes https://github.com/BerriAI/litellm/issues/9080
* test: remove max token restriction
* test: fix basic test
* fix(get_supported_openai_params.py): fix check
* fix(converse_transformation.py): support fake streaming for meta.llama3-3-70b-instruct-v1:0
* fix: fix test
* fix: parse out empty dictionary on dbrx streaming + tool calls
* fix(handle-'strict'-param-when-calling-fireworks-ai): fireworks ai does not support 'strict' param
* fix: fix ruff check
'
* fix: handle no strict in function
* fix: revert bedrock change - handle in separate PR
* fix(internal_user_endpoints.py): cleanup unused variables on beta endpoint
no team/org split on daily user endpoint
* build(model_prices_and_context_window.json): gemini-2.0-flash supports audio input
* feat(gemini/transformation.py): support passing audio input to gemini
* test: fix test
* fix(gemini/transformation.py): support audio input as a url
enables passing google cloud bucket urls
* fix(gemini/transformation.py): support explicitly passing format of file
* fix(gemini/transformation.py): expand support for inferred file types from url
* fix(sagemaker/completion/transformation.py): fix special token error when counting sagemaker tokens
* test: fix import
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object
enables accurate cost tracking
* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it
Google has moved away from this for gemini-2.0 models
* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough
* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token
enables vertex ai cost tracking to work with audio tokens
* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set
* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token
more consistent behaviour across providers
* test: add unit test for gemini audio token cost calculation
* ci: bump ci config
* test: fix test
* Added support for top_logprobs in vertex gemini models
* Testing for top_logprobs feature in vertexai
* Update litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
* refactor(tests/): refactor testing to be in correct repo
---------
Co-authored-by: Aditya Thaker <adityathaker28@gmail.com>
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
* fix(transformation.py): support a 'format' parameter for image's
allow user to specify mime type
* fix: pass mimetype via 'format' param
* feat(gemini/chat/transformation.py): support 'format' param for gemini
* fix(factory.py): support 'format' param on sync bedrock converse calls
* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls
* refactor(factory.py): move to supporting 'format' param in base helper
ensures consistency in param support
* feat(gpt_transformation.py): filter out 'format' param
don't send invalid param to openai
* fix(gpt_transformation.py): fix translation
* fix: fix translation error
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url
* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic
deduplicates code
* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages
* docs(prompt_management.md): update prompt management to be in beta
given feedback - this still needs to be revised (e.g. passing in user message, not ignoring)
* refactor(prompt_management_base.py): introduce base class for prompt management
allows consistent behaviour across prompt management integrations
* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base
* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set
allows tracking what prompt was used for what purpose
* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse
allows logging prompt id / prompt variables to langfuse
* test: fix test
* fix(router.py): cleanup unused imports
* fix: fix linting error
* fix: fix trace param typing
* fix: fix linting errors
* fix: fix code qa check
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model
* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests
skip as o1 on azure doesn't support tool calling yet
* fix: initial commit of azure o1 handler using openai caller
simplifies calling + allows fake streaming logic alr. implemented for openai to just work
* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models
azure does not currently support streaming for o1
* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info
enables user to toggle on when azure allows o1 streaming without needing to bump versions
* style(router.py): remove 'give feedback/get help' messaging when router is used
Prevents noisy messaging
Closes https://github.com/BerriAI/litellm/issues/5942
* fix(types/utils.py): handle none logprobs
Fixes https://github.com/BerriAI/litellm/issues/328
* fix(exception_mapping_utils.py): fix error str unbound error
* refactor(azure_ai/): move to openai_like chat completion handler
allows for easy swapping of api base url's (e.g. ai.services.com)
Fixes https://github.com/BerriAI/litellm/issues/7275
* refactor(azure_ai/): move to base llm http handler
* fix(azure_ai/): handle differing api endpoints
* fix(azure_ai/): make sure all unit tests are passing
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting error
* fix: fix linting errors
* fix(azure_ai/transformation.py): handle extra body param
* fix(azure_ai/transformation.py): fix max retries param handling
* fix: fix test
* test(test_azure_o1.py): fix test
* fix(llm_http_handler.py): support handling azure ai unprocessable entity error
* fix(llm_http_handler.py): handle sync invalid param error for azure ai
* fix(azure_ai/): streaming support with base_llm_http_handler
* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai
* fix: fix linting errors
* fix(llm_http_handler.py): fix linting error
* fix(azure_ai/): handle cohere tool call invalid index param error
* refactor(utils.py): migrate amazon titan config to base config
* refactor(utils.py): refactor bedrock meta invoke model translation to use base config
* refactor(utils.py): move bedrock ai21 to base config
* refactor(utils.py): move bedrock cohere to base config
* refactor(utils.py): move bedrock mistral to use base config
* refactor(utils.py): move all provider optional param translations to using a config
* docs(clientside_auth.md): clarify how to pass vertex region to litellm proxy
* fix(utils.py): handle scenario where custom llm provider is none / empty
* fix: fix get config
* test(test_otel_load_tests.py): widen perf margin
* fix(utils.py): fix get provider config check to handle custom llm's
* fix(utils.py): fix check
* build(model_prices_and_context_window.json): add gemini-1.5-flash context caching
* fix(context_caching/transformation.py): just use last identified cache point
Fixes https://github.com/BerriAI/litellm/issues/6738
* fix(context_caching/transformation.py): pick first contiguous block - handles system message error from google
Fixes https://github.com/BerriAI/litellm/issues/6738
* fix(vertex_ai/gemini/): track context caching tokens
* refactor(gemini/): place transformation.py inside `chat/` folder
make it easy for user to know we support the equivalent endpoint
* fix: fix import
* refactor(vertex_ai/): move vertex_ai cost calc inside vertex_ai/ folder
make it easier to see cost calculation logic
* fix: fix linting errors
* fix: fix circular import
* feat(gemini/cost_calculator.py): support gemini context caching cost calculation
generifies anthropic's cost calculation function and uses it across anthropic + gemini
* build(model_prices_and_context_window.json): add cost tracking for gemini-1.5-flash-002 w/ context caching
Closes https://github.com/BerriAI/litellm/issues/6891
* docs(gemini.md): add gemini context caching architecture diagram
make it easier for user to understand how context caching works
* docs(gemini.md): link to relevant gemini context caching code
* docs(gemini/context_caching): add readme in github, make it easy for dev to know context caching is supported + where to go for code
* fix(llm_cost_calc/utils.py): handle gemini 128k token diff cost calc scenario
* fix(deepseek/cost_calculator.py): support deepseek context caching cost calculation
* test: fix test
* fix(factory.py): skip empty text blocks for bedrock user messages
Fixes https://github.com/BerriAI/litellm/issues/7169
* Add support for Gemini 2.0 GoogleSearch tool (#7257)
* Add support for google_search tool in gemini 2.0
* Add/modify tests
* Fix grounding check
* Remove 2.0 grounding test; exclude experimental model in VERTEX_MODELS_TO_NOT_TEST
* Swap order of tools
* DFix formatting
* fix(get_api_base.py): return api base in streaming response
Fixes https://github.com/BerriAI/litellm/issues/7249
Closes https://github.com/BerriAI/litellm/pull/7250
* fix(cost_calculator.py): only set base model to model if not none
Fixes https://github.com/BerriAI/litellm/issues/7223
* fix(cost_calculator.py): enforce stricter order when picking model for cost calculation
* fix(cost_calculator.py): fix '_select_model_name_for_cost_calc' to return model name with region name prefix if provided
* fix(utils.py): fix 'get_model_info()' to handle edge case where model name starts with custom llm provider AND custom llm provider is given
* fix(cost_calculator.py): handle `custom_llm_provider-` scenario
* fix(cost_calculator.py): e2e working tts cost tracking
ensures initial message is passed in, to cost calculator
* fix(factory.py): suppress linting errors
* fix(cost_calculator.py): strip llm provider from model name after selecting cost calc model
* fix(litellm_logging.py): store initial request in 'input' field + accept base_model to be passed in litellm_params directly
* test: handle none env var value in flaky test
* fix(litellm_logging.py): fix linting errors
---------
Co-authored-by: Sam B <samlingx@gmail.com>