* fix(azure/): support passing headers to azure openai endpoints
Fixes https://github.com/BerriAI/litellm/issues/6217
* fix(utils.py): move default tokenizer to just openai
hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls
* fix(router.py): fix pattern matching router - add generic "*" to it as well
Fixes issue where generic "*" model access group wouldn't show up
* fix(pattern_match_deployments.py): match to more specific pattern
match to more specific pattern
allows setting generic wildcard model access group and excluding specific models more easily
* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty
don't delete all router models b/c of empty list
Fixes https://github.com/BerriAI/litellm/issues/7196
* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api
* fix(fireworks_ai/): support passing response_format + tool call in same message
Addresses https://github.com/BerriAI/litellm/issues/7135
* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"
This reverts commit 6a30dc6929.
* test: fix test
* fix(replicate/): fix replicate default retry/polling logic
* test: add unit testing for router pattern matching
* test: update test to use default oai tokenizer
* test: mark flaky test
* test: skip flaky test
* add unit test for test_datadog_static_methods
* docs dd vars
* test_datadog_payload_environment_variables
* test_datadog_static_methods
* docs env vars
* fix table
* fix(acompletion): support fallbacks on acompletion
allows health checks for wildcard routes to use fallback models
* test: update cohere generate api testing
* add max tokens to health check (#7000)
* fix: fix health check test
* test: update testing
---------
Co-authored-by: Cameron <561860+wallies@users.noreply.github.com>
* refactor(fireworks_ai/): inherit from openai like base config
refactors fireworks ai to use a common config
* test: fix import in test
* refactor(watsonx/): refactor watsonx to use llm base config
refactors chat + completion routes to base config path
* fix: fix linting error
* refactor: inherit base llm config for oai compatible routes
* test: fix test
* test: fix test
* refactor(fireworks_ai/): inherit from openai like base config
refactors fireworks ai to use a common config
* test: fix import in test
* refactor(watsonx/): refactor watsonx to use llm base config
refactors chat + completion routes to base config path
* fix: fix linting error
* test: fix test
* fix: fix test
* fix use new format for Cohere config
* fix base llm http handler
* Litellm code qa common config (#7116)
* feat(base_llm): initial commit for common base config class
Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132
* feat(base_llm/): add transform request/response abstract methods to base config class
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* use base transform helpers
* use base_llm_http_handler for cohere
* working cohere using base llm handler
* add async cohere chat completion support on base handler
* fix completion code
* working sync cohere stream
* add async support cohere_chat
* fix types get_model_response_iterator
* async / sync tests cohere
* feat cohere using base llm class
* fix linting errors
* fix _abc error
* add cohere params to transformation
* remove old cohere file
* fix type error
* fix merge conflicts
* fix cohere merge conflicts
* fix linting error
* fix litellm.llms.custom_httpx.http_handler.HTTPHandler.post
* fix passing cohere specific params
---------
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
* feat(base_llm): initial commit for common base config class
Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132
* feat(base_llm/): add transform request/response abstract methods to base config class
* feat(cohere-+-clarifai): refactor integrations to use common base config class
* fix: fix linting errors
* refactor(anthropic/): move anthropic + vertex anthropic to use base config
* test: fix xai test
* test: fix tests
* fix: fix linting errors
* test: comment out WIP test
* fix(transformation.py): fix is pdf used check
* fix: fix linting error
* fix(main.py): support passing max retries to azure/openai embedding integrations
Fixes https://github.com/BerriAI/litellm/issues/7003
* feat(team_endpoints.py): allow updating team model aliases
Closes https://github.com/BerriAI/litellm/issues/6956
* feat(router.py): allow specifying model id as fallback - skips any cooldown check
Allows a default model to be checked if all models in cooldown
s/o @micahjsmith
* docs(reliability.md): add fallback to specific model to docs
* fix(utils.py): new 'is_prompt_caching_valid_prompt' helper util
Allows user to identify if messages/tools have prompt caching
Related issue: https://github.com/BerriAI/litellm/issues/6784
* feat(router.py): store model id for prompt caching valid prompt
Allows routing to that model id on subsequent requests
* fix(router.py): only cache if prompt is valid prompt caching prompt
prevents storing unnecessary items in cache
* feat(router.py): support routing prompt caching enabled models to previous deployments
Closes https://github.com/BerriAI/litellm/issues/6784
* test: fix linting errors
* feat(databricks/): convert basemodel to dict and exclude none values
allow passing pydantic message to databricks
* fix(utils.py): ensure all chat completion messages are dict
* (feat) Track `custom_llm_provider` in LiteLLMSpendLogs (#7081)
* add custom_llm_provider to SpendLogsPayload
* add custom_llm_provider to SpendLogs
* add custom llm provider to SpendLogs payload
* test_spend_logs_payload
* Add MLflow to the side bar (#7031)
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
* (bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082)
* catch DB_CONNECTION_ERROR_TYPES
* fix DB retry mechanism for SpendLog updates
* use DB_CONNECTION_ERROR_TYPES in auth checks
* fix exp back off for writing SpendLogs
* use _raise_failed_update_spend_exception to ensure errors print as NON blocking
* test_update_spend_logs_multiple_batches_with_failure
* (Feat) Add StructuredOutputs support for Fireworks.AI (#7085)
* fix model cost map fireworks ai "supports_response_schema": true,
* fix supports_response_schema
* fix map openai params fireworks ai
* test_map_response_format
* test_map_response_format
* added deepinfra/Meta-Llama-3.1-405B-Instruct (#7084)
* bump: version 1.53.9 → 1.54.0
* fix deepinfra
* litellm db fixes LiteLLM_UserTable (#7089)
* ci/cd queue new release
* fix llama-3.3-70b-versatile
* refactor - use consistent file naming convention `AI21/` -> `ai21` (#7090)
* fix refactor - use consistent file naming convention
* ci/cd run again
* fix naming structure
* fix use consistent naming (#7092)
---------
Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: ali sayyah <ali.sayyah2@gmail.com>