* feat(base_aws_llm.py): prevents recreating boto3 credentials during high traffic
Leads to 100ms perf boost in local testing
* fix(base_aws_llm.py): fix credential caching check to see if token is set
* refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic
Make it easier to see how requests are transformed for /converse
* fix: fix imports
* fix(bedrock/embed): fix reordering of headers
* fix(base_aws_llm.py): fix get credential logic
* fix(converse_handler.py): fix ai21 streaming response
* fix(health_check.py): hide sensitive keys from health check debug information
* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue
* fix(vertex_llm_base.py): fix exception message to not log credentials
* fix get model list
* fix test custom callback router
* fix embedding fallback test
* fix router retry policy on AuthErrors
* fix router test
* add test for single deployments no cooldown test prod
* add test test_single_deployment_no_cooldowns_test_prod_mock_completion_calls
* fix(proxy/utils.py): comment out auto-upsert logic in check_view_exists
Prevents proxy from failing on startup due to faulty logic
* fix(db/migration_scripts/create_views.py): fix 'DailyTagSpend' quotation on check
* fix(create_views.py): monthly global spend time period should be 30d not 20d
* fix(schema.prisma): index on startTime and endUser for efficient UI querying
* fix(caching.py): set ttl for async_increment cache
Fixes issue where ttl for redis client was not being set on increment_cache
Fixes https://github.com/BerriAI/litellm/issues/5609
* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis
Fixes https://github.com/BerriAI/litellm/issues/5609
* fix(router.py): support adding retry policy + allowed fails policy via config.yaml
* fix(router.py): don't cooldown single deployments
No point, as there's no other deployment to load balance with.
* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens
Closes https://github.com/BerriAI/litellm/issues/5605
* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs
* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set
Fixes issue where key logging would not be set if team metadata was not None
* fix(secret_managers/main.py): load environment variables correctly
Fixes issue where os.environ/ was not being loaded correctly
* test(test_router.py): fix test
* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek
* test: fix tests
* test: fix test
* test: fix test
* test: fix test
* test: fix test
* fix(cost_calculator.py): downgrade noisy warning message on cost calculation error to debug level
Fixes https://github.com/BerriAI/litellm/issues/5610
* fix(databricks/cost_calculator.py): Handles model name issues for databricks models
* fix(main.py): fix stream chunk builder for multiple tool calls
Fixes https://github.com/BerriAI/litellm/issues/5591
* fix: correctly set user_alias when passed in
Fixes https://github.com/BerriAI/litellm/issues/5612
* fix(types/utils.py): allow passing role for message object
https://github.com/BerriAI/litellm/issues/5621
* fix(litellm_logging.py): Fix langfuse logging across multiple projects
Fixes issue where langfuse logger was re-using the old logging object
* feat(proxy/_types.py): support adding key-based tags for tag-based routing
Enable tag based routing at key-level
* fix(proxy/_types.py): fix inheritance
* test(test_key_generate_prisma.py): fix test
* test: fix test
* fix(litellm_logging.py): return used callback object
* fix(main.py): pass default azure api version as alternative in completion call
Fixes API error caused by the API version
Closes https://github.com/BerriAI/litellm/issues/5584
* Fixed gemini-1.5-flash pricing (#5590)
* add /key/list endpoint
* bump: version 1.44.21 → 1.44.22
* docs architecture
* Fixed gemini-1.5-flash pricing
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix(bedrock/chat.py): fix converse api stop sequence param mapping
Fixes https://github.com/BerriAI/litellm/issues/5592
* fix(databricks/cost_calculator.py): handle databricks model name changes
Fixes https://github.com/BerriAI/litellm/issues/5597
* fix(azure.py): support azure api version 2024-08-01-preview
Closes https://github.com/BerriAI/litellm/issues/5377
* fix(proxy/_types.py): allow dev keys to call cohere /rerank endpoint
Fixes issue where only admin could call rerank endpoint
* fix(azure.py): check if model is gpt-4o
* fix(proxy/_types.py): support /v1/rerank on non-admin routes as well
* fix(cost_calculator.py): fix split on `/` logic in cost calculator
---------
Co-authored-by: F1bos <44951186+F1bos@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* feat: initial commit
* fix(proxy/auth/auth_utils.py): Allow client-side credentials to be given to the proxy (accept only if complete credentials are given)