* add max_completion_tokens
* add max_completion_tokens
* add max_completion_tokens support for OpenAI models
* add max_completion_tokens param
* add max_completion_tokens for bedrock converse models
* add test for converse maxTokens
* fix openai o1 param mapping test
* move test optional params
* add max_completion_tokens for anthropic api
* fix conftest
* add max_completion tokens for vertex ai partner models
* add max_completion_tokens for fireworks ai
* add max_completion_tokens for hf rest api
* add test for param mapping
* add param mapping for vertex, gemini + testing
* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd
* add max_completion_tokens to openai supported params
* fix fireworks ai param mapping
* fix(utils.py): return citations for perplexity streaming
Fixes https://github.com/BerriAI/litellm/issues/5535
* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)
* fix(anthropic/chat.py): support fallbacks for anthropic streaming
Fixes https://github.com/BerriAI/litellm/issues/5512
* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)
* fix: fix linting errors
* fix(http_handler.py): fix raise_for_status error handling
* test: retry flaky test
* fix otel type
* fix(bedrock/embed): fix error raising
* test(test_openai_batches_and_files.py): skip azure batches test (for now) quota exceeded
* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* All `model_group_alias` should show up in `/models`, `/model/info` , `/model_group/info` (#5539)
* fix(router.py): support returning model_alias model names in `/v1/models`
* fix(proxy_server.py): support returning model alias'es on `/model/info`
* feat(router.py): support returning model group alias for `/model_group/info`
* fix(proxy_server.py): fix linting errors
* fix(proxy_server.py): fix linting errors
* build(model_prices_and_context_window.json): add amazon titan text premier pricing information
Closes https://github.com/BerriAI/litellm/issues/5560
* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3
* fix(success_handler.py): fix linting error
* fix(success_handler.py): fix linting errors
* fix(team_endpoints.py): Allows admin to update team member budgets
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* Minor IAM AWS OIDC Improvements (#5246)
* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.
* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.
* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.
* fix(router.py): log rejected requests
Fixes https://github.com/BerriAI/litellm/issues/5498
* refactor: don't use verbose_logger.exception, if exception is raised
User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.
* fix(datadog.py): support setting datadog source as an env var
Fixes https://github.com/BerriAI/litellm/issues/5508
* docs(logging.md): add dd_source to datadog docs
* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers
* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)
* feat(anthropic.py): support 'cache_control' param for content when it is a string
* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)
This reverts commit 3fac0349c2.
* refactor: ci/cd run again
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
* feat(router.py): initial commit for loadbalancing azure batch api endpoints
Closes https://github.com/BerriAI/litellm/issues/5396
* fix(router.py): working `router.acreate_file()`
* feat(router.py): working router.acreate_batch endpoint
* feat(router.py): expose router.aretrieve_batch function
Make it easy for user to retrieve the batch information
* feat(router.py): support 'router.alist_batches' endpoint
Adds support for getting all batches across all endpoints
* feat(router.py): working loadbalancing on `/v1/files`
* feat(proxy_server.py): working loadbalancing on `/v1/batches`
* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
* feat(proxy/_types.py): add lago billing to callbacks ui
Closes https://github.com/BerriAI/litellm/issues/5472
* fix(anthropic.py): return anthropic prompt caching information
Fixes https://github.com/BerriAI/litellm/issues/5364
* feat(bedrock/chat.py): support 'json_schema' for bedrock models
Closes https://github.com/BerriAI/litellm/issues/5434
* fix(bedrock/embed/embeddings.py): support async embeddings for amazon titan models
* fix: linting fixes
* fix: handle key errors
* fix(bedrock/chat.py): fix bedrock ai21 streaming object
* feat(bedrock/embed): support bedrock embedding optional params
* fix(databricks.py): fix usage chunk
* fix(internal_user_endpoints.py): apply internal user defaults, if user role updated
Fixes issue where user update wouldn't apply defaults
* feat(slack_alerting.py): provide multiple slack channels for a given alert type
multiple channels might be interested in receiving an alert for a given type
* docs(alerting.md): add multiple channel alerting to docs
* fix(vertex_endpoints.py): fix vertex ai pass through endpoints
* test(test_streaming.py): skip model due to end of life
* feat(custom_logger.py): add special callback for model hitting tpm/rpm limits
Closes https://github.com/BerriAI/litellm/issues/4096
* refactor(bedrock): initial commit to refactor bedrock to a folder
Improve code readability + maintainability
* refactor: more refactor work
* fix: fix imports
* feat(bedrock/embeddings.py): support translating embedding into amazon embedding formats
* fix: fix linting errors
* test: skip test on end of life model
* fix(cohere/embed.py): fix linting error
* fix(cohere/embed.py): fix typing
* fix(cohere/embed.py): fix post-call logging for cohere embedding call
* test(test_embeddings.py): fix error message assertion in test
* fix(utils.py): support 'drop_params' for embedding requests
Fixes https://github.com/BerriAI/litellm/issues/5444
* feat(anthropic/cost_calculation.py): Support calculating cost for prompt caching on anthropic
* feat(types/utils.py): allows us to migrate to openai's equivalent, once that comes out
* fix: fix linting errors
* test: mark flaky test
Improved the clarity of Exceptions in supports_system_messages, supports_response_schema, supports_function_calling, and supports_parallel_function_calling. Previously, it was difficult to determine the cause of Exception logs due to vague messaging. Each case now includes a more specific and appropriate Exception message.