* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
* feat(bedrock/rerank): infer model region if model given as arn
* test: add unit testing to ensure bedrock region name inferred from arn on rerank
* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result
Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137
* test(test_bedrock_completion.py): add testing for bedrock cohere rerank
* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking
* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map
* fix(cost_calculator.py): bedrock/common_utils.py
get base model from model w/ arn -> handles rerank model
* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing
* feat(bedrock/rerank): migrate bedrock config to basererank config
* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"
This reverts commit 84fae1f167.
* test: add testing to ensure large doc / queries are correctly counted
* Revert "test: add testing to ensure large doc / queries are correctly counted"
This reverts commit 4337f1657e.
* fix(migrate-jina-ai-to-rerank-config): enables cost tracking
* refactor(jina_ai/): finish migrating jina ai to base rerank config
enables cost tracking
* fix(jina_ai/rerank): e2e jina ai rerank cost tracking
* fix: cleanup dead code
* fix: fix python3.8 compatibility error
* test: fix test
* test: add e2e testing for azure ai rerank
* fix: fix linting error
* test: mark cohere as flaky
* initial transform for invoke
* invoke transform_response
* working - able to make request
* working get_complete_url
* working - invoke now runs on llm_http_handler
* fix unused imports
* track litellm overhead ms
* working stream request
* sign_request transform
* sign_request update
* use has_async_custom_stream_wrapper property
* use get_async_custom_stream_wrapper in base llm http handler
* fix make_call in invoke handler
* fix invoke with streaming get_async_custom_stream_wrapper
* working bedrock async streaming with invoke
* fix make call handler for bedrock
* test_all_model_configs
* fix test_bedrock_custom_prompt_template
* sync streaming for bedrock invoke
* fix _add_stream_param_to_request_body
* test_async_text_completion_bedrock
* fix transform_request
* fix get_supported_openai_params
* fix test supports tool choice
* fix test_supports_tool_choice
* add unit test coverage for bedrock invoke transform
* fix location of transformation files
* update import loc
* fix bedrock invoke unit tests
* fix import for max completion tokens
* build: ensure all regional bedrock models have same supported values as base bedrock model
prevents drift
* test(base_llm_unit_tests.py): add testing for nested pydantic objects
* fix(test_utils.py): add test_get_potential_model_names
* fix(anthropic/chat/transformation.py): support nested pydantic objects
Fixes https://github.com/BerriAI/litellm/issues/7755
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model
* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests
skip as o1 on azure doesn't support tool calling yet
* fix: initial commit of azure o1 handler using openai caller
simplifies calling + allows fake streaming logic alr. implemented for openai to just work
* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models
azure does not currently support streaming for o1
* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info
enables user to toggle on when azure allows o1 streaming without needing to bump versions
* style(router.py): remove 'give feedback/get help' messaging when router is used
Prevents noisy messaging
Closes https://github.com/BerriAI/litellm/issues/5942
* fix(types/utils.py): handle none logprobs
Fixes https://github.com/BerriAI/litellm/issues/328
* fix(exception_mapping_utils.py): fix error str unbound error
* refactor(azure_ai/): move to openai_like chat completion handler
allows for easy swapping of api base url's (e.g. ai.services.com)
Fixes https://github.com/BerriAI/litellm/issues/7275
* refactor(azure_ai/): move to base llm http handler
* fix(azure_ai/): handle differing api endpoints
* fix(azure_ai/): make sure all unit tests are passing
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting error
* fix: fix linting errors
* fix(azure_ai/transformation.py): handle extra body param
* fix(azure_ai/transformation.py): fix max retries param handling
* fix: fix test
* test(test_azure_o1.py): fix test
* fix(llm_http_handler.py): support handling azure ai unprocessable entity error
* fix(llm_http_handler.py): handle sync invalid param error for azure ai
* fix(azure_ai/): streaming support with base_llm_http_handler
* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai
* fix: fix linting errors
* fix(llm_http_handler.py): fix linting error
* fix(azure_ai/): handle cohere tool call invalid index param error
* refactor(utils.py): migrate amazon titan config to base config
* refactor(utils.py): refactor bedrock meta invoke model translation to use base config
* refactor(utils.py): move bedrock ai21 to base config
* refactor(utils.py): move bedrock cohere to base config
* refactor(utils.py): move bedrock mistral to use base config
* refactor(utils.py): move all provider optional param translations to using a config
* docs(clientside_auth.md): clarify how to pass vertex region to litellm proxy
* fix(utils.py): handle scenario where custom llm provider is none / empty
* fix: fix get config
* test(test_otel_load_tests.py): widen perf margin
* fix(utils.py): fix get provider config check to handle custom llm's
* fix(utils.py): fix check
* feat(aws_base_llm.py): prevents recreating boto3 credentials during high traffic
Leads to 100ms perf boost in local testing
* fix(base_aws_llm.py): fix credential caching check to see if token is set
* refactor(bedrock/chat): separate converse api and invoke api + isolate converse api transformation logic
Make it easier to see how requests are transformed for /converse
* fix: fix imports
* fix(bedrock/embed): fix reordering of headers
* fix(base_aws_llm.py): fix get credential logic
* fix(converse_handler.py): fix ai21 streaming response
* add max_completion_tokens
* add max_completion_tokens
* add max_completion_tokens support for OpenAI models
* add max_completion_tokens param
* add max_completion_tokens for bedrock converse models
* add test for converse maxTokens
* fix openai o1 param mapping test
* move test optional params
* add max_completion_tokens for anthropic api
* fix conftest
* add max_completion tokens for vertex ai partner models
* add max_completion_tokens for fireworks ai
* add max_completion_tokens for hf rest api
* add test for param mapping
* add param mapping for vertex, gemini + testing
* predibase is the most unstable and unusable llm api in prod, can't handle our ci/cd
* add max_completion_tokens to openai supported params
* fix fireworks ai param mapping
* refactor(bedrock): initial commit to refactor bedrock to a folder
Improve code readability + maintainability
* refactor: more refactor work
* fix: fix imports
* feat(bedrock/embeddings.py): support translating embedding into amazon embedding formats
* fix: fix linting errors
* test: skip test on end of life model
* fix(cohere/embed.py): fix linting error
* fix(cohere/embed.py): fix typing
* fix(cohere/embed.py): fix post-call logging for cohere embedding call
* test(test_embeddings.py): fix error message assertion in test