* fix(invoke_handler.py): fix converse streaming - return signature + ensure consistency with anthropic api response
* build(model_prices_and_context_window.json): fix anthropic api claude-3-7 max output tokens
with beta header this is 128k
Resolves https://github.com/BerriAI/litellm/issues/8964
* feat(handler.py): handle new anthropic 'thinking_delta' block on streaming
Fixes https://github.com/BerriAI/litellm/issues/8825
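A minimal client-side sketch of consuming the new thinking deltas, assuming the `thinking` request param and the `reasoning_content` delta field referenced elsewhere in this changelog:

```python
import litellm

# Stream a claude-3-7 response with extended thinking enabled; the new
# 'thinking_delta' blocks surface as 'reasoning_content' on each delta
# (field name assumed from the reasoning_content commits below).
response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "What is 27 * 453?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
    stream=True,
)

for chunk in response:
    delta = chunk.choices[0].delta
    if getattr(delta, "reasoning_content", None):
        print("[thinking]", delta.reasoning_content)
    if delta.content:
        print(delta.content, end="")
```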
* fix(transformation.py): support a 'format' parameter for images
allow user to specify mime type
* fix: pass mimetype via 'format' param
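A usage sketch, assuming the `format` key sits alongside `url` inside the `image_url` object (inferred from these commits; url is hypothetical):

```python
import litellm

# Explicitly pass the image mime type via the new 'format' key, useful
# when the URL has no file extension for the provider to infer it from.
response = litellm.completion(
    model="gemini/gemini-2.0-flash",  # any provider from the commits below
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://example.com/photo",  # hypothetical url
                    "format": "image/png",
                },
            },
        ],
    }],
)
```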
* feat(gemini/chat/transformation.py): support 'format' param for gemini
* fix(factory.py): support 'format' param on sync bedrock converse calls
* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls
* refactor(factory.py): move to supporting 'format' param in base helper
ensures consistency in param support
* feat(gpt_transformation.py): filter out 'format' param
don't send invalid param to openai
* fix(gpt_transformation.py): fix translation
* fix: fix translation error
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* test: update test to enforce signature found
* refactor: rename signature param from 'signature_delta' to 'signature'
keeps it in sync with anthropic
* fix: fix linting error
---------
Co-authored-by: Martin Krasser <krasserm@googlemail.com>
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'
* feat(batches/): ensure batches logs are written to db
makes the batches response dict-compatible
* fix(cost_calculator.py): handle batch response being a dictionary
* fix(batches/main.py): modify retrieve endpoints to use @client decorator
enables logging to work on retrieve call
* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible
* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type
create batch and retrieve batch share the same id
* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted
* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline
ensures cost is always logged to db
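A sketch of the flow this refactor targets; the file id is hypothetical and signatures follow litellm's public batches API:

```python
import litellm

# Create a batch, then retrieve it. After this refactor, cost tracking
# runs on the retrieve call inside the standard litellm_logging
# pipeline, so spend is written to the db like any other request.
batch = litellm.create_batch(
    completion_window="24h",
    endpoint="/v1/chat/completions",
    input_file_id="file-abc123",  # hypothetical file id
    custom_llm_provider="openai",
)

retrieved = litellm.retrieve_batch(
    batch_id=batch.id,
    custom_llm_provider="openai",
)
print(retrieved.status)
```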
* fix: fix linting errors
* fix: fix linting error
* fix(common_utils.py): handle $id in response schema when calling vertex ai
Fixes issue where `$id` present in response_schema was not accepted by vertex ai
* test(test_vertex.py): add unit test to ensure $id stripped out of vertex schema
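A sketch of the stripping logic; the helper name is hypothetical, the real fix lives in vertex ai's common_utils.py:

```python
from typing import Any

def strip_unsupported_schema_keys(schema: Any, keys=("$id",)) -> Any:
    """Recursively drop json-schema keys that vertex ai rejects."""
    if isinstance(schema, dict):
        return {
            k: strip_unsupported_schema_keys(v, keys)
            for k, v in schema.items()
            if k not in keys
        }
    if isinstance(schema, list):
        return [strip_unsupported_schema_keys(v, keys) for v in schema]
    return schema

schema = {
    "$id": "root",
    "type": "object",
    "properties": {"name": {"$id": "name", "type": "string"}},
}
print(strip_unsupported_schema_keys(schema))
# {'type': 'object', 'properties': {'name': {'type': 'string'}}}
```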
* fix(anthropic_claude3_transformation.py): fix amazon anthropic claude 3 tool calling transformation on invoke route
move to using anthropic config as base
* fix(utils.py): expose anthropic config via providerconfigmanager
* fix(llm_http_handler.py): support json mode on async completion calls
* fix(invoke_handler/make_call): support json mode for anthropic called via bedrock invoke
* fix(anthropic/): handle `response_format: {"type": "text"}` + migrate amazon claude 3 invoke config to inherit from anthropic config
Prevents error when passing in `response_format: {"type": "text"}`
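A usage sketch of the case these commits unblock (model id is illustrative):

```python
import litellm

# Passing an explicit text response_format previously errored on
# anthropic / bedrock invoke routes; it should now be a no-op.
response = litellm.completion(
    model="bedrock/invoke/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[{"role": "user", "content": "Say hello."}],
    response_format={"type": "text"},
)
print(response.choices[0].message.content)
```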
* test: fix test
* fix(utils.py): fix base invoke provider check
* fix(anthropic_claude3_transformation.py): don't pass 'stream' param
* fix: fix linting errors
* fix(converse_transformation.py): handle response_format type=text for converse
* fix(http_parsing_utils.py): orjson can throw errors on some emojis in text, default to json.loads
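One way to read this fix, as a try/fallback sketch:

```python
import json

import orjson

def parse_body(raw: bytes) -> dict:
    """Try orjson first for speed; fall back to the stdlib parser for
    payloads orjson rejects (e.g. certain emoji sequences)."""
    try:
        return orjson.loads(raw)
    except orjson.JSONDecodeError:
        return json.loads(raw)
```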
* fix(sagemaker/handler.py): support passing model id on async streaming
* fix(litellm_pre_call_utils.py): Fixes https://github.com/BerriAI/litellm/issues/7237
* feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_content transformation
Closes https://github.com/BerriAI/litellm/issues/8777
* fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7
Resolves https://github.com/BerriAI/litellm/issues/8777
* feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock
* fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py
simpler logic
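A sketch of deepseek-style extraction, assuming the reasoning is wrapped in `<think>` tags:

```python
import re

THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning_content(text: str):
    """Split '<think>...</think>' reasoning out of the response text,
    returning (content, reasoning_content)."""
    match = THINK_PATTERN.search(text)
    if match is None:
        return text, None
    reasoning = match.group(1).strip()
    content = THINK_PATTERN.sub("", text, count=1).strip()
    return content, reasoning

content, reasoning = split_reasoning_content(
    "<think>27 * 453 = 12231</think>The answer is 12231."
)
print(reasoning)  # 27 * 453 = 12231
print(content)    # The answer is 12231.
```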
* feat(bedrock/): fix streaming to return blocks in consistent format
* fix: fix linting error
* test: fix test
* feat(factory.py): fix bedrock thinking block translation on tool calling
allows passing the thinking blocks back to bedrock for tool calling
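A sketch of the multi-turn tool-calling flow this enables; the `thinking_blocks` field name is taken from this changelog, everything else is illustrative:

```python
import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

first = litellm.completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=messages,
    tools=tools,
    thinking={"type": "enabled", "budget_tokens": 1024},
)
msg = first.choices[0].message

# Replay the assistant turn including its thinking blocks so bedrock
# accepts the follow-up tool-result turn.
messages.append({
    "role": "assistant",
    "content": msg.content,
    "tool_calls": msg.tool_calls,
    "thinking_blocks": getattr(msg, "thinking_blocks", None),
})
messages.append({
    "role": "tool",
    "tool_call_id": msg.tool_calls[0].id,
    "content": "18°C, clear skies",
})

second = litellm.completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=messages,
    tools=tools,
    thinking={"type": "enabled", "budget_tokens": 1024},
)
```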
* fix(types/utils.py): don't exclude provider_specific_fields on model dump
ensures consistent responses
* fix: fix linting errors
* fix(convert_dict_to_response.py): pass reasoning_content on root
* fix: test
* fix(streaming_handler.py): add helper util for setting model id
* fix(streaming_handler.py): fix setting model id on model response stream chunk
* fix(streaming_handler.py): fix linting error
* fix(streaming_handler.py): fix linting error
* fix(types/utils.py): add provider_specific_fields to model stream response
* fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response
* fix(streaming_handler.py): fix check
* fix: fix test
* fix(types/utils.py): ensure messages content is always openai compatible
* fix(types/utils.py): fix delta object to always be openai compatible
only introduce new params if variable exists
* test: fix bedrock nova tests
* test: skip flaky test
* test: skip flaky test in ci/cd
* fix(o_series_transformation.py): fix optional param check for o-series models
o3-mini and o1 do not support parallel tool calling
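A sketch of the shape of the check; the helper is hypothetical:

```python
def drop_unsupported_o_series_params(model: str, optional_params: dict) -> dict:
    """o-series models reject parallel tool calling, so strip the param
    before building the request (hypothetical helper mirroring the
    fixed check)."""
    if model.startswith(("o1", "o3")):
        optional_params = {
            k: v for k, v in optional_params.items()
            if k != "parallel_tool_calls"
        }
    return optional_params

print(drop_unsupported_o_series_params("o3-mini", {"parallel_tool_calls": True}))
# {}
```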
* fix(utils.py): support 'drop_params' for 'thinking' param across models
allows switching to older claude versions (or non-anthropic models) with the param safely dropped
* fix: fix passing thinking param in optional params
allows dropping the 'thinking' param where not applicable
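A usage sketch of the drop_params behavior (model id illustrative):

```python
import litellm

# With drop_params=True the 'thinking' param is silently dropped for
# models that don't support it, so the same call works on older claude
# versions (or non-anthropic models) without erroring.
response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20241022",  # no extended thinking
    messages=[{"role": "user", "content": "hi"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
    drop_params=True,
)
```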
* test: update old model
* fix(utils.py): fix linting errors
* fix(main.py): add param to acompletion
* fix: remove aws params from bedrock embedding request body (#8618)
* fix: remove aws params from bedrock embedding request body
* fix-7548: handle aws params in base class
* test: load request data from mock call
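A sketch of the filtering these commits move into the base class; the helper name is hypothetical:

```python
def strip_aws_params(request_body: dict) -> dict:
    """aws_* keys are auth/config for litellm itself and must not be
    forwarded in the bedrock request payload."""
    return {k: v for k, v in request_body.items() if not k.startswith("aws_")}

body = {
    "inputText": "hello world",
    "aws_region_name": "us-east-1",
    "aws_secret_access_key": "...",  # elided
}
print(strip_aws_params(body))  # {'inputText': 'hello world'}
```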
* (Infra/DB) - Allow running older litellm version when out of sync with current state of DB (#8695)
* fix check migration
* clean up should_update_prisma_schema
* update test
* db_migration_disable_update_check
* Check container logs for expected message
* db_migration_disable_update_check
* test_check_migration_out_of_sync
* test_should_update_prisma_schema
* db_migration_disable_update_check
* pip install aiohttp
* ui new build
* delete deprecated code test
* bump: version 1.61.12 → 1.61.13
* Add cost tracking for rerank via bedrock (#8691)
* feat(bedrock/rerank): infer model region if model given as arn
* test: add unit testing to ensure bedrock region name inferred from arn on rerank
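A sketch of the inference; an arn's fourth `:`-separated field is the region:

```python
def region_from_bedrock_arn(model: str):
    """Infer the aws region when the model is given as an arn, e.g.
    'arn:aws:bedrock:us-west-2:123456789012:...' -> 'us-west-2'
    (hypothetical helper illustrating the feature)."""
    if not model.startswith("arn:aws:bedrock:"):
        return None
    return model.split(":")[3] or None

print(region_from_bedrock_arn(
    "arn:aws:bedrock:us-west-2:123456789012:inference-profile/cohere.rerank-v3-5:0"
))  # us-west-2
```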
* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result
Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137
* test(test_bedrock_completion.py): add testing for bedrock cohere rerank
* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking
* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map
* fix(cost_calculator.py, bedrock/common_utils.py): get base model from model given as arn -> handles rerank models
* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing
* feat(bedrock/rerank): migrate bedrock config to basererank config
* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"
This reverts commit 84fae1f167.
* test: add testing to ensure large doc / queries are correctly counted
* Revert "test: add testing to ensure large doc / queries are correctly counted"
This reverts commit 4337f1657e.
* fix: migrate jina ai to rerank config - enables cost tracking
* refactor(jina_ai/): finish migrating jina ai to base rerank config
enables cost tracking
* fix(jina_ai/rerank): e2e jina ai rerank cost tracking
* fix: cleanup dead code
* fix: fix python3.8 compatibility error
* test: fix test
* test: add e2e testing for azure ai rerank
* fix: fix linting error
* test: mark cohere as flaky
* add bedrock llama vision support + cohere / infinity rerank - 'return_documents' support (#8684)
* build(model_prices_and_context_window.json): mark bedrock llama as supporting vision based on docs
* Add price for Cerebras llama3.3-70b (#8676)
* docs(readme.md): fix contributing docs
point people to the new mock directory testing structure, s/o @vibhavbhat
* build: update contributing readme
* docs(readme.md): improve docs
* docs(readme.md): cleanup readme on tests/
* docs(README.md): cleanup doc
* feat(infinity/): support returning documents when return_documents=True
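A usage sketch (deployment name hypothetical):

```python
import litellm

# With return_documents=True the rerank results carry the original
# document text, not just indices and relevance scores.
response = litellm.rerank(
    model="infinity/bge-reranker-base",  # hypothetical deployment
    query="What is the capital of France?",
    documents=["Berlin is in Germany.", "Paris is the capital of France."],
    return_documents=True,
)
for result in response.results:
    print(result["relevance_score"], result["document"])
```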
* test(test_rerank.py): add e2e testing for cohere rerank
* fix: fix linting errors
* fix(together_ai/): fix together ai transformation
* fix: fix linting error
* fix: fix linting errors
* fix: fix linting errors
* test: mark cohere as flaky
* build: fix model supports check
* test: fix test
* test: mark flaky test
* fix: fix test
* test: fix test
---------
Co-authored-by: Yury Koleda <fut.wrk@gmail.com>
* test: fix test
* fix: remove unused import
* bump: version 1.61.13 → 1.61.14
* Correct spelling in user_management_heirarchy.md (#8716)
Fixing irritating typo -- page and image names would also need to be updated
* (Feat) - UI, Allow sorting models by Created_At and all other columns on the UI (#8725)
* order models by created at
* use existing table component on models page
* sorting for created at
* ui clean up models page
* remove provider filter
* fix columns sorting
* decent switching
* ui fix models page
* (UI) Edit Model flow improvements (#8729)
* order models by created at
* use existing table component on models page
* sorting for created at
* ui clean up models page
* remove provider filter
* fix columns sorting
* decent switching
* ui fix models page
* show edit / delete button on root of table
* clean up columns
* working edit model flow
* decent working model edit page
* fix edit model
* show created at and created by
* ui easy model edit flow
* clean up columns
* ui clean up updated at
* fix model datatable
* ui new build
* bump: version 1.61.14 → 1.61.15
* Support arize phoenix on litellm proxy (#7756) (#8715)
* Update opentelemetry.py
wip
* Update test_opentelemetry_unit_tests.py
* fix a few paths and tests
* fix path
* Update litellm_logging.py
* restore accidentally removed code
* Add type for protocol
* Add and update tests
* minor changes
* update and add additional arize phoenix test
* update existing test
* address feedback
* use standard_logging_object
* address feedback
Co-authored-by: Nate Mar <67926244+nate-mar@users.noreply.github.com>
* fix(amazon_deepseek_transformation.py): remove </think> from stream o… (#8717)
* fix(amazon_deepseek_transformation.py): remove </think> from stream output - clean up the user-facing stream
* fix(key_management_endpoints.py): return `/key/list` sorted by created_at
makes it easier to see created key
* style: cleanup team table
* feat(key_edit_view.tsx): support setting model specific tpm/rpm limits on keys
* Add cohere v2/rerank support (#8421) (#8605)
* Add cohere v2/rerank support (#8421)
* Support v2 endpoint cohere rerank
* Add tests and docs
* Make v1 default if old params used
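A sketch of the version selection; treating `max_chunks_per_doc` as the v1-only param is an assumption, not taken from this diff:

```python
def pick_cohere_rerank_endpoint(optional_params: dict) -> str:
    """Default to v1 when v1-only params are present, else use v2
    (hypothetical helper illustrating the fallback rule)."""
    v1_only_params = {"max_chunks_per_doc"}  # assumed v1-only param
    if v1_only_params & optional_params.keys():
        return "https://api.cohere.ai/v1/rerank"
    return "https://api.cohere.ai/v2/rerank"

print(pick_cohere_rerank_endpoint({"max_chunks_per_doc": 10}))  # .../v1/rerank
print(pick_cohere_rerank_endpoint({"top_n": 3}))                # .../v2/rerank
```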
* Update docs
* Update docs pt 2
* Update tests
* Add e2e test
* Clean up code
* Use inheritance for new config
* Fix linting issues (#8608)
* Fix cohere v2 failing test + linting (#8672)
* Fix test and unused imports
* Fix tests
* fix: fix linting errors
* test: handle tgai instability
* fix: skip service unavailable err
* test: print logs for unstable test
* test: skip unreliable tests
---------
Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
* fix(proxy/_types.py): fixes issue where internal user able to escalat… (#8740)
* fix(proxy/_types.py): fixes issue where an internal user was able to escalate their role with a ui key
Fixes https://github.com/BerriAI/litellm/issues/8029
* style: cleanup
* test: handle bedrock instability
---------
Co-authored-by: Madhukar Holla <mholla8@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Yury Koleda <fut.wrk@gmail.com>
Co-authored-by: Oskar Austegard <oskar@austegard.com>
Co-authored-by: Nate Mar <67926244+nate-mar@users.noreply.github.com>
Co-authored-by: vibhavbhat <vibhavb00@gmail.com>