Commit graph

2223 commits

Author SHA1 Message Date
Ishaan Jaff
6231052b18
[Bug]: Deepseek error on proxy after upgrading to 1.61.13-stable (#8860)
* fix deepseek error

* test_deepseek_provider_async_completion

* fix get_complete_url
2025-02-26 21:11:06 -08:00
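For context, a minimal sketch of a Deepseek call through the SDK — the path whose get_complete_url resolution the fix above restores. The API key is a placeholder:

```python
import litellm

# Minimal sketch of a Deepseek call through the SDK; the fix above restores
# get_complete_url so this resolves to the correct provider endpoint.
response = litellm.completion(
    model="deepseek/deepseek-chat",
    api_key="sk-...",  # placeholder
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```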
Ishaan Jaff
c07dd16d88
(UI + Backend) Fix Adding Azure, Azure AI Studio models on LiteLLM (#8856)
* fix Azure_AI_Studio

* fix flow for adding custom model names

* fix _should_use_api_key_header

* handle custom model name change

* test_azure_ai_request_format

* Azure AI Foundry (Studio)

* fix _should_use_api_key_header
2025-02-26 18:45:02 -08:00
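For reference, a hedged sketch of calling an Azure AI Foundry (Studio) model through the SDK once added; the deployment name, endpoint, and key are placeholders, not values from this commit:

```python
import litellm

# Sketch of an Azure AI Foundry (Studio) call; all identifiers are placeholders.
response = litellm.completion(
    model="azure_ai/my-deployment",                    # hypothetical deployment
    api_base="https://example.services.ai.azure.com",  # placeholder endpoint
    api_key="...",                                     # placeholder key
    messages=[{"role": "user", "content": "Hello"}],
)
```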
Krrish Dholakia
6c669734a6 fix(converse_transformation.py): fix 'thinking' param check for claude-3-7 on bedrock 2025-02-26 16:28:50 -08:00
Krish Dholakia
ab7c4d1a0e
Litellm dev bedrock anthropic 3 7 v2 (#8843)
* feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_content transformation

Closes https://github.com/BerriAI/litellm/issues/8777

* fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7

Resolves https://github.com/BerriAI/litellm/issues/8777

* feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock

* fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py

simpler logic

* feat(bedrock/): fix streaming to return blocks in consistent format

* fix: fix linting error

* test: fix test

* feat(factory.py): fix bedrock thinking block translation on tool calling

allows passing the thinking blocks back to bedrock for tool calling

* fix(types/utils.py): don't exclude provider_specific_fields on model dump

ensures consistent responses

* fix: fix linting errors

* fix(convert_dict_to_response.py): pass reasoning_content on root

* fix: test

* fix(streaming_handler.py): add helper util for setting model id

* fix(streaming_handler.py): fix setting model id on model response stream chunk

* fix(streaming_handler.py): fix linting error

* fix(streaming_handler.py): fix linting error

* fix(types/utils.py): add provider_specific_fields to model stream response

* fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response

* fix(streaming_handler.py): fix check

* fix: fix test

* fix(types/utils.py): ensure messages content is always openai compatible

* fix(types/utils.py): fix delta object to always be openai compatible

only introduce new params if variable exists

* test: fix bedrock nova tests

* test: skip flaky test

* test: skip flaky test in ci/cd
2025-02-26 16:05:33 -08:00
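A minimal sketch of the reasoning flow this merge wires up, assuming the standard Anthropic thinking payload; the token budget is illustrative:

```python
import litellm

# Claude 3.7 on Bedrock with extended thinking; reasoning_content is returned
# on the message root, and thinking_blocks keep the signed blocks needed when
# passing reasoning back to Bedrock on tool-calling turns.
response = litellm.completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=[{"role": "user", "content": "What is 27 * 43?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},  # illustrative budget
)
print(response.choices[0].message.reasoning_content)
```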
Krish Dholakia
017c482d7b
fix(o_series_transformation.py): fix optional param check for o-serie… (#8787)
* fix(o_series_transformation.py): fix optional param check for o-series models

o3-mini and o1 do not support parallel tool calling

* fix(utils.py): support 'drop_params' for 'thinking' param across models

allows switching to older claude versions (or non-anthropic models) and lets the param be safely dropped

* fix: fix passing thinking param in optional params

allows dropping thinking_param where not applicable

* test: update old model

* fix(utils.py): fix linting errors

* fix(main.py): add param to acompletion
2025-02-26 12:26:55 -08:00
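A short sketch of the drop_params behavior described above; the model name is illustrative:

```python
import litellm

# 'thinking' is not supported by this older Claude; with drop_params=True the
# param is silently dropped rather than raising an unsupported-param error.
response = litellm.completion(
    model="claude-3-5-sonnet-20240620",
    messages=[{"role": "user", "content": "Hello"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
    drop_params=True,
)
```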
Ishaan Jaff
f9cee4c46b
(Bug Fix) Using LiteLLM Python SDK with model=litellm_proxy/ for embedding, image_generation, transcription, speech, rerank (#8815)
* test_litellm_gateway_from_sdk

* fix embedding check for openai

* test litellm proxy provider

* fix image generation openai compatible models

* fix litellm.transcription

* test_litellm_gateway_from_sdk_rerank

* docs litellm python sdk

* docs litellm python sdk with proxy

* test_litellm_gateway_from_sdk_rerank

* ci/cd run again

* test_litellm_gateway_from_sdk_image_generation

* test_litellm_gateway_from_sdk_embedding

* test_litellm_gateway_from_sdk_embedding
2025-02-25 16:22:37 -08:00
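A hedged sketch of the litellm_proxy/ routes covered by this fix; the api_base, api_key, and model names are placeholders for a running proxy:

```python
import litellm

# Sketch of the litellm_proxy/ routes fixed above: non-chat endpoints routed
# through a running LiteLLM proxy. api_base and api_key are placeholders.
embedding = litellm.embedding(
    model="litellm_proxy/text-embedding-ada-002",
    input=["hello world"],
    api_base="http://localhost:4000",
    api_key="sk-1234",
)
rerank = litellm.rerank(
    model="litellm_proxy/rerank-english-v3.0",  # placeholder proxy model name
    query="hello",
    documents=["hello", "world"],
    api_base="http://localhost:4000",
    api_key="sk-1234",
)
```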
Krish Dholakia
142b195784
Add anthropic thinking + reasoning content support (#8778)
* feat(anthropic/chat/transformation.py): add anthropic thinking param support

* feat(anthropic/chat/transformation.py): support returning thinking content for anthropic on streaming responses

* feat(anthropic/chat/transformation.py): return list of thinking blocks (include block signature)

allows usage in tool call responses

* fix(types/utils.py): extract and map reasoning_content from anthropic as content str

* test: add testing to ensure thinking_blocks are returned at the root

* fix(anthropic/chat/handler.py): return thinking blocks on streaming - include signature

* feat(factory.py): handle anthropic thinking blocks translation if in assistant response

* test: handle openai internal instability

* test: handle openai audio instability

* ci: pin anthropic dep

* test: handle openai audio instability

* fix: fix linting error

* refactor(anthropic/chat/transformation.py): refactor function to remain <50 LOC

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error
2025-02-24 21:54:30 -08:00
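A minimal sketch of the thinking support added here, assuming the standard Anthropic thinking payload; model name and budget are illustrative:

```python
import litellm

# Direct Anthropic call with thinking enabled; thinking_blocks (including the
# block signature) are returned at the message root so they can be replayed
# in tool-call responses.
response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",
    messages=[{"role": "user", "content": "Is 97 prime?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)
message = response.choices[0].message
print(message.reasoning_content)  # reasoning extracted as a plain string
print(message.thinking_blocks)    # raw blocks, signatures included
```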
Ishaan Jaff
b93889660a
fix: remove aws params from bedrock embedding request body (#8618) (#8696)
* fix: remove aws params from bedrock embedding request body (#8618)

* fix: remove aws params from bedrock embedding request body

* fix-7548: handle aws params in base class

* test: load request data from mock call

* (Infra/DB) - Allow running older litellm version when out of sync with current state of DB  (#8695)

* fix check migration

* clean up should_update_prisma_schema

* update test

* db_migration_disable_update_check

* Check container logs for expected message

* db_migration_disable_update_check

* test_check_migration_out_of_sync

* test_should_update_prisma_schema

* db_migration_disable_update_check

* pip install aiohttp

* ui new build

* delete deprecated code test

* bump: version 1.61.12 → 1.61.13

* Add cost tracking for rerank via bedrock (#8691)

* feat(bedrock/rerank): infer model region if model given as arn

* test: add unit testing to ensure bedrock region name inferred from arn on rerank

* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result

Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137

* test(test_bedrock_completion.py): add testing for bedrock cohere rerank

* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking

* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map

* fix(cost_calculator.py): bedrock/common_utils.py

get base model from model w/ arn -> handles rerank model

* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing

* feat(bedrock/rerank): migrate bedrock config to basererank config

* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"

This reverts commit 84fae1f167.

* test: add testing to ensure large doc / queries are correctly counted

* Revert "test: add testing to ensure large doc / queries are correctly counted"

This reverts commit 4337f1657e.

* fix(migrate-jina-ai-to-rerank-config): enables cost tracking

* refactor(jina_ai/): finish migrating jina ai to base rerank config

enables cost tracking

* fix(jina_ai/rerank): e2e jina ai rerank cost tracking

* fix: cleanup dead code

* fix: fix python3.8 compatibility error

* test: fix test

* test: add e2e testing for azure ai rerank

* fix: fix linting error

* test: mark cohere as flaky

* add bedrock llama vision support + cohere / infinity rerank - 'return_documents' support  (#8684)

* build(model_prices_and_context_window.json): mark bedrock llama as supporting vision based on docs

* Add price for Cerebras llama3.3-70b (#8676)

* docs(readme.md): fix contributing docs

point people to new mock directory testing structure s/o @vibhavbhat

* build: update contributing readme

* docs(readme.md): improve docs

* docs(readme.md): cleanup readme on tests/

* docs(README.md): cleanup doc

* feat(infinity/): support returning documents when return_documents=True

* test(test_rerank.py): add e2e testing for cohere rerank

* fix: fix linting errors

* fix(together_ai/): fix together ai transformation

* fix: fix linting error

* fix: fix linting errors

* fix: fix linting errors

* test: mark cohere as flaky

* build: fix model supports check

* test: fix test

* test: mark flaky test

* fix: fix test

* test: fix test

---------

Co-authored-by: Yury Koleda <fut.wrk@gmail.com>

* test: fix test

* fix: remove unused import

* bump: version 1.61.13 → 1.61.14

* Correct spelling in user_management_heirarchy.md (#8716)

Fixing irritating typo -- page and image names would also need to be updated

* (Feat) - UI, Allow sorting models by Created_At and all other columns on the UI (#8725)

* order models by created at

* use existing table component on models page

* sorting for created at

* ui clean up models page

* remove provider filter

* fix columns sorting

* decent switching

* ui fix models page

* (UI) Edit Model flow improvements (#8729)

* order models by created at

* use existing table component on models page

* sorting for created at

* ui clean up models page

* remove provider filter

* fix columns sorting

* decent switching

* ui fix models page

* show edit / delete button on root of table

* clean up columns

* working edit model flow

* decent working model edit page

* fix edit model

* show created at and created by

* ui easy model edit flow

* clean up columns

* ui clean up updated at

* fix model datatable

* ui new build

* bump: version 1.61.14 → 1.61.15

* Support arize phoenix on litellm proxy (#7756) (#8715)

* Update opentelemetry.py

wip

* Update test_opentelemetry_unit_tests.py

* fix a few paths and tests

* fix path

* Update litellm_logging.py

* accidentally removed code

* Add type for protocol

* Add and update tests

* minor changes

* update and add additional arize phoenix test

* update existing test

* address feedback

* use standard_logging_object

* address feedback

Co-authored-by: Nate Mar <67926244+nate-mar@users.noreply.github.com>

* fix(amazon_deepseek_transformation.py): remove </think> from stream o… (#8717)

* fix(amazon_deepseek_transformation.py): remove </think> from stream output - cleanup user facing stream

* fix(key_management_endpoints.py): return `/key/list` sorted by created_at

makes it easier to see newly created keys

* style: cleanup team table

* feat(key_edit_view.tsx): support setting model specific tpm/rpm limits on keys

* Add cohere v2/rerank support (#8421) (#8605)

* Add cohere v2/rerank support (#8421)

* Support v2 endpoint cohere rerank

* Add tests and docs

* Make v1 default if old params used

* Update docs

* Update docs pt 2

* Update tests

* Add e2e test

* Clean up code

* Use inheritance for new config

* Fix linting issues (#8608)

* Fix cohere v2 failing test + linting (#8672)

* Fix test and unused imports

* Fix tests

* fix: fix linting errors

* test: handle tgai instability

* fix: skip service unavailable err

* test: print logs for unstable test

* test: skip unreliable tests

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>

* fix(proxy/_types.py): fixes issue where internal user able to escalat… (#8740)

* fix(proxy/_types.py): fixes issue where internal user able to escalate their role with ui key

Fixes https://github.com/BerriAI/litellm/issues/8029

* style: cleanup

* test: handle bedrock instability

---------

Co-authored-by: Madhukar Holla <mholla8@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: Yury Koleda <fut.wrk@gmail.com>
Co-authored-by: Oskar Austegard <oskar@austegard.com>
Co-authored-by: Nate Mar <67926244+nate-mar@users.noreply.github.com>
Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
2025-02-24 10:04:58 -08:00
Nitin Patel
e3455cd045
fix missing comma 2025-02-24 01:00:07 +05:30
Krish Dholakia
09462ba80c
Add cohere v2/rerank support (#8421) (#8605)
* Add cohere v2/rerank support (#8421)

* Support v2 endpoint cohere rerank

* Add tests and docs

* Make v1 default if old params used

* Update docs

* Update docs pt 2

* Update tests

* Add e2e test

* Clean up code

* Use inheritance for new config

* Fix linting issues (#8608)

* Fix cohere v2 failing test + linting (#8672)

* Fix test and unused imports

* Fix tests

* fix: fix linting errors

* test: handle tgai instability

* fix: skip service unavailable err

* test: print logs for unstable test

* test: skip unreliable tests

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
2025-02-22 22:25:29 -08:00
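A short sketch of a Cohere rerank call through the SDK; per the notes above, v1 stays the default when legacy params are used, and the model name is Cohere's public rerank model:

```python
import litellm

# Cohere rerank through the SDK; v2 endpoint support was added in this merge.
response = litellm.rerank(
    model="cohere/rerank-english-v3.0",
    query="What is the capital of France?",
    documents=["Paris is the capital of France.", "Berlin is in Germany."],
    top_n=1,
)
print(response.results)
```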
Krish Dholakia
c2aec21b4d
fix(amazon_deepseek_transformation.py): remove </think> from stream o… (#8717)
* fix(amazon_deepseek_transformation.py): remove </think> from stream output - cleanup user facing stream

* fix(key_management_endpoints.py): return `/key/list` sorted by created_at

makes it easier to see newly created keys

* style: cleanup team table

* feat(key_edit_view.tsx): support setting model specific tpm/rpm limits on keys
2025-02-22 21:46:55 -08:00
Krrish Dholakia
84eb633138 fix: remove unused import 2025-02-20 21:39:30 -08:00
Krish Dholakia
251467a525
add bedrock llama vision support + cohere / infinity rerank - 'return_documents' support (#8684)
* build(model_prices_and_context_window.json): mark bedrock llama as supporting vision based on docs

* Add price for Cerebras llama3.3-70b (#8676)

* docs(readme.md): fix contributing docs

point people to new mock directory testing structure s/o @vibhavbhat

* build: update contributing readme

* docs(readme.md): improve docs

* docs(readme.md): cleanup readme on tests/

* docs(README.md): cleanup doc

* feat(infinity/): support returning documents when return_documents=True

* test(test_rerank.py): add e2e testing for cohere rerank

* fix: fix linting errors

* fix(together_ai/): fix together ai transformation

* fix: fix linting error

* fix: fix linting errors

* fix: fix linting errors

* test: mark cohere as flaky

* build: fix model supports check

* test: fix test

* test: mark flaky test

* fix: fix test

* test: fix test

---------

Co-authored-by: Yury Koleda <fut.wrk@gmail.com>
2025-02-20 21:23:54 -08:00
Krish Dholakia
b682dc4ec8
Add cost tracking for rerank via bedrock (#8691)
* feat(bedrock/rerank): infer model region if model given as arn

* test: add unit testing to ensure bedrock region name inferred from arn on rerank

* feat(bedrock/rerank/transformation.py): include search units for bedrock rerank result

Resolves https://github.com/BerriAI/litellm/issues/7258#issuecomment-2671557137

* test(test_bedrock_completion.py): add testing for bedrock cohere rerank

* feat(cost_calculator.py): refactor rerank cost tracking to support bedrock cost tracking

* build(model_prices_and_context_window.json): add amazon.rerank model to model cost map

* fix(cost_calculator.py): bedrock/common_utils.py

get base model from model w/ arn -> handles rerank model

* build(model_prices_and_context_window.json): add bedrock cohere rerank pricing

* feat(bedrock/rerank): migrate bedrock config to basererank config

* Revert "feat(bedrock/rerank): migrate bedrock config to basererank config"

This reverts commit 84fae1f167.

* test: add testing to ensure large doc / queries are correctly counted

* Revert "test: add testing to ensure large doc / queries are correctly counted"

This reverts commit 4337f1657e.

* fix(migrate-jina-ai-to-rerank-config): enables cost tracking

* refactor(jina_ai/): finish migrating jina ai to base rerank config

enables cost tracking

* fix(jina_ai/rerank): e2e jina ai rerank cost tracking

* fix: cleanup dead code

* fix: fix python3.8 compatibility error

* test: fix test

* test: add e2e testing for azure ai rerank

* fix: fix linting error

* test: mark cohere as flaky
2025-02-20 21:00:18 -08:00
Ishaan Jaff
300d7825f5
(Observability) - Add more detailed dd tracing on Proxy Auth, Bedrock Auth (#8693)
* add dd tracer

* fix dd tracing

* add @tracer.wrap() on def user_api_key_auth

* add async_function_with_retries

* remove dead code

* add tracer.wrap on base aws llm

* add tracer.wrap on base aws llm

* fix print verbose

* fix dd tracing

* trace base aws llm

* fix test base aws llm

* fix converse transform

* test base aws llm

* BASE_AWS_LLM_PATH

* BASE_AWS_LLM_PATH

* test dd tracing
2025-02-20 18:00:41 -08:00
Krrish Dholakia
9470f57e86 build: extract <think>..</think> block for amazon deepseek r1 and put in reasoning_content 2025-02-19 21:10:38 -08:00
Krish Dholakia
2b7755f8d8
Litellm dev 02 18 2025 p3 (#8640)
* fix(team_endpoints.py): cleanup user <-> team association on team delete

Fixes issue where user table still listed team membership post delete

* test(test_team.py): update e2e test - ensure user/team membership is deleted on team delete

* fix(base_invoke_transformation.py): fix deepseek r1 transformation

remove deepseek name from model url

* test(test_completion.py): assert model route not in url

* feat(base_invoke_transformation.py): infer region name from model arn

prevents errors when the region name in the env var differs from the model arn; still respects the region if explicitly set in the call

* test: fix test

* test: skip on internal server error
2025-02-18 19:14:20 -08:00
Krish Dholakia
0c0f6c23e2
feat(openai/o_series_transformation.py): support native streaming for all openai o-series models (#8552)
o1 now supports streaming
2025-02-14 20:04:19 -08:00
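A minimal sketch of the native streaming this enables:

```python
import litellm

# o1 streams natively now, so no fake (buffered) stream is needed.
stream = litellm.completion(
    model="o1",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```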
Ishaan Jaff
125f6fff67
(Feat) - Add /bedrock/meta.llama3-3-70b-instruct-v1:0 tool calling support + cost tracking + base llm unit test for tool calling (#8545)
* Add support for bedrock meta.llama3-3-70b-instruct-v1:0 tool calling (#8512)

* fix(converse_transformation.py): fixing bedrock meta.llama3-3-70b tool calling

* test(test_bedrock_completion.py): adding llama3.3 tool compatibility check

* add TestBedrockTestSuite

* add bedrock llama 3.3 to base llm class

* us.meta.llama3-3-70b-instruct-v1:0

* test_basic_tool_calling

* TestAzureOpenAIO1

* test_basic_tool_calling

* test_basic_tool_calling

---------

Co-authored-by: miraclebakelaser <65143272+miraclebakelaser@users.noreply.github.com>
2025-02-14 14:15:25 -08:00
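A hedged sketch of tool calling against the model id named in this commit, using the standard OpenAI tool schema:

```python
import litellm

# Tool calling with bedrock llama 3.3; the tool schema is the standard OpenAI
# format, translated for the converse API.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
response = litellm.completion(
    model="bedrock/us.meta.llama3-3-70b-instruct-v1:0",
    messages=[{"role": "user", "content": "Weather in Paris?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```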
Krish Dholakia
58141df65d
Litellm dev 02 13 2025 p2 (#8525)
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a supported azure param

Closes https://github.com/BerriAI/litellm/issues/8500

* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model

* style: cleanup invalid json trailing comma

* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template

enables passing complete tokenizer config of model to litellm

Allows calling deepseek on bedrock with the correct prompt template

* fix(utils.py): fix register_prompt_template for custom model names

* test(test_prompt_factory.py): fix test

* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model

* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls

enables proxy admin to set base model for ft bedrock deepseek model

* feat(bedrock/invoke): support deepseek_r1 route for bedrock

makes it easy to apply the right chat template to that call

* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work

* test(test_completion.py): add e2e mock test for bedrock deepseek

* docs(bedrock.md): document new deepseek_r1 route for bedrock

allows us to use the right config

* fix(exception_mapping_utils.py): catch read operation timeout
2025-02-13 20:28:42 -08:00
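A minimal sketch of the deepseek_r1 invoke route documented above; the imported-model ARN is a placeholder:

```python
import litellm

# The deepseek_r1 invoke route applies the stored R1 chat template to a
# Bedrock imported model. The ARN below is a placeholder, not a real model.
response = litellm.completion(
    model="bedrock/deepseek_r1/arn:aws:bedrock:us-east-1:000000000000:imported-model/example",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```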
Krish Dholakia
8903bd1c7f
fix(utils.py): fix vertex ai optional param handling (#8477)
* fix(utils.py): fix vertex ai optional param handling

don't pass max retries to unsupported route

Fixes https://github.com/BerriAI/litellm/issues/8254

* fix(get_supported_openai_params.py): fix linting error

* fix(get_supported_openai_params.py): default to openai-like spec

* test: fix test

* fix: fix linting error

* Improved wildcard route handling on `/models` and `/model_group/info`  (#8473)

* fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix

ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models

* test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper

* test(test_models.py): add e2e testing for `/model_group/info` endpoint

* feat(prometheus.py): support tracking total requests by user_email on prometheus

adds initial support for tracking total requests by user_email

* test(test_prometheus.py): add testing to ensure user email is always tracked

* test: update testing for new prometheus metric

* test(test_prometheus_unit_tests.py): add user email to total proxy metric

* test: update tests

* test: fix spend tests

* test: fix test

* fix(pagerduty.py): fix linting error

* (Bug fix) - Using `include_usage` for /completions requests + unit testing (#8484)

* pass stream options (#8419)

* test_completion_streaming_usage_metrics

* test_text_completion_include_usage

---------

Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>

* fix naming docker stable release

* build(model_prices_and_context_window.json): handle azure model update

* docs(token_auth.md): clarify scopes can be a list or comma separated string

* docs: fix docs

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* update load testing script

* fix test_async_router_context_window_fallback

* pplx - fix supports tool choice openai param (#8496)

* fix prom check startup (#8492)

* test_async_router_context_window_fallback

* ci(config.yml): mark daily docker builds with `-nightly` (#8499)

Resolves https://github.com/BerriAI/litellm/discussions/8495

* (Redis Cluster) - Fixes for using redis cluster + pipeline (#8442)

* update RedisCluster creation

* update RedisClusterCache

* add redis ClusterCache

* update async_set_cache_pipeline

* cleanup redis cluster usage

* fix redis pipeline

* test_init_async_client_returns_same_instance

* fix redis cluster

* update mypy_path

* fix init_redis_cluster

* remove stub

* test redis commit

* ClusterPipeline

* fix import

* RedisCluster import

* fix redis cluster

* Potential fix for code scanning alert no. 2129: Clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix naming of redis cluster integration

* test_redis_caching_ttl_pipeline

* fix async_set_cache_pipeline

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Litellm UI stable version 02 12 2025 (#8497)

* fix(key_management_endpoints.py): fix `/key/list` to include `return_full_object` as a top-level query param

Allows user to specify they want the keys as a list of objects

* refactor(key_list.tsx): initial refactor of key table in user dashboard

offloads key filtering logic to backend api

prevents common error of user not being able to see their keys

* fix(key_management_endpoints.py): allow internal user to query `/key/list` to see their keys

* fix(key_management_endpoints.py): add validation checks and filtering to `/key/list` endpoint

allows internal users to see their keys, not anybody else's

* fix(view_key_table.tsx): fix issue where internal user could not see default team keys

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* test_supports_tool_choice

* test_async_router_context_window_fallback

* fix: fix test (#8501)

* Litellm dev 02 12 2025 p1 (#8494)

* Resolves https://github.com/BerriAI/litellm/issues/6625 (#8459)

- enables no auth for SMTP

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* test: fix test

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>

* test: fix test

* UI Fixes p2  (#8502)

* refactor(admin.tsx): cleanup add new admin flow

removes buggy flow. Ensures just 1 simple way to add users / update roles.

* fix(user_search_modal.tsx): ensure 'add member' button is always visible

* fix(edit_membership.tsx): ensure 'save changes' button always visible

* fix(internal_user_endpoints.py): ensure user in org can be deleted

Fixes issue where user couldn't be deleted if they were a member of an org

* fix: fix linting error

* add phoenix docs for observability integration (#8522)

* Add files via upload

* Update arize_integration.md

* Update arize_integration.md

* add Phoenix docs

* Added custom_attributes to additional_keys which can be sent to athina (#8518)

* (UI) fix log details page  (#8524)

* rollback changes to view logs page

* ui new build

* add interface for prefetch

* fix spread operation

* fix max size for request view page

* clean up table

* ui fix column on request logs page

* ui new build

* Add UI Support for Admins to Call /cache/ping and View Cache Analytics (#8475) (#8519)

* [Bug] UI: Newly created key does not display on the View Key Page (#8039)

- Fixed issue where all keys appeared blank for admin users.
- Implemented filtering of data via team settings to ensure all keys are displayed correctly.

* Fix:
- Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`.
- Ensured other teams still follow the original validation rules.

* - added some classes in global.css
- added text wrap in output of request, response and metadata in index.tsx
- fixed styles of table in table.tsx

* - added full payload when we open single log entry
- added Combined Info Card in index.tsx

* fix: keys not showing on refresh for internal user

* merge

* main merge

* cache page

* ca remove

* terms change

fix: places caching inside exp

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: exiao <exiao@users.noreply.github.com>
Co-authored-by: vivek-athina <153479827+vivek-athina@users.noreply.github.com>
Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>
2025-02-13 19:58:50 -08:00
Minwoo Lee
c62be184c2
Fix get_complete_url 2025-02-13 16:41:50 +09:00
Minwoo Lee
2687027225
Apply streaming-related transformations only for generate config 2025-02-13 15:40:56 +09:00
Minwoo Lee
c1d0559579
Merge branch 'BerriAI:main' into triton-completions-streaming-fix 2025-02-13 15:37:53 +09:00
Ishaan Jaff
ca7c5e8773
pplx - fix supports tool choice openai param (#8496) 2025-02-12 17:21:16 -08:00
Krish Dholakia
e26d7df91b
Litellm dev 02 10 2025 p2 (#8443)
* Fixed issue #8246 (#8250)

* Fixed issue #8246

* Added unit tests for discard() and for remove_callback_from_list_by_object()

* fix(openai.py): support dynamic passing of organization param to openai

handles scenario where client-side org id is passed to openai

---------

Co-authored-by: Erez Hadad <erezh@il.ibm.com>
2025-02-10 17:53:46 -08:00
Krish Dholakia
47f46f92c8
Litellm dev 02 10 2025 p1 (#8438)
* fix(azure/chat/gpt_transformation.py): fix str compare to use int - ensure correct api version check is done

Resolves https://github.com/BerriAI/litellm/issues/8241#issuecomment-2647142891

* test(test_azure_openai.py): add better testing
2025-02-10 16:25:04 -08:00
Ishaan Jaff
88093608e3
(Bug Fix) - Bedrock completions with aws_region_name (#8384)
* test_bedrock_completion_with_region_name

* test_bedrock_base_model_helper

* test_bedrock_base_model_helper

* fix aws_bedrock_runtime_endpoint

* test_dynamic_aws_params_propagation

* test_dynamic_aws_params_propagation
2025-02-08 16:33:17 -08:00
Ishaan Jaff
12ac414839
(Feat) - Allow calling Nova models on /bedrock/invoke/ (#8397)
* add nova to BEDROCK_INVOKE_PROVIDERS_LITERAL

* BedrockInvokeNovaRequest

* nova + invoke config

* add AmazonInvokeNovaConfig

* AmazonInvokeNovaConfig

* run transform_request for invoke/nova models

* AmazonInvokeNovaConfig

* rename invoke tests

* fix linting error

* TestBedrockInvokeNovaJson

* TestBedrockInvokeNovaJson

* add converse_chunk_parser

* test_nova_invoke_remove_empty_system_messages

* test_nova_invoke_streaming_chunk_parsing
2025-02-08 13:03:05 -08:00
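A minimal sketch of the invoke route this adds; the model id is Amazon's Nova Micro:

```python
import litellm

# Prefixing with invoke/ forces the invoke API (rather than converse) for
# Nova, per this commit.
response = litellm.completion(
    model="bedrock/invoke/us.amazon.nova-micro-v1:0",
    messages=[{"role": "user", "content": "Hello"}],
)
```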
Krish Dholakia
c83498f0cd
Litellm dev 02 07 2025 p3 (#8387)
* add back streaming for base o3 (#8361)

* test(base_llm_unit_tests.py): add base test for o-series models - ensure streaming always works

* fix(base_llm_unit_tests.py): fix test for o series models

* refactor: move test

---------

Co-authored-by: Matteo Boschini <12133566+mbosc@users.noreply.github.com>
2025-02-07 22:45:45 -08:00
Ishaan Jaff
b242c66a3b
(Feat) - Add /bedrock/invoke support for all Anthropic models (#8383)
* use anthropic transformation for bedrock/invoke

* use anthropic transforms for bedrock invoke claude

* TestBedrockInvokeClaudeJson

* add AmazonAnthropicClaudeStreamDecoder

* pass bedrock_invoke_provider to make_call

* fix _get_base_bedrock_model

* fix get_bedrock_route

* fix bedrock routing

* fixes for bedrock invoke

* test_all_model_configs

* fix AWSEventStreamDecoder linting

* fix code qa

* test_bedrock_get_base_model

* test_get_model_info_bedrock_models

* test_bedrock_base_model_helper

* test_bedrock_route_detection
2025-02-07 22:41:11 -08:00
Krish Dholakia
1dd3713f1a
Anthropic Citations API Support (#8382)
* test(test_anthropic_completion.py): add test ensuring anthropic structured output response is consistent

Resolves https://github.com/BerriAI/litellm/issues/8291

* feat(anthropic.py): support citations api with new user document message format

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(anthropic/chat/transformation.py): return citations as a provider-specific-field

Resolves https://github.com/BerriAI/litellm/issues/7970

* feat(anthropic/chat/handler.py): add streaming citations support

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(handler.py): fix code qa error

* fix(handler.py): only set provider specific fields if non-empty dict

* docs(anthropic.md): add citations api to anthropic docs
2025-02-07 22:27:01 -08:00
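A hedged sketch of the citations flow, assuming Anthropic's document content block format; the document text is illustrative:

```python
import litellm

# Citations request using Anthropic's document content block; citations come
# back as a provider-specific field, per this commit.
response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "text",
                    "media_type": "text/plain",
                    "data": "The grass is green. The sky is blue.",
                },
                "citations": {"enabled": True},
            },
            {"type": "text", "text": "What color is the grass?"},
        ],
    }],
)
citations = response.choices[0].message.provider_specific_fields["citations"]
```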
Minwoo Lee
0c30909fe9
Reimplement methods required for triton streaming 2025-02-08 12:31:01 +09:00
Krish Dholakia
b5850b6b65
Handle azure deepseek reasoning response (#8288) (#8366)
* Handle azure deepseek reasoning response (#8288)

* Handle deepseek reasoning response

* Add helper method + unit test

* Fix: Follow infinity api url format (#8346)

* Follow infinity api url format

* Update test_infinity.py

* fix(infinity/transformation.py): fix linting error

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>
2025-02-07 17:45:51 -08:00
Krish Dholakia
f651d51f26
Litellm dev 02 07 2025 p2 (#8377)
* fix(caching_routes.py): mask redis password on `/cache/ping` route

* fix(caching_routes.py): fix linting error

* fix(caching_routes.py): fix linting error on caching routes

* fix: fix test - ignore mask_dict - has a breakpoint

* fix(azure.py): add timeout param + elapsed time in azure timeout error

* fix(http_handler.py): add elapsed time to http timeout request

makes it easier to debug how long the request took before failing
2025-02-07 17:30:38 -08:00
Krish Dholakia
dfbbf0bde8
fix: dictionary changed size during iteration error (#8327) (#8341)
Co-authored-by: Joey Feldberg <joeyfeldberg@users.noreply.github.com>
Co-authored-by: Joey Feldberg <12495578+joeyfeldberg@users.noreply.github.com>
2025-02-07 16:20:28 -08:00
Krish Dholakia
5d170162d3
fix(nvidia_nim/embed.py): add 'dimensions' support (#8302)
* fix(nvidia_nim/embed.py): add 'dimensions' support

Fixes https://github.com/BerriAI/litellm/issues/8238

* fix(proxy_Server.py): initialize router redis cache if setup on proxy

Fixes https://github.com/BerriAI/litellm/issues/6602

* test: add unit testing for new helper function
2025-02-07 16:19:32 -08:00
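A hedged sketch of the new 'dimensions' support; the NIM model name and dimension value are assumptions, not values from this commit:

```python
import litellm

# 'dimensions' forwarded to an NVIDIA NIM embedding model; model name and
# dimension value are illustrative.
response = litellm.embedding(
    model="nvidia_nim/nvidia/llama-3.2-nv-embedqa-1b-v2",
    input=["hello"],
    dimensions=512,
)
```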
Krish Dholakia
6b8b49451f
Fix azure max retries error (#8340)
* fix(azure.py): ensure max_retries=0 is respected

Fixes https://github.com/BerriAI/litellm/issues/6129

* fix(test_openai.py): add unit test to ensure openai sdk calls always respect max_retries = 0

* test(test_azure_openai.py): add unit testing for azure_text/ route

* fix(azure.py): fix passing max retries on streaming

* fix(azure.py): fix azure max retries on async completion + streaming

* fix(completion/handler.py): fix azure text async completion + streaming

* test(test_azure_openai.py): ensure azure openai max retries always respected

* test(test_azure_o_series.py): add testing to ensure max retries always respected

* Added gemini providers for 2.0-flash and 2.0-flash lite (#8321)

* Update model_prices_and_context_window.json

added gemini providers for 2.0-flash and 2.0-flash lite

* Update model_prices_and_context_window.json

fixed URL

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* Convert tool use arguments to string before counting tokens (#6989)

In at least some cases the `messages["tool_calls"]["function"]["arguments"]` is a dict, not a string. In order to tokenize it properly it needs to be a string. In the case that it is already a string this is a noop, which is also fine.

* build(model_prices_and_context_window.json): add gemini 2.0 flash lite pricing

* build(model_prices_and_context_window.json): add gemini commercial rate limits

* fix(utils.py): fix linting error

* refactor(utils.py): refactor to maintain function size

---------

Co-authored-by: Bardia Khosravi <bardiakhosravi95@gmail.com>
Co-authored-by: Josh Morrow <josh@jcmorrow.com>
2025-02-06 23:20:48 -08:00
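A minimal sketch of the max_retries behavior this fixes; endpoint, key, and deployment are placeholders:

```python
import litellm

# max_retries=0 must reach the underlying Azure/OpenAI SDK client unchanged,
# for sync, async, and streaming paths.
response = litellm.completion(
    model="azure/my-deployment",                  # hypothetical deployment
    api_base="https://example.openai.azure.com",  # placeholder endpoint
    api_key="...",                                # placeholder key
    messages=[{"role": "user", "content": "Hello"}],
    max_retries=0,
)
```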
Krish Dholakia
443ae55904
Azure OpenAI improvements - o3 native streaming, improved tool call + response format handling (#8292)
* fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in

Fixes https://github.com/BerriAI/litellm/issues/8241

* fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls

* refactor(gpt_transformation.py): refactor out json schema conversion to base config

keeps logic consistent across providers

* fix(o_series_transformation.py): support o3 mini native streaming

Fixes https://github.com/BerriAI/litellm/issues/8274

* fix(gpt_transformation.py): remove unused variables

* test: update test
2025-02-05 19:38:58 -08:00
Ishaan Jaff
818792228c
(Refactor) - migrate bedrock invoke to BaseLLMHTTPHandler class (#8290)
* initial transform for invoke

* invoke transform_response

* working - able to make request

* working get_complete_url

* working - invoke now runs on llm_http_handler

* fix unused imports

* track litellm overhead ms

* working stream request

* sign_request transform

* sign_request update

* use has_async_custom_stream_wrapper property

* use get_async_custom_stream_wrapper in base llm http handler

* fix make_call in invoke handler

* fix invoke with streaming get_async_custom_stream_wrapper

* working bedrock async streaming with invoke

* fix make call handler for bedrock

* test_all_model_configs

* fix test_bedrock_custom_prompt_template

* sync streaming for bedrock invoke

* fix _add_stream_param_to_request_body

* test_async_text_completion_bedrock

* fix transform_request

* fix get_supported_openai_params

* fix test supports tool choice

* fix test_supports_tool_choice

* add unit test coverage for bedrock invoke transform

* fix location of transformation files

* update import loc

* fix bedrock invoke unit tests

* fix import for max completion tokens
2025-02-05 18:58:55 -08:00
Krish Dholakia
3c813b3a87
Fix deepseek calling - refactor to use base_llm_http_handler (#8266)
* refactor(deepseek/): move deepseek to base llm http handler

Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457

* fix(gpt_transformation.py): support stream parsing for gpt-like calls

* test(test_deepseek_completion.py): add async streaming test

* fix(gpt_transformation.py): fix import

* fix(gpt_transformation.py): return full api base and content type
2025-02-04 22:30:00 -08:00
Ishaan Jaff
3a6349d871
(Feat) - Add support for structured output on bedrock/nova models + add util litellm.supports_tool_choice (#8264)
* fix supports_tool_choice

* TestBedrockNovaJson

* use supports_tool_choice

* fix supports_tool_choice

* add supports_tool_choice param

* script to add fields to model cost map

* test_supports_tool_choice

* test_supports_tool_choice

* fix supports tool choice check

* test_supports_tool_choice_simple_tests

* fix supports_tool_choice check

* fix supports_tool_choice bedrock

* test_supports_tool_choice

* test_supports_tool_choice

* fix bedrock/eu-west-3/mistral.mistral-large-2402-v1:0

* ci/cd run again

* test_supports_tool_choice_simple_tests

* TestGoogleAIStudioGemini temp - remove to run ci/cd

* test_aaalangfuse_logging_metadata

* TestGoogleAIStudioGemini

* test_check_provider_match

* remove add param to map
2025-02-04 21:47:16 -08:00
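A minimal sketch of the new litellm.supports_tool_choice util; the model id is illustrative:

```python
import litellm

# The new util checks the model cost map's supports_tool_choice flag before
# a caller sends tool_choice to a Bedrock/Nova model.
if litellm.supports_tool_choice(model="bedrock/us.amazon.nova-micro-v1:0"):
    print("tool_choice is supported")
```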
Ishaan Jaff
b965d9bd9a
Fix passing top_k parameter for Bedrock Anthropic models (#8131) (#8269)
* Fix Bedrock Anthropic topK bug

* Remove extra import

* Add unit test + make tests mocked

* Fix camel case

* Fix tests to remove exception handling

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
2025-02-04 21:16:21 -08:00
Krish Dholakia
c8494abdea
test(base_llm_unit_tests.py): add test to ensure drop params is respe… (#8224)
* test(base_llm_unit_tests.py): add test to ensure drop params is respected

* fix(types/prometheus.py): use typing_extensions for python3.8 compatibility

* build: add cherry picked commits
2025-02-03 16:04:44 -08:00
Krish Dholakia
97b8de17ab
LiteLLM Minor Fixes & Improvements (01/16/2025) - p2 (#7828)
* fix(vertex_ai/gemini/transformation.py): handle 'http://' image urls

* test: add base test for `http:` url's

* fix(factory.py/get_image_details): follow redirects

allows http calls to work

* fix(codestral/): fix stream chunk parsing on last chunk of stream

* Azure ad token provider (#6917)

* Update azure.py

Added optional parameter azure ad token provider

* Added parameter to main.py

* Found token provider arg location

* Fixed embeddings

* Fixed ad token provider

---------

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* fix: fix linting errors

* fix(main.py): leave out o1 route for azure ad token provider, for now

get v0 out for sync azure gpt route to begin with

* test: skip http:// test for fireworks ai

model does not support it

* refactor: cleanup dead code

* fix: revert http:// url passthrough for gemini

google ai studio raises errors

* test: fix test

---------

Co-authored-by: bahtman <anton@baht.dk>
2025-02-02 23:17:50 -08:00
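A hedged sketch of the Azure AD token provider support merged here; the deployment and endpoint are placeholders, and azure.identity is assumed to be installed:

```python
import litellm
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Azure AD auth via the azure_ad_token_provider param added in this merge.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)
response = litellm.completion(
    model="azure/my-deployment",                  # hypothetical deployment
    api_base="https://example.openai.azure.com",  # placeholder endpoint
    azure_ad_token_provider=token_provider,
    messages=[{"role": "user", "content": "Hello"}],
)
```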
Krish Dholakia
1105e35538
Complete o3 model support (#8183)
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param

Closes https://github.com/BerriAI/litellm/issues/8182

* fix(main.py): ensure `reasoning_effort` is a mapped openai param

* refactor(azure/): rename o1_[x] files to o_series_[x]

* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort

* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests

* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers

Makes it easy to switch from openai to anthropic

* fix: fix linting errors

* fix(base_llm_unit_tests.py): fix test

* fix(main.py): add missing param
2025-02-02 22:36:37 -08:00
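A minimal sketch of the reasoning_effort param mapped in this commit:

```python
import litellm

# reasoning_effort is now a mapped OpenAI param for o-series models.
response = litellm.completion(
    model="o3-mini",
    messages=[{"role": "user", "content": "Solve: 2x + 3 = 11"}],
    reasoning_effort="high",
)
```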
Krish Dholakia
23f458d2da
Improved O3 + Azure O3 support (#8181)
* fix: support azure o3 model family for fake streaming workaround (#8162)

* fix: support azure o3 model family for fake streaming workaround

* refactor: rename helper to is_o_series_model for clarity

* update function calling parameters for o3 models (#8178)

* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3

ensures max_tokens is correctly translated for o3

* feat(openai/): refactor o1 files to be 'o_series' files

expands naming to cover o3

* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets

* test(test_azure_o_series.py): assert stream faked for azure o3 mini

Resolves https://github.com/BerriAI/litellm/pull/8162

* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing

* docs(azure.md): update doc with `o_series/` model name

---------

Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
Ishaan Jaff
795a71ce1e fix linting error 2025-01-31 21:27:56 -08:00
Ishaan Jaff
9ff27809b2
(Feat) add bedrock/deepseek custom import models (#8132)
* add support for using llama spec with bedrock

* fix get_bedrock_invoke_provider

* add support for using bedrock provider in mappings

* working request

* test_bedrock_custom_deepseek

* test_bedrock_custom_deepseek

* fix _get_model_id_for_llama_like_model

* test_bedrock_custom_deepseek

* doc DeepSeek-R1-Distill-Llama-70B

* test_bedrock_custom_deepseek
2025-01-31 18:40:44 -08:00
Ishaan Jaff
2cf0daa31c
(Fixes) OpenAI Streaming Token Counting + Fixes usage track when litellm.turn_off_message_logging=True (#8156)
* working streaming usage tracking

* fix test_async_chat_openai_stream_options

* fix await asyncio.sleep(1)

* test_async_chat_azure

* fix s3 logging

* fix get_stream_options

* fix get_stream_options

* fix streaming handler

* test_stream_token_counting_with_redaction

* fix codeql concern
2025-01-31 15:06:37 -08:00
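A minimal sketch of the streaming usage tracking these fixes cover, using OpenAI's stream_options param:

```python
import litellm

# With include_usage set, the final stream chunk carries token counts, and
# usage is tracked even when litellm.turn_off_message_logging=True.
stream = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if getattr(chunk, "usage", None):
        print(chunk.usage)
```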