Commit graph

388 commits

Author SHA1 Message Date
Krish Dholakia
f899b828cf
Support openrouter reasoning_content on streaming (#9094)
* feat(convert_dict_to_response.py): support openrouter format of reasoning content

* fix(transformation.py): fix openrouter streaming with reasoning content

Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962

* fix: fix type error
2025-03-09 20:03:59 -07:00
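
For reference, a minimal sketch of the streaming shape this change enables: `reasoning_content` arriving on the delta alongside regular content. The model name is illustrative and `OPENROUTER_API_KEY` is assumed to be set; none of this is taken from the PR itself.

```python
import litellm

response = litellm.completion(
    model="openrouter/deepseek/deepseek-r1",  # illustrative model
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    stream=True,
)

for chunk in response:
    delta = chunk.choices[0].delta
    # reasoning tokens surface separately from the final answer
    reasoning = getattr(delta, "reasoning_content", None)
    if reasoning:
        print(f"[reasoning] {reasoning}")
    elif delta.content:
        print(delta.content, end="")
```
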
Krish Dholakia
e00d4fb18c
Litellm dev 03 08 2025 p3 (#9089)
* feat(ollama_chat.py): pass down http client to ollama_chat

enables easier testing

* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint

Fixes https://github.com/BerriAI/litellm/issues/6683

* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
Krish Dholakia
4330ef8e81
Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077)
* feat(batches/): fix batch cost calculation - ensure it's accurate

use the correct cost value - prev. defaulting to non-batch cost

* feat(batch_utils.py): log batch models to spend logs + standard logging payload

makes it easy to understand how cost was calculated

* fix: fix stored payload for test

* test: fix test
2025-03-08 11:47:25 -08:00
Ishaan Jaff
e2d612efd9
Bug fix - "data: " string stripped from entire content in streamed Gemini responses (#9070)
* _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* _strip_sse_data_from_chunk

* test_strip_sse_data_from_chunk

* _strip_sse_data_from_chunk

* testing

* _strip_sse_data_from_chunk
2025-03-07 21:06:39 -08:00
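
For reference, a hypothetical reimplementation of what a helper like `_strip_sse_data_from_chunk` must guarantee: only the leading SSE `data: ` prefix is removed, never occurrences of the substring inside the payload itself (the bug this PR fixes). This is a sketch, not the actual litellm implementation.

```python
def strip_sse_data_from_chunk(chunk: str) -> str:
    """Strip a single leading SSE 'data: ' prefix from a raw chunk."""
    prefix = "data: "
    if chunk.startswith(prefix):
        return chunk[len(prefix):]
    return chunk

# the prefix is removed only at the start, not inside the content
assert strip_sse_data_from_chunk('data: {"text": "data: is a prefix"}') == '{"text": "data: is a prefix"}'
assert strip_sse_data_from_chunk("no prefix here") == "no prefix here"
```
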
Krish Dholakia
0e3caf92b9
UI - new API Playground for testing LiteLLM translation (#9073)
* feat: initial commit - enable dev to see translated request

* feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm

* feat(transform_request.tsx): allow user to see their transformed request

* refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body

easier to render each individually on UI vs. extracting from combined string

* feat: transform_request.tsx

working e2e raw request viewing

* fix(litellm_logging.py): fix transform viewing for bedrock models

* fix(litellm_logging.py): don't return sensitive headers in raw request headers

prevent accidental leak

* feat(transform_request.tsx): style improvements
2025-03-07 19:39:31 -08:00
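
For reference, a sketch of exercising the new endpoint against a locally running proxy. The request body shape and response keys are assumptions based on the commit messages above, not a documented contract.

```python
import requests

resp = requests.post(
    "http://localhost:4000/utils/transform_request",
    headers={"Authorization": "Bearer sk-1234"},  # proxy key (illustrative)
    json={
        "call_type": "completion",  # assumed field: which SDK method to transform for
        "request_body": {
            "model": "bedrock-claude",  # illustrative model alias
            "messages": [{"role": "user", "content": "hi"}],
        },
    },
    timeout=30,
)
# per the commits above, the raw request comes back in 3 parts:
# api_base, headers, request body (assumed response keys)
print(resp.json())
```
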
Ishaan Jaff
b02af305de
[Feat] - Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) (#9029)
* if merge_reasoning_content_in_choices

* _optional_combine_thinking_block_in_choices

* stash changes

* working merge_reasoning_content_in_choices with bedrock

* fix litellm_params accessor

* fix streaming handler

* merge_reasoning_content_in_choices

* _optional_combine_thinking_block_in_choices

* test_bedrock_stream_thinking_content_openwebui

* merge_reasoning_content_in_choices

* fix for _optional_combine_thinking_block_in_choices

* linting error fix
2025-03-06 18:32:58 -08:00
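
For reference, a minimal sketch of the merge flag in use, assuming it is passed per-request like other litellm params; the model name is illustrative. Instead of arriving as a separate field, reasoning is merged into the message content so UIs like OpenWebUI can render it.

```python
import litellm

response = litellm.completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # illustrative
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
    # merge reasoning into the message content so OpenWebUI can show it
    # as a collapsible thinking section
    merge_reasoning_content_in_choices=True,
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```
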
Ishaan Jaff
f47987e673
(Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthropic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
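
For reference, a sketch of calling the refactored endpoint on a running proxy with a standard Anthropic Messages payload; the key and model alias are illustrative.

```python
import requests

resp = requests.post(
    "http://localhost:4000/v1/messages",
    headers={
        "x-api-key": "sk-1234",               # litellm virtual key (illustrative)
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    },
    json={
        "model": "claude-3-7-sonnet-latest",  # illustrative alias
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)
print(resp.json())  # Anthropic-spec response, per the refactor above
```
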
Krish Dholakia
f6535ae6ad
Support format param for specifying image type (#9019)
* fix(transformation.py): support a 'format' parameter for images

allow user to specify mime type

* fix: pass mimetype via 'format' param

* feat(gemini/chat/transformation.py): support 'format' param for gemini

* fix(factory.py): support 'format' param on sync bedrock converse calls

* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls

* refactor(factory.py): move to supporting 'format' param in base helper

ensures consistency in param support

* feat(gpt_transformation.py): filter out 'format' param

don't send invalid param to openai

* fix(gpt_transformation.py): fix translation

* fix: fix translation error
2025-03-05 19:52:53 -08:00
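
For reference, a minimal sketch of the new param: an `image_url` content block carrying an explicit `format` (MIME type) so providers such as Gemini and Bedrock receive the right media type. The model name is illustrative and the base64 payload is elided.

```python
import litellm

response = litellm.completion(
    model="gemini/gemini-2.0-flash",  # illustrative
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/png;base64,...",  # payload elided
                        "format": "image/png",  # the param added in this PR
                    },
                },
            ],
        }
    ],
)
```
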
Krish Dholakia
ec4f665e29
Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] (#9021)
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* test: update test to enforce signature found

* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic

* fix: fix linting error

---------

Co-authored-by: Martin Krasser <krasserm@googlemail.com>
2025-03-05 19:33:54 -08:00
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia
662c59adcf
Support caching on reasoning content + other fixes (#8973)
* fix(factory.py): pass on anthropic thinking content from assistant call

* fix(factory.py): fix anthropic messages to handle thinking blocks

Fixes https://github.com/BerriAI/litellm/issues/8961

* fix(factory.py): fix bedrock handling for assistant content in messages

Fixes https://github.com/BerriAI/litellm/issues/8961

* feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block

ensures caching works for anthropic thinking block

* fix(convert_dict_to_response.py): pass all message params to delta block

ensures streaming delta also contains the reasoning content / thinking block

* test(test_prompt_factory.py): remove redundant test

anthropic now supports assistant as the first message

* fix(factory.py): fix linting errors

* fix: fix code qa

* test: remove falsy test

* fix(litellm_logging.py): fix str conversion
2025-03-04 21:12:16 -08:00
Krish Dholakia
94d28d59e4
Fix deepseek 'reasoning_content' error (#8963)
* fix(streaming_handler.py): fix deepseek reasoning content streaming

Fixes https://github.com/BerriAI/litellm/issues/8939

* test(test_streaming_handler.py): add unit test to streaming handle 'is_chunk_non_empty' function

ensures 'reasoning_content' is handled correctly
2025-03-03 14:34:10 -08:00
Ishaan Jaff
bc9b3e4847
(Bug fix) - don't log messages in model_parameters in StandardLoggingPayload (#8932)
* define model param helper

* use ModelParamHelper

* get_standard_logging_model_parameters

* fix code quality

* get_standard_logging_model_parameters

* StandardLoggingPayload

* test_get_kwargs_for_cache_key

* test_langsmith_key_based_logging

* fix code qa

* fix linting
2025-03-01 13:39:45 -08:00
Ishaan Jaff
ee7cd60fdb Revert "(bug fix) - don't log messages, prompt, input in model_parameters in StandardLoggingPayload (#8923)"
This reverts commit a119cb420b.
2025-03-01 11:05:33 -08:00
Ishaan Jaff
6fc9aa1612
(bug fix) - dd tracer, only send traces when user opts into sending dd-trace (#8928)
* fix dd tracing null tracer bug

* fix dd tracing

* fix base aws llm

* test_should_use_dd_tracer
2025-03-01 10:53:36 -08:00
Ishaan Jaff
a119cb420b
(bug fix) - don't log messages, prompt, input in model_parameters in StandardLoggingPayload (#8923)
* fix _get_model_parameters

* test litellm logging

* test litellm logging
2025-03-01 10:27:24 -08:00
Ishaan Jaff
3a086cee06
(Feat) - Show Error Logs on LiteLLM UI (#8904)
* fix test_moderations_bad_model

* use async_post_call_failure_hook

* basic logging errors in DB

* show status on ui

* show status on ui

* ui show request / response side by side

* stash fixes

* working, track raw request

* track error info in metadata

* fix showing error / request / response logs

* show traceback on error viewer

* ui with traceback of error

* fix async_post_call_failure_hook

* fix(http_parsing_utils.py): orjson can throw errors on some emoji's in text, default to json.loads

* test_get_error_information

* fix code quality

* rename proxy track cost callback test

* _should_store_errors_in_spend_logs

* feature flag error logs

* Revert "_should_store_errors_in_spend_logs"

This reverts commit 7f345df477.

* Revert "feature flag error logs"

This reverts commit 0e90c022bb.

* test_spend_logs_payload

* fix OTEL log_db_metrics

* fix import json

* fix ui linting error

* test_async_post_call_failure_hook

* test_chat_completion_bad_model_with_spend_logs

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-28 20:10:09 -08:00
Krish Dholakia
c84b489d58
Fix bedrock passing response_format: {"type": "text"} (#8900)
* fix(converse_transformation.py): ignore type: text, value in response_format

no-op for bedrock

* fix(converse_transformation.py): handle adding response format value to tools

* fix(base_invoke_transformation.py): fix 'get_bedrock_invoke_provider' to handle cross-region-inferencing models

* test(test_bedrock_completion.py): add unit testing for bedrock invoke provider logic

* test: update test

* fix(exception_mapping_utils.py): add context window exceeded error handling for databricks provider route

* fix(fireworks_ai/): support passing tools + response_format together

* fix: cleanup

* fix(base_invoke_transformation.py): fix imports
2025-02-28 20:09:59 -08:00
Krish Dholakia
3de4209569
fix caching on main branch (#8858)
* fix(streaming_handler.py): fix is delta empty check to handle empty str

* fix(streaming_handler.py): fix delta chunk on final response
2025-02-26 19:16:34 -08:00
Krish Dholakia
ab7c4d1a0e
Litellm dev bedrock anthropic 3 7 v2 (#8843)
* feat(bedrock/converse/transformation.py): support claude-3-7-sonnet reasoning_content transformation

Closes https://github.com/BerriAI/litellm/issues/8777

* fix(bedrock/): support returning `reasoning_content` on streaming for claude-3-7

Resolves https://github.com/BerriAI/litellm/issues/8777

* feat(bedrock/): unify converse reasoning content blocks for consistency across anthropic and bedrock

* fix(anthropic/chat/transformation.py): handle deepseek-style 'reasoning_content' extraction within transformation.py

simpler logic

* feat(bedrock/): fix streaming to return blocks in consistent format

* fix: fix linting error

* test: fix test

* feat(factory.py): fix bedrock thinking block translation on tool calling

allows passing the thinking blocks back to bedrock for tool calling

* fix(types/utils.py): don't exclude provider_specific_fields on model dump

ensures consistent responses

* fix: fix linting errors

* fix(convert_dict_to_response.py): pass reasoning_content on root

* fix: test

* fix(streaming_handler.py): add helper util for setting model id

* fix(streaming_handler.py): fix setting model id on model response stream chunk

* fix(streaming_handler.py): fix linting error

* fix(streaming_handler.py): fix linting error

* fix(types/utils.py): add provider_specific_fields to model stream response

* fix(streaming_handler.py): copy provider specific fields and add them to the root of the streaming response

* fix(streaming_handler.py): fix check

* fix: fix test

* fix(types/utils.py): ensure messages content is always openai compatible

* fix(types/utils.py): fix delta object to always be openai compatible

only introduce new params if variable exists

* test: fix bedrock nova tests

* test: skip flaky test

* test: skip flaky test in ci/cd
2025-02-26 16:05:33 -08:00
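
For reference, a sketch of the tool-calling round trip these commits fix: the assistant turn, including its thinking blocks, is passed back to Bedrock along with the tool result. The model id is illustrative and it is assumed the first response actually contains a tool call.

```python
import litellm

messages = [{"role": "user", "content": "What's the weather in SF? Use the tool."}]
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

first = litellm.completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",  # illustrative
    messages=messages,
    tools=tools,
    thinking={"type": "enabled", "budget_tokens": 1024},
)
msg = first.choices[0].message

# pass the assistant turn back, thinking blocks included, so bedrock
# accepts the tool-result follow-up (the translation this PR fixes)
messages.append(msg.model_dump())
messages.append({
    "role": "tool",
    "tool_call_id": msg.tool_calls[0].id,  # assumes a tool call was returned
    "content": "68F and sunny",
})
second = litellm.completion(
    model="bedrock/us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    messages=messages,
    tools=tools,
    thinking={"type": "enabled", "budget_tokens": 1024},
)
```
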
Krrish Dholakia
fcf4ea3608 build: merge squashed commit
Squashed commit of the following:

commit 6678e15381
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 09:29:15 2025 -0800

    test_prompt_caching

commit bd86e0ac47
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 08:57:16 2025 -0800

    test_prompt_caching

commit 2fc21ad51e
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 08:13:45 2025 -0800

    test_aprompt_caching

commit d94cff55ff
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 08:13:12 2025 -0800

    test_prompt_caching

commit 49c5e7811e
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 07:43:53 2025 -0800

    ui new build

commit cb8d5e5917
Author: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Date:   Wed Feb 26 07:38:56 2025 -0800

    (UI) - Create Key flow for existing users (#8844)

    * working create user button

    * working create user for a key flow

    * allow searching users

    * working create user + key

    * use clear sections on create key

    * better search for users

    * fix create key

    * ui fix create key button - make it neater / cleaner

    * ui fix all keys table

commit 335ba30467
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Wed Feb 26 08:53:17 2025 -0800

    fix: fix file name

commit b8c5b31a4e
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Tue Feb 25 22:54:46 2025 -0800

    fix: fix utils

commit ac6e503461
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Feb 24 10:43:31 2025 -0800

    fix(main.py): fix openai message for assistant msg if role is missing - openai allows this

    Fixes https://github.com/BerriAI/litellm/issues/8661

commit de3989dbc5
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date:   Mon Feb 24 21:19:25 2025 -0800

    fix(get_litellm_params.py): handle no-log being passed in via kwargs

    Fixes https://github.com/BerriAI/litellm/issues/8380
2025-02-26 09:39:27 -08:00
Ishaan Jaff
7021f2f244
(Bug fix) dd-trace used by default on litellm proxy (#8817)
* fix _should_use_dd_tracer

* fix _should_use_dd_tracer

* _should_use_dd_tracer

* _should_use_dd_tracer

* _should_use_dd_tracer

* _init_dd_tracer

* _should_use_dd_tracer

* fix should use dd-tracer

* fix dd tracer
2025-02-25 19:54:22 -08:00
Krish Dholakia
142b195784
Add anthropic thinking + reasoning content support (#8778)
* feat(anthropic/chat/transformation.py): add anthropic thinking param support

* feat(anthropic/chat/transformation.py): support returning thinking content for anthropic on streaming responses

* feat(anthropic/chat/transformation.py): return list of thinking blocks (include block signature)

allows usage in tool call responses

* fix(types/utils.py): extract and map reasoning_content from anthropic as content str

* test: add testing to ensure thinking_blocks are returned at the root

* fix(anthropic/chat/handler.py): return thinking blocks on streaming - include signature

* feat(factory.py): handle anthropic thinking blocks translation if in assistant response

* test: handle openai internal instability

* test: handle openai audio instability

* ci: pin anthropic dep

* test: handle openai audio instability

* fix: fix linting error

* refactor(anthropic/chat/transformation.py): refactor function to remain <50 LOC

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error
2025-02-24 21:54:30 -08:00
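
For reference, a minimal non-streaming sketch of the fields described above: `reasoning_content` as a plain content string plus `thinking_blocks` (each carrying its signature) returned at the root of the message. The model name is illustrative.

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-3-7-sonnet-20250219",  # illustrative
    messages=[{"role": "user", "content": "Is 3127 prime?"}],
    thinking={"type": "enabled", "budget_tokens": 1024},
)
message = response.choices[0].message
print(message.reasoning_content)  # extracted reasoning as a content string
print(message.thinking_blocks)    # list of blocks, each including its signature
print(message.content)            # the final answer
```
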
Krish Dholakia
21ea52105a
Support arize phoenix on litellm proxy (#7756) (#8715)
* Update opentelemetry.py

wip

* Update test_opentelemetry_unit_tests.py

* fix a few paths and tests

* fix path

* Update litellm_logging.py

* accidentally removed code

* Add type for protocol

* Add and update tests

* minor changes

* update and add additional arize phoenix test

* update existing test

* address feedback

* use standard_logging_object

* address feedback

Co-authored-by: Nate Mar <67926244+nate-mar@users.noreply.github.com>
2025-02-22 20:55:11 -08:00
Ishaan Jaff
300d7825f5
(Observability) - Add more detailed dd tracing on Proxy Auth, Bedrock Auth (#8693)
* add dd tracer

* fix dd tracing

* add @tracer.wrap() on def user_api_key_auth

* add async_function_with_retries

* remove dead code

* add tracer.wrap on base aws llm

* add tracer.wrap on base aws llm

* fix print verbose

* fix dd tracing

* trace base aws llm

* fix test base aws llm

* fix converse transform

* test base aws llm

* BASE_AWS_LLM_PATH

* BASE_AWS_LLM_PATH

* test dd tracing
2025-02-20 18:00:41 -08:00
Ishaan Jaff
bb6f43d12e
(Bug fix) - Cache Health not working when configured with prometheus service logger (#8687)
* fix serialize on safe json dumps

* test_non_standard_dict_keys_complex

* ui fix HealthCheckCacheParams

* fix HealthCheckCacheParams

* fix code qa

* test_cache_ping_failure

* test_cache_ping_health_check_includes_only_cache_attributes

* test_cache_ping_health_check_includes_only_cache_attributes
2025-02-20 13:41:56 -08:00
Ishaan Jaff
fff15543d9
(UI + Proxy) Cache Health Check Page - Cleanup/Improvements (#8665)
* fixes for redis cache ping serialization

* fix cache ping check

* fix cache health check ui

* working error details on ui

* ui expand / collapse error

* move cache health check to diff file

* fix displaying error from cache health check

* ui allow copying errors

* ui cache health fixes

* show redis details

* clean up cache health page

* ui polish fixes

* fix error handling on cache health page

* fix redis_cache_params on cache ping response

* error handling

* cache health ping response

* fix error response from cache ping

* parsedLitellmParams

* fix cache health check

* fix cache health page

* cache safely handle json dumps issues

* test caching routes

* test_primitive_types

* fix caching routes

* litellm_mapped_tests

* fix pytest-mock

* fix _serialize

* fix linting on safe dumps

* test_default_max_depth

* pip install "pytest-mock==3.12.0"

* litellm_mapped_tests_coverage

* add readme on new litellm test dir
2025-02-19 19:08:50 -08:00
Ishaan Jaff
2753de1458
(Bug Fix + Better Observability) - BudgetResetJob: (#8562)
* use class ResetBudgetJob

* refactor reset budget job

* update reset_budget job

* refactor reset budget job

* fix LiteLLM_UserTable

* refactor reset budget job

* add telemetry for reset budget job

* dd - log service success/failure on DD

* add detailed reset budget reset info on DD

* initialize_scheduled_background_jobs

* refactor reset budget job

* trigger service failure hook when fails to reset a budget for team, key, user

* fix resetBudgetJob

* unit testing for ResetBudgetJob

* test_duration_in_seconds_basic

* testing for triggering service logging

* fix logs on test teams fail

* remove unused imports

* fix import duration in s

* duration_in_seconds
2025-02-15 16:13:08 -08:00
Krish Dholakia
58141df65d
Litellm dev 02 13 2025 p2 (#8525)
* fix(azure/chat/gpt_transformation.py): add 'prediction' as a supported azure param

Closes https://github.com/BerriAI/litellm/issues/8500

* build(model_prices_and_context_window.json): add new 'gemini-2.0-pro-exp-02-05' model

* style: cleanup invalid json trailing comma

* feat(utils.py): support passing 'tokenizer_config' to register_prompt_template

enables passing complete tokenizer config of model to litellm

Allows calling deepseek on bedrock with the correct prompt template

* fix(utils.py): fix register_prompt_template for custom model names

* test(test_prompt_factory.py): fix test

* test(test_completion.py): add e2e test for bedrock invoke deepseek ft model

* feat(base_invoke_transformation.py): support hf_model_name param for bedrock invoke calls

enables proxy admin to set base model for ft bedrock deepseek model

* feat(bedrock/invoke): support deepseek_r1 route for bedrock

makes it easy to apply the right chat template to that call

* feat(constants.py): store deepseek r1 chat template - allow user to get correct response from deepseek r1 without extra work

* test(test_completion.py): add e2e mock test for bedrock deepseek

* docs(bedrock.md): document new deepseek_r1 route for bedrock

allows us to use the right config

* fix(exception_mapping_utils.py): catch read operation timeout
2025-02-13 20:28:42 -08:00
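
For reference, a sketch of the new `tokenizer_config` route for `register_prompt_template`, assuming it accepts a HF-style config as the commits describe. The template and model name are simplified stand-ins, not the real deepseek r1 template.

```python
import litellm

litellm.register_prompt_template(
    model="bedrock/invoke/my-ft-deepseek",  # hypothetical custom model name
    tokenizer_config={
        "bos_token": "<|begin_of_sentence|>",
        "eos_token": "<|end_of_sentence|>",
        # HF-style Jinja chat template (heavily simplified stand-in)
        "chat_template": (
            "{{ bos_token }}{% for m in messages %}"
            "{{ m['role'] }}: {{ m['content'] }}\n"
            "{% endfor %}"
        ),
    },
)

# alternatively, per the commits above, a deepseek_r1 invoke route applies the
# stored deepseek r1 chat template automatically, e.g. (placeholder ARN):
# model = "bedrock/deepseek_r1/arn:aws:bedrock:us-east-1:000000000000:imported-model/abc"
```
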
Krish Dholakia
8903bd1c7f
fix(utils.py): fix vertex ai optional param handling (#8477)
* fix(utils.py): fix vertex ai optional param handling

don't pass max retries to unsupported route

Fixes https://github.com/BerriAI/litellm/issues/8254

* fix(get_supported_openai_params.py): fix linting error

* fix(get_supported_openai_params.py): default to openai-like spec

* test: fix test

* fix: fix linting error

* Improved wildcard route handling on `/models` and `/model_group/info`  (#8473)

* fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix

ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models

* test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper

* test(test_models.py): add e2e testing for `/model_group/info` endpoint

* feat(prometheus.py): support tracking total requests by user_email on prometheus

adds initial support for tracking total requests by user_email

* test(test_prometheus.py): add testing to ensure user email is always tracked

* test: update testing for new prometheus metric

* test(test_prometheus_unit_tests.py): add user email to total proxy metric

* test: update tests

* test: fix spend tests

* test: fix test

* fix(pagerduty.py): fix linting error

* (Bug fix) - Using `include_usage` for /completions requests + unit testing (#8484)

* pass stream options (#8419)

* test_completion_streaming_usage_metrics

* test_text_completion_include_usage

---------

Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>

* fix naming docker stable release

* build(model_prices_and_context_window.json): handle azure model update

* docs(token_auth.md): clarify scopes can be a list or comma separated string

* docs: fix docs

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* update load testing script

* fix test_async_router_context_window_fallback

* pplx - fix supports tool choice openai param (#8496)

* fix prom check startup (#8492)

* test_async_router_context_window_fallback

* ci(config.yml): mark daily docker builds with `-nightly` (#8499)

Resolves https://github.com/BerriAI/litellm/discussions/8495

* (Redis Cluster) - Fixes for using redis cluster + pipeline (#8442)

* update RedisCluster creation

* update RedisClusterCache

* add redis ClusterCache

* update async_set_cache_pipeline

* cleanup redis cluster usage

* fix redis pipeline

* test_init_async_client_returns_same_instance

* fix redis cluster

* update mypy_path

* fix init_redis_cluster

* remove stub

* test redis commit

* ClusterPipeline

* fix import

* RedisCluster import

* fix redis cluster

* Potential fix for code scanning alert no. 2129: Clear-text logging of sensitive information

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix naming of redis cluster integration

* test_redis_caching_ttl_pipeline

* fix async_set_cache_pipeline

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* Litellm UI stable version 02 12 2025 (#8497)

* fix(key_management_endpoints.py): fix `/key/list` to include `return_full_object` as a top-level query param

Allows user to specify they want the keys as a list of objects

* refactor(key_list.tsx): initial refactor of key table in user dashboard

offloads key filtering logic to backend api

prevents common error of user not being able to see their keys

* fix(key_management_endpoints.py): allow internal user to query `/key/list` to see their keys

* fix(key_management_endpoints.py): add validation checks and filtering to `/key/list` endpoint

allow internal user to see their keys. not anybody else's

* fix(view_key_table.tsx): fix issue where internal user could not see default team keys

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* fix: fix linting error

* test_supports_tool_choice

* test_async_router_context_window_fallback

* fix: fix test (#8501)

* Litellm dev 02 12 2025 p1 (#8494)

* Resolves https://github.com/BerriAI/litellm/issues/6625 (#8459)

- enables no auth for SMTP

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>

* add sonar pricings (#8476)

* add sonar pricings

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window.json

* Update model_prices_and_context_window_backup.json

* test: fix test

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>

* test: fix test

* UI Fixes p2  (#8502)

* refactor(admin.tsx): cleanup add new admin flow

removes buggy flow. Ensures just 1 simple way to add users / update roles.

* fix(user_search_modal.tsx): ensure 'add member' button is always visible

* fix(edit_membership.tsx): ensure 'save changes' button always visible

* fix(internal_user_endpoints.py): ensure user in org can be deleted

Fixes issue where user couldn't be deleted if they were a member of an org

* fix: fix linting error

* add phoenix docs for observability integration (#8522)

* Add files via upload

* Update arize_integration.md

* Update arize_integration.md

* add Phoenix docs

* Added custom_attributes to additional_keys which can be sent to athina (#8518)

* (UI) fix log details page  (#8524)

* rollback changes to view logs page

* ui new build

* add interface for prefetch

* fix spread operation

* fix max size for request view page

* clean up table

* ui fix column on request logs page

* ui new build

* Add UI Support for Admins to Call /cache/ping and View Cache Analytics (#8475) (#8519)

* [Bug] UI: Newly created key does not display on the View Key Page (#8039)

- Fixed issue where all keys appeared blank for admin users.
- Implemented filtering of data via team settings to ensure all keys are displayed correctly.

* Fix:
- Updated the validator to allow model editing when `keyTeam.team_alias === "Default Team"`.
- Ensured other teams still follow the original validation rules.

* - added some classes in global.css
- added text wrap in output of request, response and metadata in index.tsx
- fixed styles of table in table.tsx

* - added full payload when we open single log entry
- added Combined Info Card in index.tsx

* fix: keys not showing on refresh for internal user

* merge

* main merge

* cache page

* ca remove

* terms change

* fix:places caching inside exp

---------

Signed-off-by: Regli Daniel <daniel.regli1@sanitas.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Kaushik Deka <55996465+Kaushikdkrikhanu@users.noreply.github.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Dani Regli <1daniregli@gmail.com>
Co-authored-by: exiao <exiao@users.noreply.github.com>
Co-authored-by: vivek-athina <153479827+vivek-athina@users.noreply.github.com>
Co-authored-by: Taha Ali <123803932+tahaali-dev@users.noreply.github.com>
2025-02-13 19:58:50 -08:00
Krish Dholakia
f5841eb84d
fix(router.py): add more deployment timeout debug information for tim… (#8523)
* fix(router.py): add more deployment timeout debug information for timeout errors

help understand why some calls in high-traffic don't respect their model-specific timeouts

* test(test_convert_dict_to_response.py): unit test ensuring empty str is not converted to None

Addresses https://github.com/BerriAI/litellm/issues/8507

* fix(convert_dict_to_response.py): handle empty message str - don't return back as 'None'

Fixes https://github.com/BerriAI/litellm/issues/8507

* test(test_completion.py): add e2e test
2025-02-13 17:10:22 -08:00
Krish Dholakia
57e5ec07cc
Improved wildcard route handling on /models and /model_group/info (#8473)
* fix(model_checks.py): update returning known model from wildcard to filter based on given model prefix

ensures wildcard route - `vertex_ai/gemini-*` just returns known vertex_ai/gemini- models

* test(test_proxy_utils.py): add unit testing for new 'get_known_models_from_wildcard' helper

* test(test_models.py): add e2e testing for `/model_group/info` endpoint

* feat(prometheus.py): support tracking total requests by user_email on prometheus

adds initial support for tracking total requests by user_email

* test(test_prometheus.py): add testing to ensure user email is always tracked

* test: update testing for new prometheus metric

* test(test_prometheus_unit_tests.py): add user email to total proxy metric

* test: update tests

* test: fix spend tests

* test: fix test

* fix(pagerduty.py): fix linting error
2025-02-11 19:37:43 -08:00
Krish Dholakia
ce3ead6f91
Log applied guardrails on LLM API call (#8452)
* fix(litellm_logging.py): support saving applied guardrails in logging object

allows list of applied guardrails to be logged for proxy admin's knowledge

* feat(spend_tracking_utils.py): log applied guardrails to spend logs

makes it easy for admin to know what guardrails were applied on a request

* ci(config.yml): uninstall posthog from ci/cd

* test: fix tests

* test: update test
2025-02-10 22:57:30 -08:00
Ishaan Jaff
00c596a852
(Feat) - Allow viewing Request/Response Logs stored in GCS Bucket (#8449)
* BaseRequestResponseFetchFromCustomLogger

* get_active_base_request_response_fetch_from_custom_logger

* get_request_response_payload

* ui_view_request_response_for_request_id

* fix uiSpendLogDetailsCall

* fix get_request_response_payload

* ui fix RequestViewer

* use 1 class AdditionalLoggingUtils

* ui_view_request_response_for_request_id

* cache the prefetch logs details

* refactor prefetch

* test view request/resp logs

* fix code quality

* fix get_request_response_payload

* uninstall posthog
prevent it from being added in ci/cd

* fix posthog

* fix traceloop test

* fix linting error
2025-02-10 20:38:55 -08:00
Krish Dholakia
e26d7df91b
Litellm dev 02 10 2025 p2 (#8443)
* Fixed issue #8246 (#8250)

* Fixed issue #8246

* Added unit tests for discard() and for remove_callback_from_list_by_object()

* fix(openai.py): support dynamic passing of organization param to openai

handles scenario where client-side org id is passed to openai

---------

Co-authored-by: Erez Hadad <erezh@il.ibm.com>
2025-02-10 17:53:46 -08:00
Krish Dholakia
1dd3713f1a
Anthropic Citations API Support (#8382)
* test(test_anthropic_completion.py): add test ensuring anthropic structured output response is consistent

Resolves https://github.com/BerriAI/litellm/issues/8291

* feat(anthropic.py): support citations api with new user document message format

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(anthropic/chat/transformation.py): return citations as a provider-specific-field

Resolves https://github.com/BerriAI/litellm/issues/7970

* feat(anthropic/chat/handler.py): add streaming citations support

Resolves https://github.com/BerriAI/litellm/issues/7970

* fix(handler.py): fix code qa error

* fix(handler.py): only set provider specific fields if non-empty dict

* docs(anthropic.md): add citations api to anthropic docs
2025-02-07 22:27:01 -08:00
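
For reference, a sketch of the request shape, following Anthropic's document content block format; per the commits above, returned citations surface as a provider-specific field. The model name is illustrative.

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20241022",  # illustrative
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "text",
                        "media_type": "text/plain",
                        "data": "The grass is green. The sky is blue.",
                    },
                    "title": "My Document",
                    "citations": {"enabled": True},
                },
                {"type": "text", "text": "What color is the grass?"},
            ],
        }
    ],
)
# citations come back as a provider-specific field, per the PR
fields = response.choices[0].message.provider_specific_fields or {}
print(fields.get("citations"))
```
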
Krish Dholakia
b5850b6b65
Handle azure deepseek reasoning response (#8288) (#8366)
* Handle azure deepseek reasoning response (#8288)

* Handle deepseek reasoning response

* Add helper method + unit test

* Fix: Follow infinity api url format (#8346)

* Follow infinity api url format

* Update test_infinity.py

* fix(infinity/transformation.py): fix linting error

---------

Co-authored-by: vibhavbhat <vibhavb00@gmail.com>
Co-authored-by: Hao Shan <53949959+haoshan98@users.noreply.github.com>
2025-02-07 17:45:51 -08:00
Krish Dholakia
f651d51f26
Litellm dev 02 07 2025 p2 (#8377)
* fix(caching_routes.py): mask redis password on `/cache/ping` route

* fix(caching_routes.py): fix linting error

* fix(caching_routes.py): fix linting error on caching routes

* fix: fix test - ignore mask_dict - has a breakpoint

* fix(azure.py): add timeout param + elapsed time in azure timeout error

* fix(http_handler.py): add elapsed time to http timeout request

makes it easier to debug how long request took before failing
2025-02-07 17:30:38 -08:00
Ishaan Jaff
b535c9bdc0
(Bug Fix - Langfuse) - fix for when model response has choices=[] (#8339)
* refactor _get_langfuse_input_output_content

* test_langfuse_logging_completion_with_malformed_llm_response

* fix _get_langfuse_input_output_content

* fixes for langfuse linting

* unit testing for get chat/text content for langfuse

* fix _should_raise_content_policy_error
2025-02-06 18:02:26 -08:00
Krish Dholakia
443ae55904
Azure OpenAI improvements - o3 native streaming, improved tool call + response format handling (#8292)
* fix(convert_dict_to_response.py): only convert if response is the response_format tool call passed in

Fixes https://github.com/BerriAI/litellm/issues/8241

* fix(gpt_transformation.py): makes sure response format / tools conversion doesn't remove previous tool calls

* refactor(gpt_transformation.py): refactor out json schema conversion to base config

keeps logic consistent across providers

* fix(o_series_transformation.py): support o3 mini native streaming

Fixes https://github.com/BerriAI/litellm/issues/8274

* fix(gpt_transformation.py): remove unused variables

* test: update test
2025-02-05 19:38:58 -08:00
Krish Dholakia
8d3a942fbd
Litellm staging (#8270)
* fix(opik.py): cleanup

* docs(opik_integration.md): cleanup opik integration docs

* fix(redact_messages.py): fix redact messages check header logic

ensures stringified bool value in header is still asserted to true

allows dynamic message redaction

* feat(redact_messages.py): support `x-litellm-enable-message-redaction` request header

allows dynamic message redaction
2025-02-04 22:35:48 -08:00
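
For reference, a sketch of toggling redaction per-request with the new header against a running proxy; the key and model alias are illustrative.

```python
import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")
response = client.chat.completions.create(
    model="gpt-4o",  # illustrative proxy model alias
    messages=[{"role": "user", "content": "some sensitive prompt"}],
    # stringified bool in the header is honored, per the fix above
    extra_headers={"x-litellm-enable-message-redaction": "true"},
)
```
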
Krish Dholakia
3c813b3a87
Fix deepseek calling - refactor to use base_llm_http_handler (#8266)
* refactor(deepseek/): move deepseek to base llm http handler

Fixes https://github.com/BerriAI/litellm/issues/8128#issuecomment-2635430457

* fix(gpt_transformation.py): support stream parsing for gpt-like calls

* test(test_deepseek_completion.py): add async streaming test

* fix(gpt_transformation.py): fix import

* fix(gpt_transformation.py): return full api base and content type
2025-02-04 22:30:00 -08:00
Krrish Dholakia
5b08289d88 fix(factory.py): fix bedrock http:// handling 2025-02-03 18:15:14 -08:00
Krish Dholakia
6834c5ecaf
Easier user onboarding via SSO (#8187)
* fix(ui_sso.py): use common `get_user_object` logic across jwt + ui sso auth

Allows finding users by their email, and attaching the sso user id to the user if found

* Improve Team Management flow on UI  (#8204)

* build(teams.tsx): refactor teams page to make it easier to add members to a team

make a row in table clickable -> allows user to add users to team they intended

* build(teams.tsx): make it clear user should click on team id to view team details

simplifies team management by putting team details on separate page

* build(team_info.tsx): separately show user id and user email

make it easy for user to understand the information they're seeing

* build(team_info.tsx): add back in 'add member' button

* build(team_info.tsx): working team member update on team_info.tsx

* build(team_info.tsx): enable team member delete on ui

allow user to delete accidental adds

* build(internal_user_endpoints.py): expose new endpoint for ui to allow filtering on user table

allows proxy admin to quickly find user they're looking for

* feat(team_endpoints.py): expose new team filter endpoint for ui

allows proxy admin to easily find team they're looking for

* feat(user_search_modal.tsx): allow admin to filter on users when adding new user to teams

* test: mark flaky test

* test: mark flaky test

* fix(exception_mapping_utils.py): fix anthropic text route error

* fix(ui_sso.py): handle situation when user not in db
2025-02-02 23:02:33 -08:00
Krish Dholakia
1105e35538
Complete o3 model support (#8183)
* fix(o_series_transformation.py): add 'reasoning_effort' as o series model param

Closes https://github.com/BerriAI/litellm/issues/8182

* fix(main.py): ensure `reasoning_effort` is a mapped openai param

* refactor(azure/): rename o1_[x] files to o_series_[x]

* refactor(base_llm_unit_tests.py): refactor testing for o series reasoning effort

* test(test_azure_o_series.py): have azure o series tests correctly inherit from base o series model tests

* feat(base_utils.py): support translating 'developer' role to 'system' role for non-openai providers

Makes it easy to switch from openai to anthropic

* fix: fix linting errors

* fix(base_llm_unit_tests.py): fix test

* fix(main.py): add missing param
2025-02-02 22:36:37 -08:00
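
For reference, a minimal sketch of the two changes above in use: `reasoning_effort` as a mapped param, plus a `developer` role message (translated to `system` for non-OpenAI providers).

```python
import litellm

response = litellm.completion(
    model="o3-mini",
    reasoning_effort="high",  # now a mapped openai param
    messages=[
        # 'developer' is translated to 'system' for non-openai providers,
        # making it easy to swap in e.g. an anthropic model
        {"role": "developer", "content": "Be concise."},
        {"role": "user", "content": "How many r's are in 'strawberry'?"},
    ],
)
print(response.choices[0].message.content)
```
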
Krish Dholakia
23f458d2da
Improved O3 + Azure O3 support (#8181)
* fix: support azure o3 model family for fake streaming workaround (#8162)

* fix: support azure o3 model family for fake streaming workaround

* refactor: rename helper to is_o_series_model for clarity

* update function calling parameters for o3 models (#8178)

* refactor(o1_transformation.py): refactor o1 config to be o series config, expand o series model check to o3

ensures max_tokens is correctly translated for o3

* feat(openai/): refactor o1 files to be 'o_series' files

expands naming to cover o3

* fix(azure/chat/o1_handler.py): azure openai is an instance of openai - was causing resets

* test(test_azure_o_series.py): assert stream faked for azure o3 mini

Resolves https://github.com/BerriAI/litellm/pull/8162

* fix(o1_transformation.py): fix o1 transformation logic to handle explicit o1_series routing

* docs(azure.md): update doc with `o_series/` model name

---------

Co-authored-by: byrongrogan <47910641+byrongrogan@users.noreply.github.com>
Co-authored-by: Low Jian Sheng <15527690+lowjiansheng@users.noreply.github.com>
2025-02-01 09:52:28 -08:00
Krish Dholakia
91ed05df29
Litellm dev contributor prs 01 31 2025 (#8168)
* Add O3-Mini for Azure and Remove Vision Support (#8161)

* Azure released O3-mini at the same time as OpenAI, so I've added support here. Confirmed to work with Sweden Central.

* [FIX] replace cgi for python 3.13 with email.Message as suggested in PEP 594 (#8160)

* Update model_prices_and_context_window.json (#8120)

codestral2501 pricing on vertex_ai

* Fix/db view names (#8119)

* Fix to case sensitive DB Views name

* Fix to case sensitive DB View names

* Added quotes to check query as well

* Added quotes to create view query

* test: handle server error for flaky test

vertex ai has unstable endpoints

---------

Co-authored-by: Wanis Elabbar <70503629+elabbarw@users.noreply.github.com>
Co-authored-by: Honghua Dong <dhh1995@163.com>
Co-authored-by: superpoussin22 <vincent.nadal@orange.fr>
Co-authored-by: Miguel Armenta <37154380+ma-armenta@users.noreply.github.com>
2025-02-01 09:05:20 -08:00
Ishaan Jaff
2cf0daa31c
(Fixes) OpenAI Streaming Token Counting + Fixes usage track when litellm.turn_off_message_logging=True (#8156)
* working streaming usage tracking

* fix test_async_chat_openai_stream_options

* fix await asyncio.sleep(1)

* test_async_chat_azure

* fix s3 logging

* fix get_stream_options

* fix get_stream_options

* fix streaming handler

* test_stream_token_counting_with_redaction

* fix codeql concern
2025-01-31 15:06:37 -08:00
Krish Dholakia
de261e2120
Doc updates + management endpoint fixes (#8138)
* Litellm dev 01 29 2025 p4 (#8107)

* fix(key_management_endpoints.py): always get db team

Fixes https://github.com/BerriAI/litellm/issues/7983

* test(test_key_management.py): add unit test enforcing check_db_only is always true on key generate checks

* test: fix test

* test: skip gemini thinking

* Litellm dev 01 29 2025 p3 (#8106)

* fix(__init__.py): reduces size of __init__.py and reduces scope for errors by using correct param

* refactor(__init__.py): refactor init by cleaning up redundant params

* refactor(__init__.py): move more constants into constants.py

cleanup root

* refactor(__init__.py): more cleanup

* feat(__init__.py): expose new 'disable_hf_tokenizer_download' param

enables hf model usage in offline env

* docs(config_settings.md): document new disable_hf_tokenizer_download param

* fix: fix linting error

* fix: fix unsafe comparison

* test: fix test

* docs(public_teams.md): add doc showing how to expose public teams for users to join

* docs: add beta disclaimer on public teams

* test: update tests
2025-01-30 22:56:41 -08:00
Krish Dholakia
69a6da4727
Litellm dev 01 30 2025 p2 (#8134)
* feat(lowest_tpm_rpm_v2.py): fix redis cache check to use >= instead of >

makes it consistent

* test(test_custom_guardrails.py): add more unit testing on default on guardrails

ensure it runs even if the user-sent guardrail list is empty

* docs(quick_start.md): clarify default on guardrails run even if user guardrails list contains other guardrails

* refactor(litellm_logging.py): refactor no-log to helper util

allows for more consistent behavior

* feat(litellm_logging.py): add event hook to verbose logs

* fix(litellm_logging.py): add unit testing to ensure `litellm.disable_no_log_param` is respected

* docs(logging.md): document how to disable 'no-log' param

* test: fix test to handle feb

* test: cleanup old bedrock model

* fix: fix router check
2025-01-30 22:18:53 -08:00