Commit graph

425 commits

Author SHA1 Message Date
Krrish Dholakia
01fc7f4175 fix(logging_utils.py): revert change 2025-03-16 21:04:41 -07:00
Krrish Dholakia
85cf702deb fix(streaming_handler.py): raise stop iteration post-finish reason 2025-03-16 20:40:41 -07:00
Krrish Dholakia
08e73d66a1 fix(litellm_logging.py): remove unused import 2025-03-16 20:24:27 -07:00
Krrish Dholakia
7446038d26 fix(streaming_handler.py): pass complete streaming response on completion 2025-03-16 20:22:12 -07:00
Krrish Dholakia
4d3f4b31d1 fix(streaming_handler.py): return model response on finished chunk 2025-03-16 13:05:46 -07:00
Krrish Dholakia
82320a9b65 refactor(litellm_logging.py): delegate returning a complete response to the streaming_handler
Removes incorrect logic for calculating complete streaming response from litellm logging
2025-03-15 09:55:33 -07:00
Krrish Dholakia
424f51cc06 fix(utils.py): Prevents final chunk w/ usage from being ignored
Fixes https://github.com/BerriAI/litellm/issues/7112
2025-03-15 09:12:14 -07:00
Krrish Dholakia
d818530265 fix(factory.py): reduce ollama pt LOC < 50 2025-03-14 21:10:05 -07:00
Krish Dholakia
7b189e3085 Merge pull request #9261 from briandevvn/fix_ollama_pt
Fix "system" role has become unacceptable in ollama
2025-03-14 20:13:28 -07:00
Krrish Dholakia
b75cd3b887 feat(endpoints.py): support adding credentials by model id
Allows user to reuse existing model credentials
2025-03-14 12:32:32 -07:00
Krrish Dholakia
913dc5b73b feat(endpoints.py): enable retrieving existing credentials by model name
Enables reusing existing credentials
2025-03-14 12:02:50 -07:00
Brian Dev
f53e365170 Support 'system' role ollama 2025-03-15 00:55:18 +07:00
Ishaan Jaff
ceb8668e4a Merge pull request #9220 from BerriAI/litellm_qa_responses_api
[Fixes] Responses API - allow /responses and subpaths as LLM API route + Add exception mapping for responses API
2025-03-13 21:36:59 -07:00
Ishaan Jaff
a6e04aeffb exception_type 2025-03-13 20:09:32 -07:00
Sunny Wan
e01d12b878 Merge branch 'BerriAI:main' into main 2025-03-13 19:37:22 -04:00
Ishaan Jaff
c2ed7add37 Add exception mapping for responses API 2025-03-13 15:57:58 -07:00
Ishaan Jaff
acdc2d8266 fix exception_type 2025-03-13 15:33:17 -07:00
Krish Dholakia
72f92853e0 Merge branch 'main' into litellm_dev_03_12_2025_p1 2025-03-12 22:14:02 -07:00
Krrish Dholakia
d024a5d703 feat(credential_accessor.py): fix upserting new credentials via accessor 2025-03-12 19:03:37 -07:00
Krrish Dholakia
c76cf6ad6c fix(azure_ai/transformation.py): support passing api version to azure ai services endpoint
Fixes https://github.com/BerriAI/litellm/issues/7275
2025-03-12 15:16:42 -07:00
Krish Dholakia
103b3cb574 Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Ishaan Jaff
2a68c56bc0 working streaming logging + cost tracking 2025-03-12 07:27:53 -07:00
Ishaan Jaff
c1b9c4cc7b _get_assembled_streaming_response 2025-03-12 07:21:03 -07:00
Ishaan Jaff
a3d1e39164 revert to older logging implementation 2025-03-12 07:14:36 -07:00
Ishaan Jaff
0dc5e784f5 working streaming logging 2025-03-12 00:02:39 -07:00
Ishaan Jaff
fddc1d4186 _transform_response_api_usage_to_chat_usage 2025-03-11 22:26:44 -07:00
Ishaan Jaff
d6ea064ebe Response API cost tracking 2025-03-11 22:02:14 -07:00
Krrish Dholakia
934c06c207 test: fix tests 2025-03-11 17:42:36 -07:00
Krrish Dholakia
4f4507ccc0 refactor(azure.py): working azure client init on audio speech endpoint 2025-03-11 14:19:45 -07:00
Krrish Dholakia
e58c18611f fix: fix linting error 2025-03-10 21:22:00 -07:00
Krrish Dholakia
a87f822c50 feat: working e2e credential management - support reusing existing credentials 2025-03-10 19:29:24 -07:00
Krrish Dholakia
e518e3558b feat(credential_accessor.py): support loading in credentials from credential_list
Resolves https://github.com/BerriAI/litellm/issues/9114
2025-03-10 17:15:58 -07:00
Krrish Dholakia
5d1b5f94c6 feat(azure.py): add azure bad request error support 2025-03-10 15:59:06 -07:00
Krrish Dholakia
cc0606b38d feat(openai.py): bubble all error information back to client 2025-03-10 15:27:43 -07:00
Krish Dholakia
b401f2c06f Support openrouter reasoning_content on streaming (#9094)
* feat(convert_dict_to_response.py): support openrouter format of reasoning content

* fix(transformation.py): fix openrouter streaming with reasoning content

Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962

* fix: fix type error
2025-03-09 20:03:59 -07:00
Krish Dholakia
91b41204fb Litellm dev 03 08 2025 p3 (#9089)
* feat(ollama_chat.py): pass down http client to ollama_chat

enables easier testing

* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint

Fixes https://github.com/BerriAI/litellm/issues/6683

* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
Krish Dholakia
44c9eef64f Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077)
* feat(batches/): fix batch cost calculation - ensure it's accurate

use the correct cost value - prev. defaulting to non-batch cost

* feat(batch_utils.py): log batch models to spend logs + standard logging payload

makes it easy to understand how cost was calculated

* fix: fix stored payload for test

* test: fix test
2025-03-08 11:47:25 -08:00
Ishaan Jaff
571ef58045 Bug fix - String data: stripped from entire content in streamed Gemini responses (#9070)
* _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* _strip_sse_data_from_chunk

* test_strip_sse_data_from_chunk

* _strip_sse_data_from_chunk

* testing

* _strip_sse_data_from_chunk
2025-03-07 21:06:39 -08:00
Krish Dholakia
9fc7bd0493 UI - new API Playground for testing LiteLLM translation (#9073)
* feat: initial commit - enable dev to see translated request

* feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm

* feat(transform_request.tsx): allow user to see their transformed request

* refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body

easier to render each individually on UI vs. extracting from combined string

* feat: transform_request.tsx

working e2e raw request viewing

* fix(litellm_logging.py): fix transform viewing for bedrock models

* fix(litellm_logging.py): don't return sensitive headers in raw request headers

prevent accidental leak

* feat(transform_request.tsx): style improvements
2025-03-07 19:39:31 -08:00
Ishaan Jaff
bfbbac38fc [Feat] - Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) (#9029)
* if merge_reasoning_content_in_choices

* _optional_combine_thinking_block_in_choices

* stash changes

* working merge_reasoning_content_in_choices with bedrock

* fix litellm_params accessor

* fix streaming handler

* merge_reasoning_content_in_choices

* _optional_combine_thinking_block_in_choices

* test_bedrock_stream_thinking_content_openwebui

* merge_reasoning_content_in_choices

* fix for _optional_combine_thinking_block_in_choices

* linting error fix
2025-03-06 18:32:58 -08:00
Ishaan Jaff
84a83f8c51 (Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthopic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
Krish Dholakia
5bd6e93ac7 Support format param for specifying image type (#9019)
* fix(transformation.py): support a 'format' parameter for image's

allow user to specify mime type

* fix: pass mimetype via 'format' param

* feat(gemini/chat/transformation.py): support 'format' param for gemini

* fix(factory.py): support 'format' param on sync bedrock converse calls

* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls

* refactor(factory.py): move to supporting 'format' param in base helper

ensures consistency in param support

* feat(gpt_transformation.py): filter out 'format' param

don't send invalid param to openai

* fix(gpt_transformation.py): fix translation

* fix: fix translation error
2025-03-05 19:52:53 -08:00
Krish Dholakia
dd79bc000d Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] (#9021)
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* test: update test to enforce signature found

* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic

* fix: fix linting error

---------

Co-authored-by: Martin Krasser <krasserm@googlemail.com>
2025-03-05 19:33:54 -08:00
Krish Dholakia
b43b8dc21c Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia
252c10ad08 Support caching on reasoning content + other fixes (#8973)
* fix(factory.py): pass on anthropic thinking content from assistant call

* fix(factory.py): fix anthropic messages to handle thinking blocks

Fixes https://github.com/BerriAI/litellm/issues/8961

* fix(factory.py): fix bedrock handling for assistant content in messages

Fixes https://github.com/BerriAI/litellm/issues/8961

* feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block

ensures caching works for anthropic thinking block

* fix(convert_dict_to_response.py): pass all message params to delta block

ensures streaming delta also contains the reasoning content / thinking block

* test(test_prompt_factory.py): remove redundant test

anthropic now supports assistant as the first message

* fix(factory.py): fix linting errors

* fix: fix code qa

* test: remove falsy test

* fix(litellm_logging.py): fix str conversion
2025-03-04 21:12:16 -08:00
Sunny Wan
4e4ad41de8 Merge branch 'BerriAI:main' into main 2025-03-03 21:37:43 -05:00
Sunny Wan
8333392db5 Removed unnecessary comments 2025-03-03 18:18:24 -05:00
Krish Dholakia
159d5ddaf5 Fix deepseek 'reasoning_content' error (#8963)
* fix(streaming_handler.py): fix deepseek reasoning content streaming

Fixes https://github.com/BerriAI/litellm/issues/8939

* test(test_streaming_handler.py): add unit test to streaming handle 'is_chunk_non_empty' function

ensures 'reasoning_content' is handled correctly
2025-03-03 14:34:10 -08:00
Sunny Wan
c625b3fe0d [FEAT] Added snowflake completion provider 2025-03-03 01:20:00 -05:00
Ishaan Jaff
ab3f0d8873 (Bug fix) - don't log messages in model_parameters in StandardLoggingPayload (#8932)
* define model param helper

* use ModelParamHelper

* get_standard_logging_model_parameters

* fix code quality

* get_standard_logging_model_parameters

* StandardLoggingPayload

* test_get_kwargs_for_cache_key

* test_langsmith_key_based_logging

* fix code qa

* fix linting
2025-03-01 13:39:45 -08:00