Commit graph

425 commits

Author SHA1 Message Date
Krrish Dholakia
a5b497667c fix(logging_utils.py): revert change 2025-03-16 21:04:41 -07:00
Krrish Dholakia
a99251a4ab fix(streaming_handler.py): raise stop iteration post-finish reason 2025-03-16 20:40:41 -07:00
Krrish Dholakia
bde9ae8a95 fix(litellm_logging.py): remove unused import 2025-03-16 20:24:27 -07:00
Krrish Dholakia
c0a76427d2 fix(streaming_handler.py): pass complete streaming response on completion 2025-03-16 20:22:12 -07:00
Krrish Dholakia
08b297230e fix(streaming_handler.py): return model response on finished chunk 2025-03-16 13:05:46 -07:00
Krrish Dholakia
612d5a284d refactor(litellm_logging.py): delegate returning a complete response to the streaming_handler
Removes incorrect logic for calculating complete streaming response from litellm logging
2025-03-15 09:55:33 -07:00
Krrish Dholakia
dd2c980d5b fix(utils.py): Prevents final chunk w/ usage from being ignored
Fixes https://github.com/BerriAI/litellm/issues/7112
2025-03-15 09:12:14 -07:00
Krrish Dholakia
a9dceacc1b fix(factory.py): reduce ollama pt LOC < 50 2025-03-14 21:10:05 -07:00
Krish Dholakia
59fd58643b
Merge pull request #9261 from briandevvn/fix_ollama_pt
Fix "system" role has become unacceptable in ollama
2025-03-14 20:13:28 -07:00
Krrish Dholakia
f089b1e23f feat(endpoints.py): support adding credentials by model id
Allows user to reuse existing model credentials
2025-03-14 12:32:32 -07:00
Krrish Dholakia
605a4d1121 feat(endpoints.py): enable retrieving existing credentials by model name
Enables reusing existing credentials
2025-03-14 12:02:50 -07:00
Brian Dev
12db28b0af Support 'system' role ollama 2025-03-15 00:55:18 +07:00
Ishaan Jaff
276a7089df
Merge pull request #9220 from BerriAI/litellm_qa_responses_api
[Fixes] Responses API - allow /responses and subpaths as LLM API route + Add exception mapping for responses API
2025-03-13 21:36:59 -07:00
Ishaan Jaff
7827c275ba exception_type 2025-03-13 20:09:32 -07:00
Sunny Wan
f9a5109203
Merge branch 'BerriAI:main' into main 2025-03-13 19:37:22 -04:00
Ishaan Jaff
15d618f5b1 Add exception mapping for responses API 2025-03-13 15:57:58 -07:00
Ishaan Jaff
1ee6b7852f fix exception_type 2025-03-13 15:33:17 -07:00
Krish Dholakia
cff1c1f7d8
Merge branch 'main' into litellm_dev_03_12_2025_p1 2025-03-12 22:14:02 -07:00
Krrish Dholakia
52926408cd feat(credential_accessor.py): fix upserting new credentials via accessor 2025-03-12 19:03:37 -07:00
Krrish Dholakia
738c0b873d fix(azure_ai/transformation.py): support passing api version to azure ai services endpoint
Fixes https://github.com/BerriAI/litellm/issues/7275
2025-03-12 15:16:42 -07:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Ishaan Jaff
c2dbcb798f working streaming logging + cost tracking 2025-03-12 07:27:53 -07:00
Ishaan Jaff
46bc76d3e6 _get_assembled_streaming_response 2025-03-12 07:21:03 -07:00
Ishaan Jaff
122c11d346 revert to older logging implementation 2025-03-12 07:14:36 -07:00
Ishaan Jaff
fde75a068a working streaming logging 2025-03-12 00:02:39 -07:00
Ishaan Jaff
51dc24a405 _transform_response_api_usage_to_chat_usage 2025-03-11 22:26:44 -07:00
Ishaan Jaff
24cb83b0e4 Response API cost tracking 2025-03-11 22:02:14 -07:00
Krrish Dholakia
9af73f339a test: fix tests 2025-03-11 17:42:36 -07:00
Krrish Dholakia
152bc67d22 refactor(azure.py): working azure client init on audio speech endpoint 2025-03-11 14:19:45 -07:00
Krrish Dholakia
92881ee79e fix: fix linting error 2025-03-10 21:22:00 -07:00
Krrish Dholakia
f56c5ca380 feat: working e2e credential management - support reusing existing credentials 2025-03-10 19:29:24 -07:00
Krrish Dholakia
fdd5ba3084 feat(credential_accessor.py): support loading in credentials from credential_list
Resolves https://github.com/BerriAI/litellm/issues/9114
2025-03-10 17:15:58 -07:00
Krrish Dholakia
bfbe26b91d feat(azure.py): add azure bad request error support 2025-03-10 15:59:06 -07:00
Krrish Dholakia
5f87dc229a feat(openai.py): bubble all error information back to client 2025-03-10 15:27:43 -07:00
Krish Dholakia
f899b828cf
Support openrouter reasoning_content on streaming (#9094)
* feat(convert_dict_to_response.py): support openrouter format of reasoning content

* fix(transformation.py): fix openrouter streaming with reasoning content

Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962

* fix: fix type error
2025-03-09 20:03:59 -07:00
Krish Dholakia
e00d4fb18c
Litellm dev 03 08 2025 p3 (#9089)
* feat(ollama_chat.py): pass down http client to ollama_chat

enables easier testing

* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint

Fixes https://github.com/BerriAI/litellm/issues/6683

* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
Krish Dholakia
4330ef8e81
Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 42s
* feat(batches/): fix batch cost calculation - ensure it's accurate

use the correct cost value - prev. defaulting to non-batch cost

* feat(batch_utils.py): log batch models to spend logs + standard logging payload

makes it easy to understand how cost was calculated

* fix: fix stored payload for test

* test: fix test
2025-03-08 11:47:25 -08:00
Ishaan Jaff
e2d612efd9
Bug fix - String data: stripped from entire content in streamed Gemini responses (#9070)
* _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* use _strip_sse_data_from_chunk

* _strip_sse_data_from_chunk

* test_strip_sse_data_from_chunk

* _strip_sse_data_from_chunk

* testing

* _strip_sse_data_from_chunk
2025-03-07 21:06:39 -08:00
Krish Dholakia
0e3caf92b9
UI - new API Playground for testing LiteLLM translation (#9073)
* feat: initial commit - enable dev to see translated request

* feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm

* feat(transform_request.tsx): allow user to see their transformed request

* refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body

easier to render each individually on UI vs. extracting from combined string

* feat: transform_request.tsx

working e2e raw request viewing

* fix(litellm_logging.py): fix transform viewing for bedrock models

* fix(litellm_logging.py): don't return sensitive headers in raw request headers

prevent accidental leak

* feat(transform_request.tsx): style improvements
2025-03-07 19:39:31 -08:00
Ishaan Jaff
b02af305de
[Feat] - Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) (#9029)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 14s
* if merge_reasoning_content_in_choices

* _optional_combine_thinking_block_in_choices

* stash changes

* working merge_reasoning_content_in_choices with bedrock

* fix litellm_params accessor

* fix streaming handler

* merge_reasoning_content_in_choices

* _optional_combine_thinking_block_in_choices

* test_bedrock_stream_thinking_content_openwebui

* merge_reasoning_content_in_choices

* fix for _optional_combine_thinking_block_in_choices

* linting error fix
2025-03-06 18:32:58 -08:00
Ishaan Jaff
f47987e673
(Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
* anthropic_messages_handler v0

* fix /messages

* working messages with router methods

* test_anthropic_messages_handler_litellm_router_non_streaming

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* AnthropicMessagesConfig

* _handle_anthropic_messages_response_logging

* working with /v1/messages endpoint

* working /v1/messages endpoint

* refactor to use router factory function

* use aanthropic_messages

* use BaseConfig for Anthropic /v1/messages

* track api key, team on /v1/messages endpoint

* fix get_logging_payload

* BaseAnthropicMessagesTest

* align test config

* test_anthropic_messages_with_thinking

* test_anthropic_streaming_with_thinking

* fix - display anthropic url for debugging

* test_bad_request_error_handling

* test_anthropic_messages_router_streaming_with_bad_request

* fix ProxyException

* test_bad_request_error_handling_streaming

* use provider_specific_header

* test_anthropic_messages_with_extra_headers

* test_anthropic_messages_to_wildcard_model

* fix gcs pub sub test

* standard_logging_payload

* fix unit testing for anthopic /v1/messages support

* fix pass through anthropic messages api

* delete dead code

* fix anthropic pass through response

* revert change to spend tracking utils

* fix get_litellm_metadata_from_kwargs

* fix spend logs payload json

* proxy_pass_through_endpoint_tests

* TestAnthropicPassthroughBasic

* fix pass through tests

* test_async_vertex_proxy_route_api_key_auth

* _handle_anthropic_messages_response_logging

* vertex_credentials

* test_set_default_vertex_config

* test_anthropic_messages_litellm_router_non_streaming_with_logging

* test_ageneric_api_call_with_fallbacks_basic

* test__aadapter_completion
2025-03-06 00:43:08 -08:00
Krish Dholakia
f6535ae6ad
Support format param for specifying image type (#9019)
* fix(transformation.py): support a 'format' parameter for image's

allow user to specify mime type

* fix: pass mimetype via 'format' param

* feat(gemini/chat/transformation.py): support 'format' param for gemini

* fix(factory.py): support 'format' param on sync bedrock converse calls

* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls

* refactor(factory.py): move to supporting 'format' param in base helper

ensures consistency in param support

* feat(gpt_transformation.py): filter out 'format' param

don't send invalid param to openai

* fix(gpt_transformation.py): fix translation

* fix: fix translation error
2025-03-05 19:52:53 -08:00
Krish Dholakia
ec4f665e29
Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] (#9021)
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)

Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* test: update test to enforce signature found

* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic

* fix: fix linting error

---------

Co-authored-by: Martin Krasser <krasserm@googlemail.com>
2025-03-05 19:33:54 -08:00
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'

* feat(batches/): ensure batches logs are written to db

makes batches response dict compatible

* fix(cost_calculator.py): handle batch response being a dictionary

* fix(batches/main.py): modify retrieve endpoints to use @client decorator

enables logging to work on retrieve call

* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible

* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type

create batch and retrieve batch share the same id

* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted

* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline

ensures cost is always logged to db

* fix: fix linting errors

* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia
662c59adcf
Support caching on reasoning content + other fixes (#8973)
* fix(factory.py): pass on anthropic thinking content from assistant call

* fix(factory.py): fix anthropic messages to handle thinking blocks

Fixes https://github.com/BerriAI/litellm/issues/8961

* fix(factory.py): fix bedrock handling for assistant content in messages

Fixes https://github.com/BerriAI/litellm/issues/8961

* feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block

ensures caching works for anthropic thinking block

* fix(convert_dict_to_response.py): pass all message params to delta block

ensures streaming delta also contains the reasoning content / thinking block

* test(test_prompt_factory.py): remove redundant test

anthropic now supports assistant as the first message

* fix(factory.py): fix linting errors

* fix: fix code qa

* test: remove falsy test

* fix(litellm_logging.py): fix str conversion
2025-03-04 21:12:16 -08:00
Sunny Wan
f2c2266fd7
Merge branch 'BerriAI:main' into main 2025-03-03 21:37:43 -05:00
Sunny Wan
bdd03405fe Removed unnecessary comments 2025-03-03 18:18:24 -05:00
Krish Dholakia
94d28d59e4
Fix deepseek 'reasoning_content' error (#8963)
* fix(streaming_handler.py): fix deepseek reasoning content streaming

Fixes https://github.com/BerriAI/litellm/issues/8939

* test(test_streaming_handler.py): add unit test to streaming handle 'is_chunk_non_empty' function

ensures 'reasoning_content' is handled correctly
2025-03-03 14:34:10 -08:00
Sunny Wan
fd090c8043 [FEAT] Added snowflake completion provider 2025-03-03 01:20:00 -05:00
Ishaan Jaff
bc9b3e4847
(Bug fix) - don't log messages in model_parameters in StandardLoggingPayload (#8932)
* define model param helper

* use ModelParamHelper

* get_standard_logging_model_parameters

* fix code quality

* get_standard_logging_model_parameters

* StandardLoggingPayload

* test_get_kwargs_for_cache_key

* test_langsmith_key_based_logging

* fix code qa

* fix linting
2025-03-01 13:39:45 -08:00