Krish Dholakia
cd5024f3b1
Merge pull request #9333 from BerriAI/litellm_dev_03_17_2025_p2
...
fix(ollama/completions/transformation.py): pass prompt, untemplated o…
2025-03-17 21:48:30 -07:00
Krrish Dholakia
22faf7d232
fix(ollama/completions/transformation.py): pass prompt, untemplated on /completions request
...
Fixes https://github.com/BerriAI/litellm/issues/6900
2025-03-17 18:35:44 -07:00
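A minimal sketch of the idea behind this fix (the helper name and shape are illustrative, not litellm's actual transformation code): on the /completions text-completion route, the user's prompt should be forwarded to ollama as-is rather than first being rendered through a chat template.

```python
# Sketch only: pass the prompt through untemplated on the text-completion
# route. `transform_text_completion_request` is a hypothetical helper.
from typing import Any, Dict


def transform_text_completion_request(prompt: str, model: str) -> Dict[str, Any]:
    # For /completions, forward the raw prompt instead of rendering it
    # through a chat template (templating belongs to the chat route).
    return {
        "model": model,
        "prompt": prompt,  # untemplated, passed through as received
        "raw": True,       # ask ollama to skip its own prompt template too
    }


print(transform_text_completion_request("def fib(n):", "codellama"))
```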
Krrish Dholakia
c4b2e0ae3d
fix(streaming_handler.py): support logging complete streaming response on cache hit
2025-03-17 18:10:39 -07:00
Krrish Dholakia
dd9e79adbd
fix(streaming_handler.py): emit deep copy of completed chunk
2025-03-17 17:26:21 -07:00
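Why a deep copy matters here, as a toy sketch (the chunk class and buffer are stand-ins, not litellm's real types): if the handler yields the very object it later logs, a downstream consumer mutating the yielded chunk silently corrupts the logged response.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:  # stand-in for a streamed chunk object
    content: str
    tool_calls: List[str] = field(default_factory=list)


def emit(chunk: Chunk, log_buffer: List[Chunk]) -> Chunk:
    # Keep a deep copy for logging so callers mutating the yielded chunk
    # (e.g. appending tool calls) can't alter the logged response.
    log_buffer.append(copy.deepcopy(chunk))
    return chunk


buffer: List[Chunk] = []
c = emit(Chunk(content="hello"), buffer)
c.tool_calls.append("mutated-by-caller")
assert buffer[0].tool_calls == []  # logged copy stays pristine
```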
Krrish Dholakia
a5b497667c
fix(logging_utils.py): revert change
2025-03-16 21:04:41 -07:00
Krrish Dholakia
a99251a4ab
fix(streaming_handler.py): raise stop iteration post-finish reason
2025-03-16 20:40:41 -07:00
Krrish Dholakia
bde9ae8a95
fix(litellm_logging.py): remove unused import
2025-03-16 20:24:27 -07:00
Krrish Dholakia
c0a76427d2
fix(streaming_handler.py): pass complete streaming response on completion
2025-03-16 20:22:12 -07:00
Krrish Dholakia
08b297230e
fix(streaming_handler.py): return model response on finished chunk
2025-03-16 13:05:46 -07:00
Krrish Dholakia
612d5a284d
refactor(litellm_logging.py): delegate returning a complete response to the streaming_handler
...
Removes incorrect logic for calculating complete streaming response from litellm logging
2025-03-15 09:55:33 -07:00
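Taken together, the streaming_handler commits above converge on one pattern: the handler buffers chunks as they stream, stops iterating once the finish-reason chunk has been handed out, and is the single owner of complete-response assembly (rather than the logging layer recomputing it). A toy sketch of that shape, with illustrative names:

```python
from typing import Iterator, List, Optional


class ToyStreamHandler:
    """Illustrative only: buffers chunks and owns complete-response assembly."""

    def __init__(self, chunks: Iterator[dict]) -> None:
        self._chunks = chunks
        self._buffer: List[dict] = []
        self._finished = False

    def __iter__(self) -> "ToyStreamHandler":
        return self

    def __next__(self) -> dict:
        if self._finished:
            # Stop only after the finish-reason chunk was already yielded.
            raise StopIteration
        chunk = next(self._chunks)
        self._buffer.append(chunk)
        if chunk.get("finish_reason") is not None:
            self._finished = True
        return chunk

    def complete_response(self) -> Optional[dict]:
        # The handler, not the logging layer, stitches deltas back together.
        if not self._buffer:
            return None
        return {
            "content": "".join(c.get("delta", "") for c in self._buffer),
            "finish_reason": self._buffer[-1].get("finish_reason"),
        }


stream = iter([{"delta": "Hi"}, {"delta": "!", "finish_reason": "stop"}])
handler = ToyStreamHandler(stream)
for _ in handler:
    pass
print(handler.complete_response())  # {'content': 'Hi!', 'finish_reason': 'stop'}
```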
Krrish Dholakia
dd2c980d5b
fix(utils.py): Prevents final chunk w/ usage from being ignored
...
Fixes https://github.com/BerriAI/litellm/issues/7112
2025-03-15 09:12:14 -07:00
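The usage bug, sketched minimally: OpenAI-style streams can send one last chunk after the finish-reason chunk that carries only `usage` and an empty `choices` list, so a consumer that bails out at the finish reason drops the token counts.

```python
from typing import Iterable, Optional


def collect_usage(chunks: Iterable[dict]) -> Optional[dict]:
    # Toy illustration: keep reading past the finish-reason chunk,
    # because the usage-only chunk (empty `choices`) arrives afterwards.
    usage = None
    for chunk in chunks:
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return usage


stream = [
    {"choices": [{"delta": {"content": "Hi"}}]},
    {"choices": [{"delta": {}, "finish_reason": "stop"}]},
    {"choices": [], "usage": {"prompt_tokens": 3, "completion_tokens": 1}},
]
print(collect_usage(stream))  # {'prompt_tokens': 3, 'completion_tokens': 1}
```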
Krrish Dholakia
a9dceacc1b
fix(factory.py): reduce ollama pt LOC < 50
2025-03-14 21:10:05 -07:00
Krish Dholakia
59fd58643b
Merge pull request #9261 from briandevvn/fix_ollama_pt
...
Fix "system" role has become unacceptable in ollama
2025-03-14 20:13:28 -07:00
Krrish Dholakia
f089b1e23f
feat(endpoints.py): support adding credentials by model id
...
Allows user to reuse existing model credentials
2025-03-14 12:32:32 -07:00
Krrish Dholakia
605a4d1121
feat(endpoints.py): enable retrieving existing credentials by model name
...
Enables reusing existing credentials
2025-03-14 12:02:50 -07:00
Brian Dev
12db28b0af
Support 'system' role in ollama
2025-03-15 00:55:18 +07:00
Ishaan Jaff
276a7089df
Merge pull request #9220 from BerriAI/litellm_qa_responses_api
...
[Fixes] Responses API - allow /responses and subpaths as LLM API route + Add exception mapping for responses API
2025-03-13 21:36:59 -07:00
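A minimal sketch of the routing rule named in the PR title (the helper below is illustrative): treat `/responses` and any subpath under it, e.g. `/responses/{id}`, as an LLM API route.

```python
def is_llm_api_route(path: str) -> bool:
    # Illustrative check: accept /responses itself and any subpath,
    # e.g. /responses/resp_123 or /responses/resp_123/input_items.
    return path == "/responses" or path.startswith("/responses/")


assert is_llm_api_route("/responses")
assert is_llm_api_route("/responses/resp_123/input_items")
assert not is_llm_api_route("/response")  # prefix match must not overreach
```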
Ishaan Jaff
7827c275ba
exception_type
2025-03-13 20:09:32 -07:00
Sunny Wan
f9a5109203
Merge branch 'BerriAI:main' into main
2025-03-13 19:37:22 -04:00
Ishaan Jaff
15d618f5b1
Add exception mapping for responses API
2025-03-13 15:57:58 -07:00
Ishaan Jaff
1ee6b7852f
fix exception_type
2025-03-13 15:33:17 -07:00
Krish Dholakia
cff1c1f7d8
Merge branch 'main' into litellm_dev_03_12_2025_p1
2025-03-12 22:14:02 -07:00
Krrish Dholakia
52926408cd
feat(credential_accessor.py): fix upserting new credentials via accessor
2025-03-12 19:03:37 -07:00
Krrish Dholakia
738c0b873d
fix(azure_ai/transformation.py): support passing api version to azure ai services endpoint
...
Fixes https://github.com/BerriAI/litellm/issues/7275
2025-03-12 15:16:42 -07:00
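The gist of the azure_ai fix, sketched with a hypothetical helper (not litellm's real transformation code): attach the caller's `api_version` to the Azure AI services endpoint as the `api-version` query parameter instead of dropping it.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def add_api_version(api_base: str, api_version: Optional[str]) -> str:
    # Hypothetical helper: append api-version as a query param,
    # preserving any params already present on the base URL.
    if not api_version:
        return api_base
    parts = urlparse(api_base)
    query = dict(parse_qsl(parts.query))
    query["api-version"] = api_version
    return urlunparse(parts._replace(query=urlencode(query)))


print(add_api_version(
    "https://example.services.ai.azure.com/models/chat/completions",
    "2024-05-01-preview",
))
```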
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3
2025-03-12 14:56:01 -07:00
Ishaan Jaff
c2dbcb798f
working streaming logging + cost tracking
2025-03-12 07:27:53 -07:00
Ishaan Jaff
46bc76d3e6
_get_assembled_streaming_response
2025-03-12 07:21:03 -07:00
Ishaan Jaff
122c11d346
revert to older logging implementation
2025-03-12 07:14:36 -07:00
Ishaan Jaff
fde75a068a
working streaming logging
2025-03-12 00:02:39 -07:00
Ishaan Jaff
51dc24a405
_transform_response_api_usage_to_chat_usage
2025-03-11 22:26:44 -07:00
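What a responses-to-chat usage transform has to do, as a sketch (the Responses API side uses OpenAI's published `input_tokens`/`output_tokens` names; the function name mirrors the commit message, but the body is illustrative):

```python
from typing import Dict


def _transform_response_api_usage_to_chat_usage(usage: Dict[str, int]) -> Dict[str, int]:
    # Responses API reports input_tokens/output_tokens; chat-completion
    # consumers (and cost tracking) expect prompt_tokens/completion_tokens.
    prompt = usage.get("input_tokens", 0)
    completion = usage.get("output_tokens", 0)
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": usage.get("total_tokens", prompt + completion),
    }


print(_transform_response_api_usage_to_chat_usage({"input_tokens": 12, "output_tokens": 34}))
```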
Ishaan Jaff
24cb83b0e4
Response API cost tracking
2025-03-11 22:02:14 -07:00
Krrish Dholakia
9af73f339a
test: fix tests
2025-03-11 17:42:36 -07:00
Krrish Dholakia
152bc67d22
refactor(azure.py): working azure client init on audio speech endpoint
2025-03-11 14:19:45 -07:00
Krrish Dholakia
92881ee79e
fix: fix linting error
2025-03-10 21:22:00 -07:00
Krrish Dholakia
f56c5ca380
feat: working e2e credential management - support reusing existing credentials
2025-03-10 19:29:24 -07:00
Krrish Dholakia
fdd5ba3084
feat(credential_accessor.py): support loading in credentials from credential_list
...
Resolves https://github.com/BerriAI/litellm/issues/9114
2025-03-10 17:15:58 -07:00
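A toy sketch of the accessor idea (data shapes are assumptions, not litellm's schema): named credential blocks live in a `credential_list`, and models reference them by `credential_name` instead of embedding raw keys, which is what makes credentials reusable across models.

```python
from typing import Dict, List, Optional

# Assumed shape: a list of named, reusable credential blocks.
credential_list: List[Dict] = [
    {
        "credential_name": "azure-prod",
        "credential_values": {"api_key": "sk-...", "api_base": "https://prod.example"},
    },
]


def get_credential_values(credential_name: str) -> Optional[Dict]:
    # Resolve a model's `credential_name` reference to actual values,
    # so multiple models can reuse one stored credential.
    for cred in credential_list:
        if cred["credential_name"] == credential_name:
            return cred["credential_values"]
    return None


print(get_credential_values("azure-prod"))
```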
Krrish Dholakia
bfbe26b91d
feat(azure.py): add azure bad request error support
2025-03-10 15:59:06 -07:00
Krrish Dholakia
5f87dc229a
feat(openai.py): bubble all error information back to client
2025-03-10 15:27:43 -07:00
Krish Dholakia
f899b828cf
Support openrouter reasoning_content on streaming (#9094)
...
* feat(convert_dict_to_response.py): support openrouter format of reasoning content
* fix(transformation.py): fix openrouter streaming with reasoning content
Fixes https://github.com/BerriAI/litellm/issues/8193#issuecomment-270892962
* fix: fix type error
2025-03-09 20:03:59 -07:00
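The OpenRouter translation in one sketch (assuming OpenRouter's `reasoning` delta field, per the PR description): surface it as the OpenAI-style `reasoning_content` key that downstream consumers expect.

```python
from typing import Dict


def translate_openrouter_delta(delta: Dict) -> Dict:
    # Sketch: move OpenRouter's `reasoning` field into the
    # `reasoning_content` slot used elsewhere in the codebase.
    out = dict(delta)
    if "reasoning" in out:
        out["reasoning_content"] = out.pop("reasoning")
    return out


print(translate_openrouter_delta({"reasoning": "Let me think...", "content": ""}))
```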
Krish Dholakia
e00d4fb18c
Litellm dev 03 08 2025 p3 (#9089)
...
* feat(ollama_chat.py): pass down http client to ollama_chat
enables easier testing
* fix(factory.py): fix passing images to ollama's `/api/generate` endpoint
Fixes https://github.com/BerriAI/litellm/issues/6683
* fix(factory.py): fix ollama pt to handle templating correctly
2025-03-09 18:20:56 -07:00
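On the image fix: ollama's `/api/generate` takes a top-level `images` list of base64 strings, separate from the prompt text, so a prompt factory has to pull image parts out of OpenAI-style message content. A rough sketch, assuming base64 data-URL images:

```python
from typing import Dict, List, Tuple


def split_prompt_and_images(messages: List[Dict]) -> Tuple[str, List[str]]:
    # Sketch: ollama /api/generate wants text in `prompt` and base64
    # payloads in a separate top-level `images` list.
    prompt_parts: List[str] = []
    images: List[str] = []
    for msg in messages:
        content = msg.get("content", "")
        if isinstance(content, str):
            prompt_parts.append(content)
            continue
        for part in content:  # OpenAI-style list-of-parts content
            if part.get("type") == "text":
                prompt_parts.append(part["text"])
            elif part.get("type") == "image_url":
                url = part["image_url"]["url"]
                # strip a data-URL header like "data:image/png;base64,"
                images.append(url.split("base64,", 1)[-1])
    return "\n".join(prompt_parts), images


prompt, imgs = split_prompt_and_images([
    {"role": "user", "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,iVBORw0KGgo="}},
    ]},
])
print(prompt, imgs)
```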
Krish Dholakia
4330ef8e81
Fix batches api cost tracking + Log batch models in spend logs / standard logging payload (#9077)
...
* feat(batches/): fix batch cost calculation - ensure it's accurate
use the correct cost value - prev. defaulting to non-batch cost
* feat(batch_utils.py): log batch models to spend logs + standard logging payload
makes it easy to understand how cost was calculated
* fix: fix stored payload for test
* test: fix test
2025-03-08 11:47:25 -08:00
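The spend-log side of this change, sketched with assumed data shapes (not litellm's actual batch utilities): record the distinct models a batch touched so the cost line item is explainable.

```python
from typing import Dict, List


def batch_models_for_spend_log(batch_requests: List[Dict]) -> List[str]:
    # Sketch only: collect the distinct models used across a batch's
    # requests so they can be attached to the spend-log entry.
    seen: List[str] = []
    for req in batch_requests:
        model = req.get("body", {}).get("model")
        if model and model not in seen:
            seen.append(model)
    return seen


print(batch_models_for_spend_log([
    {"body": {"model": "gpt-4o-mini"}},
    {"body": {"model": "gpt-4o-mini"}},
    {"body": {"model": "gpt-4o"}},
]))  # ['gpt-4o-mini', 'gpt-4o']
```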
Ishaan Jaff
e2d612efd9
Bug fix - String "data:" stripped from entire content in streamed Gemini responses (#9070)
...
* _strip_sse_data_from_chunk
* use _strip_sse_data_from_chunk
* use _strip_sse_data_from_chunk
* use _strip_sse_data_from_chunk
* _strip_sse_data_from_chunk
* test_strip_sse_data_from_chunk
* _strip_sse_data_from_chunk
* testing
* _strip_sse_data_from_chunk
2025-03-07 21:06:39 -08:00
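The bug this PR fixes is the classic over-eager strip: removing every occurrence of the literal `data:` from a Gemini SSE chunk also mangles message content that legitimately contains that string. A sketch of the safe version, which only removes the SSE field prefix at the start of the chunk (illustrative body, not litellm's exact implementation):

```python
from typing import Optional


def _strip_sse_data_from_chunk(chunk: Optional[str]) -> Optional[str]:
    # Only strip a leading SSE "data:" field name; never touch
    # occurrences of "data:" inside the payload itself.
    if chunk is None:
        return None
    if chunk.startswith("data:"):
        return chunk[len("data:"):].lstrip()
    return chunk


print(_strip_sse_data_from_chunk('data: {"text": "String data: stays intact"}'))
```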
Krish Dholakia
0e3caf92b9
UI - new API Playground for testing LiteLLM translation (#9073)
...
* feat: initial commit - enable dev to see translated request
* feat(utils.py): expose new endpoint - `/utils/transform_request` to see the raw request sent by litellm
* feat(transform_request.tsx): allow user to see their transformed request
* refactor(litellm_logging.py): return raw request in 3 parts - api_base, headers, request body
easier to render each individually on UI vs. extracting from combined string
* feat: transform_request.tsx
working e2e raw request viewing
* fix(litellm_logging.py): fix transform viewing for bedrock models
* fix(litellm_logging.py): don't return sensitive headers in raw request headers
prevent accidental leak
* feat(transform_request.tsx): style improvements
2025-03-07 19:39:31 -08:00
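The three-part raw-request shape described above, as an illustrative payload (key names are assumptions drawn from the commit description): returning api_base, headers, and body separately lets the UI render each panel without string parsing, and credential-bearing headers are redacted before they leave the server.

```python
from typing import Dict

SENSITIVE_HEADERS = {"authorization", "api-key", "x-api-key"}


def build_raw_request_view(api_base: str, headers: Dict[str, str], body: Dict) -> Dict:
    # Sketch: split the raw request into three renderable parts and
    # redact sensitive headers so they can't leak to the UI.
    safe_headers = {
        k: ("<redacted>" if k.lower() in SENSITIVE_HEADERS else v)
        for k, v in headers.items()
    }
    return {
        "raw_request_api_base": api_base,
        "raw_request_headers": safe_headers,
        "raw_request_body": body,
    }


print(build_raw_request_view(
    "https://api.openai.com/v1/chat/completions",
    {"Authorization": "Bearer sk-..."},
    {"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]},
))
```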
Ishaan Jaff
b02af305de
[Feat] - Display thinking tokens on OpenWebUI (Bedrock, Anthropic, Deepseek) (#9029)
...
* if merge_reasoning_content_in_choices
* _optional_combine_thinking_block_in_choices
* stash changes
* working merge_reasoning_content_in_choices with bedrock
* fix litellm_params accessor
* fix streaming handler
* merge_reasoning_content_in_choices
* _optional_combine_thinking_block_in_choices
* test_bedrock_stream_thinking_content_openwebui
* merge_reasoning_content_in_choices
* fix for _optional_combine_thinking_block_in_choices
* linting error fix
2025-03-06 18:32:58 -08:00
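What merging reasoning content into choices means in practice, sketched under the assumption that OpenWebUI renders inline `<think>` blocks: fold `reasoning_content` into the visible `content` so clients that only read `content` still display the thinking tokens.

```python
from typing import Dict


def merge_reasoning_into_content(message: Dict) -> Dict:
    # Sketch: wrap reasoning in <think> tags and prepend it to content,
    # for clients (e.g. OpenWebUI) that render only `content`.
    reasoning = message.pop("reasoning_content", None)
    if reasoning:
        message["content"] = f"<think>{reasoning}</think>\n" + message.get("content", "")
    return message


print(merge_reasoning_into_content({
    "role": "assistant",
    "reasoning_content": "First, consider...",
    "content": "Answer: 42",
}))
```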
Ishaan Jaff
f47987e673
(Refactor) /v1/messages to follow simpler logic for Anthropic API spec (#9013)
...
* anthropic_messages_handler v0
* fix /messages
* working messages with router methods
* test_anthropic_messages_handler_litellm_router_non_streaming
* test_anthropic_messages_litellm_router_non_streaming_with_logging
* AnthropicMessagesConfig
* _handle_anthropic_messages_response_logging
* working with /v1/messages endpoint
* working /v1/messages endpoint
* refactor to use router factory function
* use aanthropic_messages
* use BaseConfig for Anthropic /v1/messages
* track api key, team on /v1/messages endpoint
* fix get_logging_payload
* BaseAnthropicMessagesTest
* align test config
* test_anthropic_messages_with_thinking
* test_anthropic_streaming_with_thinking
* fix - display anthropic url for debugging
* test_bad_request_error_handling
* test_anthropic_messages_router_streaming_with_bad_request
* fix ProxyException
* test_bad_request_error_handling_streaming
* use provider_specific_header
* test_anthropic_messages_with_extra_headers
* test_anthropic_messages_to_wildcard_model
* fix gcs pub sub test
* standard_logging_payload
* fix unit testing for anthopic /v1/messages support
* fix pass through anthropic messages api
* delete dead code
* fix anthropic pass through response
* revert change to spend tracking utils
* fix get_litellm_metadata_from_kwargs
* fix spend logs payload json
* proxy_pass_through_endpoint_tests
* TestAnthropicPassthroughBasic
* fix pass through tests
* test_async_vertex_proxy_route_api_key_auth
* _handle_anthropic_messages_response_logging
* vertex_credentials
* test_set_default_vertex_config
* test_anthropic_messages_litellm_router_non_streaming_with_logging
* test_ageneric_api_call_with_fallbacks_basic
* test__aadapter_completion
2025-03-06 00:43:08 -08:00
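For orientation, the Anthropic /v1/messages request shape this refactor targets, shown against a locally running proxy (the URL, key, and model alias are placeholders; the payload fields follow Anthropic's public spec):

```python
import httpx

# Placeholders: point at your running LiteLLM proxy and its key.
resp = httpx.post(
    "http://localhost:4000/v1/messages",
    headers={"x-api-key": "sk-1234", "anthropic-version": "2023-06-01"},
    json={
        "model": "claude-3-7-sonnet-latest",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Hello, world"}],
    },
)
print(resp.json())
```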
Krish Dholakia
f6535ae6ad
Support format param for specifying image type (#9019)
...
* fix(transformation.py): support a 'format' parameter for image's
allow user to specify mime type
* fix: pass mimetype via 'format' param
* feat(gemini/chat/transformation.py): support 'format' param for gemini
* fix(factory.py): support 'format' param on sync bedrock converse calls
* feat(bedrock/converse_transformation.py): support 'format' param for bedrock async calls
* refactor(factory.py): move to supporting 'format' param in base helper
ensures consistency in param support
* feat(gpt_transformation.py): filter out 'format' param
don't send invalid param to openai
* fix(gpt_transformation.py): fix translation
* fix: fix translation error
2025-03-05 19:52:53 -08:00
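How the `format` param reads at the call site, per the commit description (placement on the `image_url` object is an assumption drawn from the PR; the mime type is forwarded to providers that need it and filtered out for OpenAI):

```python
# Sketch of a request using the `format` param to pin an image's
# mime type, rather than relying on auto-detection from the URL.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Describe this image."},
        {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/photo",  # no file extension to sniff
                "format": "image/jpeg",              # explicit mime type
            },
        },
    ],
}
print(message)
```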
Krish Dholakia
ec4f665e29
Return signature on anthropic streaming + migrate to signature field instead of signature_delta [MINOR bump] (#9021)
...
* Fix missing signature_delta in thinking blocks when streaming from Claude 3.7 (#8797)
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* test: update test to enforce signature found
* feat(refactor-signature-param-to-be-'signature'-instead-of-'signature_delta'): keeps it in sync with anthropic
* fix: fix linting error
---------
Co-authored-by: Martin Krasser <krasserm@googlemail.com>
2025-03-05 19:33:54 -08:00
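The shape change in this PR, illustrated: streamed thinking blocks now carry `signature`, matching Anthropic's field name, rather than the older `signature_delta`.

```python
# Before (old field name):
old_block = {"type": "thinking", "thinking": "step 1...", "signature_delta": "EqQBCg..."}

# After (kept in sync with Anthropic's spec):
new_block = {"type": "thinking", "thinking": "step 1...", "signature": "EqQBCg..."}
print(new_block)
```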
Krish Dholakia
5e386c28b2
Litellm dev 03 04 2025 p3 (#8997)
...
* fix(core_helpers.py): handle litellm_metadata instead of 'metadata'
* feat(batches/): ensure batches logs are written to db
makes batches response dict compatible
* fix(cost_calculator.py): handle batch response being a dictionary
* fix(batches/main.py): modify retrieve endpoints to use @client decorator
enables logging to work on retrieve call
* fix(batches/main.py): fix retrieve batch response type to be 'dict' compatible
* fix(spend_tracking_utils.py): send unique uuid for retrieve batch call type
create batch and retrieve batch share the same id
* fix(spend_tracking_utils.py): prevent duplicate retrieve batch calls from being double counted
* refactor(batches/): refactor cost tracking for batches - do it on retrieve, and within the established litellm_logging pipeline
ensures cost is always logged to db
* fix: fix linting errors
* fix: fix linting error
2025-03-04 21:58:03 -08:00
Krish Dholakia
662c59adcf
Support caching on reasoning content + other fixes (#8973)
...
* fix(factory.py): pass on anthropic thinking content from assistant call
* fix(factory.py): fix anthropic messages to handle thinking blocks
Fixes https://github.com/BerriAI/litellm/issues/8961
* fix(factory.py): fix bedrock handling for assistant content in messages
Fixes https://github.com/BerriAI/litellm/issues/8961
* feat(convert_dict_to_response.py): handle reasoning content + thinking blocks in chat completion block
ensures caching works for anthropic thinking block
* fix(convert_dict_to_response.py): pass all message params to delta block
ensures streaming delta also contains the reasoning content / thinking block
* test(test_prompt_factory.py): remove redundant test
anthropic now supports assistant as the first message
* fix(factory.py): fix linting errors
* fix: fix code qa
* test: remove falsy test
* fix(litellm_logging.py): fix str conversion
2025-03-04 21:12:16 -08:00
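The round-trip these factory fixes enable, sketched with assumed shapes (not litellm's exact factory code): an assistant turn that came back with thinking blocks is replayed to Anthropic with those blocks intact, which is what lets caching of reasoning content work.

```python
from typing import Dict, List


def assistant_message_to_anthropic(message: Dict) -> Dict:
    # Sketch: replay thinking blocks ahead of the text content when
    # sending a prior assistant turn back to Anthropic.
    content: List[Dict] = []
    for block in message.get("thinking_blocks", []):
        content.append(block)  # e.g. {"type": "thinking", "thinking": ..., "signature": ...}
    if message.get("content"):
        content.append({"type": "text", "text": message["content"]})
    return {"role": "assistant", "content": content}


print(assistant_message_to_anthropic({
    "content": "Answer: 42",
    "thinking_blocks": [{"type": "thinking", "thinking": "First...", "signature": "abc"}],
}))
```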
Sunny Wan
f2c2266fd7
Merge branch 'BerriAI:main' into main
2025-03-03 21:37:43 -05:00