Krish Dholakia
217681eb5e
Litellm dev 04 22 2025 p1 ( #10206 )
...
* fix(openai.py): initial commit adding generic event type for openai responses api streaming
Ensures handling for undocumented event types - e.g. "response.reasoning_summary_part.added"
* fix(transformation.py): handle unknown openai response type
* fix(datadog_llm_observability.py): handle dict[str, any] -> dict[str, str] conversion
Fixes https://github.com/BerriAI/litellm/issues/9494
* test: add more unit testing
* test: add unit test
* fix(common_utils.py): fix message with content list
* test: update testing
2025-04-22 23:58:43 -07:00
Ishaan Jaff
868cdd0226
[Feat] Add Support for DELETE /v1/responses/{response_id} on OpenAI, Azure OpenAI ( #10205 )
...
* add transform_delete_response_api_request to base responses config
* add transform_delete_response_api_request
* add delete_response_api_handler
* fixes for deleting responses, response API
* add adelete_responses
* add async test_basic_openai_responses_delete_endpoint
* test_basic_openai_responses_delete_endpoint
* working delete for streaming on responses API
* fixes azure transformation
* TestAnthropicResponsesAPITest
* fix code check
* fix linting
* fixes for get_complete_url
* test_basic_openai_responses_streaming_delete_endpoint
* streaming fixes
2025-04-22 18:27:03 -07:00
Krish Dholakia
a7db0df043
Gemini-2.5-flash improvements ( #10198 )
...
* fix(vertex_and_google_ai_studio_gemini.py): allow thinking budget = 0
Fixes https://github.com/BerriAI/litellm/issues/10121
* fix(vertex_and_google_ai_studio_gemini.py): handle nuance in counting exclusive vs. inclusive tokens
Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2052272035
2025-04-21 22:48:00 -07:00
Li Yang
10257426a2
fix(bedrock): wrong system prompt transformation ( #10120 )
...
* fix(bedrock): wrong system transformation
* chore: add one more test case
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
2025-04-21 08:48:14 -07:00
Krish Dholakia
03b5399f86
test(utils.py): handle scenario where text tokens + reasoning tokens … ( #10165 )
...
* test(utils.py): handle scenario where text tokens + reasoning tokens set, but reasoning tokens not charged separately
Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2051555332
* fix(vertex_and_google_ai_studio.py): only set content if non-empty str
2025-04-19 12:32:38 -07:00
Krish Dholakia
f08a4e3c06
Support 'file' message type for VLLM video url's + Anthropic redacted message thinking support ( #10129 )
...
* feat(hosted_vllm/chat/transformation.py): support calling vllm video url with openai 'file' message type
allows switching between gemini/vllm easily
* [WIP] redacted thinking tests (#9044 )
* WIP: redacted thinking tests
* test: add test for redacted thinking in assistant message
---------
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
* fix(anthropic/chat/transformation.py): support redacted thinking block on anthropic completion
Fixes https://github.com/BerriAI/litellm/issues/9058
* fix(anthropic/chat/handler.py): transform anthropic redacted messages on streaming
Fixes https://github.com/BerriAI/litellm/issues/9058
* fix(bedrock/): support redacted text on streaming + non-streaming
Fixes https://github.com/BerriAI/litellm/issues/9058
* feat(litellm_proxy/chat/transformation.py): support 'reasoning_effort' param for proxy
allows using reasoning effort with thinking models on proxy
* test: update tests
* fix(utils.py): fix linting error
* fix: fix linting errors
* fix: fix linting errors
* fix: fix linting error
* fix: fix linting errors
* fix(anthropic/chat/transformation.py): fix returning citations in chat completion
---------
Co-authored-by: Johann Miller <22018973+johannkm@users.noreply.github.com>
2025-04-19 11:16:37 -07:00
Krish Dholakia
2508ca71cb
Handle fireworks ai tool calling response ( #10130 )
...
* feat(fireworks_ai/chat): handle tool calling with fireworks ai correctly
Fixes https://github.com/BerriAI/litellm/issues/7209
* fix(utils.py): handle none type in message
* fix: fix model name in test
* fix(utils.py): fix validate check for openai messages
* fix: fix model returned
* fix(main.py): fix text completion routing
* test: update testing
* test: skip test - cohere having RBAC issues
2025-04-19 09:37:45 -07:00
Krish Dholakia
36308a31be
Gemini-2.5-flash - support reasoning cost calc + return reasoning content ( #10141 )
...
* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing
* build(model_prices_and_context_window.json): add gemini reasoning token pricing
* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini
allows accurate cost calc
* fix(utils.py): add reasoning token cost calc to generic cost calc
ensures gemini-2.5-flash cost calculation is accurate
* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'
* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests
allow controlling thinking effort for gemini-2.5-flash models
* test: update unit testing
* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response
* test: update model name
* fix: fix ruff check
* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object
* fix(vertex_and_google_ai_studio_gemini.py): fix translation
2025-04-19 09:20:52 -07:00
Ishaan Jaff
d3e04eac7f
[Feat] Unified Responses API - Add Azure Responses API support ( #10116 )
...
* initial commit for azure responses api support
* update get complete url
* fixes for responses API
* working azure responses API
* working responses API
* test suite for responses API
* azure responses API test suite
* fix test with complete url
* fix test refactor
* test fix metadata checks
* fix code quality check
2025-04-17 16:47:59 -07:00
Krish Dholakia
fdfa1108a6
Add property ordering for vertex ai schema ( #9828 ) + Fix combining multiple tool calls ( #10040 )
...
* fix #9783 : Retain schema field ordering for google gemini and vertex (#9828 )
* test: update test
* refactor(groq.py): initial commit migrating groq to base_llm_http_handler
* fix(streaming_chunk_builder_utils.py): fix how tool content is combined
Fixes https://github.com/BerriAI/litellm/issues/10034
* fix(vertex_ai/common_utils.py): prevent infinite loop in helper function
* fix(groq/chat/transformation.py): handle groq streaming errors correctly
* fix(groq/chat/transformation.py): handle max_retries
---------
Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-15 22:29:25 -07:00
Krish Dholakia
d3e7a137ad
Revert "fix #9783 : Retain schema field ordering for google gemini and vertex …" ( #10038 )
...
This reverts commit e3729f9855.
2025-04-15 19:21:33 -07:00
Adrian Lyjak
e3729f9855
fix #9783 : Retain schema field ordering for google gemini and vertex ( #9828 )
2025-04-15 19:12:02 -07:00
Marc Abramowitz
837a6948d8
Fix typo: Entrata -> Entra in code ( #9922 )
...
* Fix typo: Entrata -> Entra
* Fix a few more
2025-04-15 17:31:18 -07:00
Krish Dholakia
6b5f093087
Revert "Fix case where only system messages are passed to Gemini ( #9992 )" ( #10027 )
...
This reverts commit 2afd922f8c.
2025-04-15 13:34:03 -07:00
Nolan Tremelling
2afd922f8c
Fix case where only system messages are passed to Gemini ( #9992 )
2025-04-15 13:30:49 -07:00
Krish Dholakia
8faf56922c
Fix azure tenant id check from env var + response_format check on api_version 2025+ ( #9993 )
...
* fix(azure/common_utils.py): check for azure tenant id, client id, client secret in env var
Fixes https://github.com/BerriAI/litellm/issues/9598#issuecomment-2801966027
* fix(azure/gpt_transformation.py): fix passing response_format to azure when api year = 2025
Fixes https://github.com/BerriAI/litellm/issues/9703
* test: monkeypatch azure api version in test
* test: update testing
* test: fix test
* test: update test
* docs(config_settings.md): document env vars
2025-04-14 22:02:35 -07:00
Krish Dholakia
b9f01c9f5b
fix(databricks/common_utils.py): fix custom endpoint check ( #9925 )
...
* fix(databricks/common_utils.py): fix custom endpoint check
Fixes https://github.com/BerriAI/litellm/issues/9915
* fix(common_utils.py): add unit test to ensure custom_endpoint=False is handled correctly
Fixes https://github.com/BerriAI/litellm/issues/9915
2025-04-11 23:20:49 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files ( #9924 )
...
* fix(openai.py): ensure openai file object shows up on logs
* fix(managed_files.py): return unified file id as b64 str
allows retrieve file id to work as expected
* fix(managed_files.py): apply decoded file id transformation
* fix: add unit test for file id + decode logic
* fix: initial commit for litellm_proxy support with CRUD Endpoints
* fix(managed_files.py): support retrieve file operation
* fix(managed_files.py): support for DELETE endpoint for files
* fix(managed_files.py): retrieve file content support
supports retrieve file content api from openai
* fix: fix linting error
* test: update tests
* fix: fix linting error
* fix(files/main.py): pass litellm params to azure route
* test: fix test
2025-04-11 21:48:27 -07:00
Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking ( #9893 )
...
* test: initial commit fixing gemini logprobs
Fixes https://github.com/BerriAI/litellm/issues/9888
* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change
Fixes https://github.com/BerriAI/litellm/issues/8890
* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map
Fixes https://github.com/BerriAI/litellm/issues/9814
* test: add cost calculation unit testing
* test: fix test
* test: update test
2025-04-10 21:23:55 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db ( #9838 )
...
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db
Fixes https://github.com/BerriAI/litellm/issues/9732
* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens
Fixes https://github.com/BerriAI/litellm/issues/9812
* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming
reduce errors
* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens
* build: remove redisvl from requirements.txt (temporary)
* fix(spend_tracking_utils.py): handle circular references
* test: update code cov test
* test: update test
2025-04-09 21:26:43 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support ( #9744 )
...
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks
Allows user to call claude-3-7-sonnet with thinking via databricks
* refactor: refactor choices transformation + add unit testing
* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming
* feat(databricks/chat/transformation.py): support response_format for claude models
* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}
* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic
* fix: fix ruff errors
* fix: fix linting error
* test: update test
* fix(databricks/chat/transformation.py): handle json mode output parsing
* fix(databricks/chat/transformation.py): handle json mode on streaming
* test: update test
* test: update dbrx testing
* test: update testing
* fix(base_model_iterator.py): handle non-json chunk
* test: update tests
* fix: fix ruff check
* fix: fix databricks config import
* fix: handle _tool = none
* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia
e1f7bcb47d
Fix VertexAI Credential Caching issue ( #9756 )
...
* refactor(vertex_llm_base.py): Prevent credential misrouting for projects
Fixes https://github.com/BerriAI/litellm/issues/7904
* fix: passing unit tests
* fix(vertex_llm_base.py): common auth logic across sync + async vertex ai calls
prevents credential caching issue across both flows
* test: fix test
* fix(vertex_llm_base.py): handle project id in default case
* fix(factory.py): don't pass cache control if not set
bedrock invoke does not support this
* test: fix test
* fix(vertex_llm_base.py): add .exception message in load_auth
* fix: fix ruff error
2025-04-04 16:38:08 -07:00
Adrian Lyjak
d640bc0a00
fix #8425 , passthrough kwargs during acompletion, and unwrap extra_body for openrouter ( #9747 )
2025-04-03 22:19:40 -07:00
sajda
4a4328b5bb
fix:Gemini Flash 2.0 implementation is not returning the logprobs ( #9713 )
...
* fix:Gemini Flash 2.0 implementation is not returning the logprobs
* fix: linting error by adding a helper method called _process_candidates
2025-04-03 11:53:41 -07:00
Pranav Simha
2e35f07e94
Add support for max_completion_tokens to the Cohere chat transformation config ( #9701 )
2025-04-02 07:50:44 -07:00
Krish Dholakia
5ad2fbcba6
Openrouter streaming fixes + Anthropic 'file' message support ( #9667 )
...
* fix(openrouter/transformation.py): Handle error in openrouter stream
Fixes https://github.com/Aider-AI/aider/issues/3550
* test(test_openrouter_chat_transformation.py): add unit tests
* feat(anthropic/chat/transformation.py): add openai 'file' message content type support
Closes https://github.com/BerriAI/litellm/issues/9463
* fix(factory.py): add bedrock converse support for openai 'file' message content type
Closes https://github.com/BerriAI/litellm/issues/9463
2025-03-31 21:22:59 -07:00
Krish Dholakia
46b3dbde8f
Revert "fix: Anthropic prompt caching on GCP Vertex AI ( #9605 )" ( #9670 )
...
This reverts commit a8673246dc.
2025-03-31 17:13:55 -07:00
Ishaan Jaff
ca4ed9ff2e
ref issue
2025-03-31 16:05:10 -07:00
Ishaan Jaff
bc66827537
test_aiter_bytes_valid_chunk_followed_by_unicode_error
2025-03-31 16:04:38 -07:00
Sam
a8673246dc
fix: Anthropic prompt caching on GCP Vertex AI ( #9605 )
...
* fix: Anthropic prompt caching on GCP Vertex AI
* test(vertex): anthropic prompt caching
2025-03-29 23:40:34 -07:00
Krish Dholakia
5ac61a7572
Add bedrock latency optimized inference support ( #9623 )
...
* fix(converse_transformation.py): add performanceConfig param support on bedrock
Closes https://github.com/BerriAI/litellm/issues/7606
* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks
* test(test_main.py): add e2e mock test for bedrock performance config
* build(model_prices_and_context_window.json): add versioned multimodal embedding
* refactor(multimodal_embeddings/): migrate to config pattern
* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls
Enables cost calculation for multimodal embeddings
* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls
ensures accurate cost tracking for vertexai multimodal embedding calls
* fix(embedding_handler.py): remove unused imports
* fix: fix linting errors
* fix: handle response api usage calculation
* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests
* test: mark flaky test
* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input
* docs(vertex.md): document sending text + image to vertex multimodal embeddings
* test: remove incorrect file
* fix(multimodal_embeddings/transformation.py): fix linting error
* style: remove unused import
2025-03-29 00:23:09 -07:00
Nicholas Grabar
09daeac188
Rebasing 2
2025-03-28 15:18:09 -07:00
Nicholas Grabar
06a45706b2
Rebase 3
2025-03-28 15:18:05 -07:00
Nicholas Grabar
1f2bbda11d
Add recursion depth to convert_anyof_null_to_nullable, constants.py. Fix recursive_detector.py raise error state
2025-03-28 13:11:19 -07:00
NickGrab
b72fbdde74
Merge branch 'main' into litellm_8864-feature-vertex-anyOf-support
2025-03-28 10:25:04 -07:00
Krish Dholakia
b9d0f460e8
Revert "Support max_completion_tokens on Mistral ( #9589 )" ( #9604 )
...
This reverts commit fef5d23dd5.
2025-03-27 19:14:26 -07:00
Chris Mancuso
fef5d23dd5
Support max_completion_tokens on Mistral ( #9589 )
...
* Support max_completion_tokens on Mistral
* test fix
2025-03-27 17:27:19 -07:00
Krish Dholakia
c0845fec1f
Add OpenAI gpt-4o-transcribe support ( #9517 )
...
* refactor: introduce new transformation config for gpt-4o-transcribe models
* refactor: expose new transformation configs for audio transcription
* ci: fix config yml
* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions
allows gpt-4o and whisper audio transformation to work as expected
* refactor: migrate fireworks ai + deepgram to new transform request pattern
* feat(openai/): working support for gpt-4o-audio-transcribe
* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map
* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`
* fix(get_supported_openai_params.py): fix return
* refactor(deepgram/): migrate unit test to deepgram handler
* refactor: cleanup unused imports
* fix(get_supported_openai_params.py): fix linting error
* test: update test
2025-03-26 23:10:25 -07:00
Ishaan Jaff
0aae9aa24a
rename _is_model_gemini_spec_model
2025-03-26 14:28:26 -07:00
Ishaan Jaff
c38b41f65b
test_get_supports_system_message
2025-03-26 14:26:08 -07:00
Ishaan Jaff
72f08bc6ea
unit tests for VertexGeminiConfig
2025-03-26 14:21:35 -07:00
Nicholas Grabar
f68cc26f15
8864 Add support for anyOf union type while handling null fields
2025-03-25 22:37:28 -07:00
Krish Dholakia
92883560f0
fix vertex ai multimodal embedding translation ( #9471 )
...
* remove data:image/jpeg;base64, prefix from base64 image input
vertex_ai's multimodal embeddings endpoint expects a raw base64 string without `data:image/jpeg;base64,` prefix.
* Add Vertex Multimodal Embedding Test
* fix(test_vertex.py): add e2e tests on multimodal embeddings
* test: unit testing
* test: remove sklearn dep
* test: update test with fixed route
* test: fix test
---------
Co-authored-by: Jonarod <jonrodd@gmail.com>
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
2025-03-24 23:23:28 -07:00
Krish Dholakia
a619580bf8
Add vertexai topLogprobs support ( #9518 )
...
* Added support for top_logprobs in vertex gemini models
* Testing for top_logprobs feature in vertexai
* Update litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
* refactor(tests/): refactor testing to be in correct repo
---------
Co-authored-by: Aditya Thaker <adityathaker28@gmail.com>
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
2025-03-24 22:42:38 -07:00
Krrish Dholakia
3ce3689282
test: migrate testing
2025-03-22 12:48:53 -07:00
Krrish Dholakia
81a1494a51
test: add unit testing
2025-03-21 10:35:36 -07:00
Ishaan Jaff
15048de5e2
test_prepare_fake_stream_request
2025-03-20 14:50:00 -07:00
Ishaan Jaff
247e4d09ee
Merge branch 'main' into litellm_fix_ssl_verify
2025-03-19 21:03:06 -07:00
Krrish Dholakia
9adad381b4
fix(common_utils.py): handle cris only model
...
Fixes https://github.com/BerriAI/litellm/issues/9161#issuecomment-2734905153
2025-03-18 23:35:43 -07:00
Ishaan Jaff
65083ca8da
get_openai_client_cache_key
2025-03-18 18:35:50 -07:00