Commit graph

63 commits

Author SHA1 Message Date
Krish Dholakia
b9f01c9f5b
fix(databricks/common_utils.py): fix custom endpoint check (#9925)
* fix(databricks/common_utils.py): fix custom endpoint check

Fixes https://github.com/BerriAI/litellm/issues/9915

* fix(common_utils.py): add unit test to ensure custom_endpoint=False is handled correctly

Fixes https://github.com/BerriAI/litellm/issues/9915
2025-04-11 23:20:49 -07:00
Krish Dholakia
3ca82c22b6
Support CRUD endpoints for Managed Files (#9924)
* fix(openai.py): ensure openai file object shows up on logs

* fix(managed_files.py): return unified file id as b64 str

allows retrieve file id to work as expected

* fix(managed_files.py): apply decoded file id transformation

* fix: add unit test for file id + decode logic

* fix: initial commit for litellm_proxy support with CRUD Endpoints

* fix(managed_files.py): support retrieve file operation

* fix(managed_files.py): support for DELETE endpoint for files

* fix(managed_files.py): retrieve file content support

supports retrieve file content api from openai

* fix: fix linting error

* test: update tests

* fix: fix linting error

* fix(files/main.py): pass litellm params to azure route

* test: fix test
2025-04-11 21:48:27 -07:00
Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking (#9893)
* test: initial commit fixing gemini logprobs

Fixes https://github.com/BerriAI/litellm/issues/9888

* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change

Fixes https://github.com/BerriAI/litellm/issues/8890

* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9814

* test: add cost calculation unit testing

* test: fix test

* test: update test
2025-04-10 21:23:55 -07:00
Krish Dholakia
87733c8193
Fix anthropic prompt caching cost calc + trim logged message in db (#9838)
* fix(spend_tracking_utils.py): prevent logging entire mp4 files to db

Fixes https://github.com/BerriAI/litellm/issues/9732

* fix(anthropic/chat/transformation.py): Fix double counting cache creation input tokens

Fixes https://github.com/BerriAI/litellm/issues/9812

* refactor(anthropic/chat/transformation.py): refactor streaming to use same usage calculation block as non-streaming

reduce errors

* fix(bedrock/chat/converse_transformation.py): don't increment prompt tokens with cache_creation_input_tokens

* build: remove redisvl from requirements.txt (temporary)

* fix(spend_tracking_utils.py): handle circular references

* test: update code cov test

* test: update test
2025-04-09 21:26:43 -07:00
Krish Dholakia
5099aac1a5
Add DBRX Anthropic w/ thinking + response_format support (#9744)
* feat(databricks/chat/): add anthropic w/ reasoning content support via databricks

Allows user to call claude-3-7-sonnet with thinking via databricks

* refactor: refactor choices transformation + add unit testing

* fix(databricks/chat/transformation.py): support thinking blocks on databricks response streaming

* feat(databricks/chat/transformation.py): support response_format for claude models

* fix(databricks/chat/transformation.py): correctly handle response_format={"type": "text"}

* feat(databricks/chat/transformation.py): support 'reasoning_effort' param mapping for anthropic

* fix: fix ruff errors

* fix: fix linting error

* test: update test

* fix(databricks/chat/transformation.py): handle json mode output parsing

* fix(databricks/chat/transformation.py): handle json mode on streaming

* test: update test

* test: update dbrx testing

* test: update testing

* fix(base_model_iterator.py): handle non-json chunk

* test: update tests

* fix: fix ruff check

* fix: fix databricks config import

* fix: handle _tool = none

* test: skip invalid test
2025-04-04 22:13:32 -07:00
Krish Dholakia
e1f7bcb47d
Fix VertexAI Credential Caching issue (#9756)
* refactor(vertex_llm_base.py): Prevent credential misrouting for projects

Fixes https://github.com/BerriAI/litellm/issues/7904

* fix: passing unit tests

* fix(vertex_llm_base.py): common auth logic across sync + async vertex ai calls

prevents credential caching issue across both flows

* test: fix test

* fix(vertex_llm_base.py): handle project id in default case

* fix(factory.py): don't pass cache control if not set

bedrock invoke does not support this

* test: fix test

* fix(vertex_llm_base.py): add .exception message in load_auth

* fix: fix ruff error
2025-04-04 16:38:08 -07:00
Adrian Lyjak
d640bc0a00
fix #8425, passthrough kwargs during acompletion, and unwrap extra_body for openrouter (#9747) 2025-04-03 22:19:40 -07:00
sajda
4a4328b5bb
fix: Gemini Flash 2.0 implementation is not returning the logprobs (#9713)
* fix: Gemini Flash 2.0 implementation is not returning the logprobs

* fix: linting error by adding a helper method called _process_candidates
2025-04-03 11:53:41 -07:00
Pranav Simha
2e35f07e94
Add support for max_completion_tokens to the Cohere chat transformation config (#9701) 2025-04-02 07:50:44 -07:00
Krish Dholakia
5ad2fbcba6
Openrouter streaming fixes + Anthropic 'file' message support (#9667)
* fix(openrouter/transformation.py): Handle error in openrouter stream

Fixes https://github.com/Aider-AI/aider/issues/3550

* test(test_openrouter_chat_transformation.py): add unit tests

* feat(anthropic/chat/transformation.py): add openai 'file' message content type support

Closes https://github.com/BerriAI/litellm/issues/9463

* fix(factory.py): add bedrock converse support for openai 'file' message content type

Closes https://github.com/BerriAI/litellm/issues/9463
2025-03-31 21:22:59 -07:00
Krish Dholakia
46b3dbde8f
Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)
This reverts commit a8673246dc.
2025-03-31 17:13:55 -07:00
Ishaan Jaff
ca4ed9ff2e ref issue 2025-03-31 16:05:10 -07:00
Ishaan Jaff
bc66827537 test_aiter_bytes_valid_chunk_followed_by_unicode_error 2025-03-31 16:04:38 -07:00
Sam
a8673246dc
fix: Anthropic prompt caching on GCP Vertex AI (#9605)
* fix: Anthropic prompt caching on GCP Vertex AI

* test(vertex): anthropic prompt caching
2025-03-29 23:40:34 -07:00
Krish Dholakia
5ac61a7572
Add bedrock latency optimized inference support (#9623)
* fix(converse_transformation.py): add performanceConfig param support on bedrock

Closes https://github.com/BerriAI/litellm/issues/7606

* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks

* test(test_main.py): add e2e mock test for bedrock performance config

* build(model_prices_and_context_window.json): add versioned multimodal embedding

* refactor(multimodal_embeddings/): migrate to config pattern

* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls

Enables cost calculation for multimodal embeddings

* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls

ensures accurate cost tracking for vertexai multimodal embedding calls

* fix(embedding_handler.py): remove unused imports

* fix: fix linting errors

* fix: handle response api usage calculation

* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests

* test: mark flaky test

* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input

* docs(vertex.md): document sending text + image to vertex multimodal embeddings

* test: remove incorrect file

* fix(multimodal_embeddings/transformation.py): fix linting error

* style: remove unused import
2025-03-29 00:23:09 -07:00
Nicholas Grabar
09daeac188 Rebasing 2 2025-03-28 15:18:09 -07:00
Nicholas Grabar
06a45706b2 Rebase 3 2025-03-28 15:18:05 -07:00
Nicholas Grabar
1f2bbda11d Add recursion depth to convert_anyof_null_to_nullable, constants.py. Fix recursive_detector.py raise error state 2025-03-28 13:11:19 -07:00
NickGrab
b72fbdde74
Merge branch 'main' into litellm_8864-feature-vertex-anyOf-support 2025-03-28 10:25:04 -07:00
Krish Dholakia
b9d0f460e8
Revert "Support max_completion_tokens on Mistral (#9589)" (#9604)
This reverts commit fef5d23dd5.
2025-03-27 19:14:26 -07:00
Chris Mancuso
fef5d23dd5
Support max_completion_tokens on Mistral (#9589)
* Support max_completion_tokens on Mistral

* test fix
2025-03-27 17:27:19 -07:00
Krish Dholakia
c0845fec1f
Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows gpt-4o and whisper audio transformation to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-audio-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
Ishaan Jaff
0aae9aa24a rename _is_model_gemini_spec_model 2025-03-26 14:28:26 -07:00
Ishaan Jaff
c38b41f65b test_get_supports_system_message 2025-03-26 14:26:08 -07:00
Ishaan Jaff
72f08bc6ea unit tests for VertexGeminiConfig 2025-03-26 14:21:35 -07:00
Nicholas Grabar
f68cc26f15 8864 Add support for anyOf union type while handling null fields 2025-03-25 22:37:28 -07:00
Krish Dholakia
92883560f0
fix vertex ai multimodal embedding translation (#9471)
* remove data:image/jpeg;base64, prefix from base64 image input

vertex_ai's multimodal embeddings endpoint expects a raw base64 string without `data:image/jpeg;base64,` prefix.

* Add Vertex Multimodal Embedding Test

* fix(test_vertex.py): add e2e tests on multimodal embeddings

* test: unit testing

* test: remove sklearn dep

* test: update test with fixed route

* test: fix test

---------

Co-authored-by: Jonarod <jonrodd@gmail.com>
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
2025-03-24 23:23:28 -07:00
Krish Dholakia
a619580bf8
Add vertexai topLogprobs support (#9518)
* Added support for top_logprobs in vertex gemini models

* Testing for top_logprobs feature in vertexai

* Update litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py

Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>

* refactor(tests/): refactor testing to be in correct repo

---------

Co-authored-by: Aditya Thaker <adityathaker28@gmail.com>
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
2025-03-24 22:42:38 -07:00
Krrish Dholakia
3ce3689282 test: migrate testing 2025-03-22 12:48:53 -07:00
Krrish Dholakia
81a1494a51 test: add unit testing 2025-03-21 10:35:36 -07:00
Ishaan Jaff
15048de5e2 test_prepare_fake_stream_request 2025-03-20 14:50:00 -07:00
Ishaan Jaff
247e4d09ee
Merge branch 'main' into litellm_fix_ssl_verify 2025-03-19 21:03:06 -07:00
Krrish Dholakia
9adad381b4 fix(common_utils.py): handle cris only model
Fixes https://github.com/BerriAI/litellm/issues/9161#issuecomment-2734905153
2025-03-18 23:35:43 -07:00
Ishaan Jaff
65083ca8da get_openai_client_cache_key 2025-03-18 18:35:50 -07:00
Ishaan Jaff
40418c7bd8 test_openai_client_reuse 2025-03-18 18:13:36 -07:00
Ishaan Jaff
d5150e000d test openai common utils 2025-03-18 18:08:24 -07:00
Ishaan Jaff
55ea2370ba Union[TranscriptionResponse, Coroutine[Any, Any, TranscriptionResponse]]: 2025-03-18 14:23:14 -07:00
Ishaan Jaff
6987a73e36 initialize_azure_sdk_client 2025-03-18 12:27:17 -07:00
Krrish Dholakia
00b338cb9c test: refactor testing to handle routing correctly 2025-03-18 12:24:12 -07:00
Krrish Dholakia
8e27b2026a fix(http_handler.py): support reading ssl security level from env var
Allows user to specify lower security settings
2025-03-17 15:48:31 -07:00
Krish Dholakia
d4caaae1be
Merge pull request #9274 from BerriAI/litellm_contributor_rebase_branch
Litellm contributor rebase branch
2025-03-14 21:57:49 -07:00
Krrish Dholakia
8a6e4715aa feat(converse_transformation.py): fix type for bedrock cache usage block 2025-03-13 19:33:22 -07:00
Krrish Dholakia
96bba9354e test(tests/litellm): add unit test for transform usage function 2025-03-13 15:56:55 -07:00
Krish Dholakia
2c011d9a93
Merge pull request #9123 from omrishiv/8911-fix-model-encoding
Fixes bedrock modelId encoding for Inference Profiles
2025-03-13 10:42:32 -07:00
sven
2d4e58cf23 add test code 2025-03-13 14:00:12 +09:00
Ishaan Jaff
d623183eee test_ensure_initialize_azure_sdk_client_always_used 2025-03-12 18:51:19 -07:00
Krish Dholakia
2d957a0ed9
Merge branch 'main' into litellm_dev_03_10_2025_p3 2025-03-12 14:56:01 -07:00
Ishaan Jaff
2460f3cbab test_validate_environment 2025-03-12 12:57:40 -07:00
Krrish Dholakia
23bf7b5700 fix(azure/completions): migrate completions endpoint to support base azure llm class
enables consistent auth logic across all azure calls
2025-03-11 18:47:30 -07:00
Krrish Dholakia
42af49cd87 fix: fix merge conflicts 2025-03-11 18:41:41 -07:00