Commit graph

26 commits

Author SHA1 Message Date
Krish Dholakia
a7db0df043
Gemini-2.5-flash improvements (#10198)
* fix(vertex_and_google_ai_studio_gemini.py): allow thinking budget = 0

Fixes https://github.com/BerriAI/litellm/issues/10121

* fix(vertex_and_google_ai_studio_gemini.py): handle nuance in counting exclusive vs. inclusive tokens

Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2052272035
2025-04-21 22:48:00 -07:00
Krish Dholakia
03b5399f86
test(utils.py): handle scenario where text tokens + reasoning tokens … (#10165)
* test(utils.py): handle scenario where text tokens + reasoning tokens set, but reasoning tokens not charged separately

Addresses https://github.com/BerriAI/litellm/pull/10141#discussion_r2051555332

* fix(vertex_and_google_ai_studio.py): only set content if non-empty str
2025-04-19 12:32:38 -07:00
Krish Dholakia
36308a31be
Gemini-2.5-flash - support reasoning cost calc + return reasoning content (#10141)
* build(model_prices_and_context_window.json): add vertex ai gemini-2.5-flash pricing

* build(model_prices_and_context_window.json): add gemini reasoning token pricing

* fix(vertex_and_google_ai_studio_gemini.py): support counting thinking tokens for gemini

allows accurate cost calc

* fix(utils.py): add reasoning token cost calc to generic cost calc

ensures gemini-2.5-flash cost calculation is accurate

* build(model_prices_and_context_window.json): mark gemini-2.5-flash as 'supports_reasoning'

* feat(gemini/): support 'thinking' + 'reasoning_effort' params + new unit tests

allow controlling thinking effort for gemini-2.5-flash models

* test: update unit testing

* feat(vertex_and_google_ai_studio_gemini.py): return reasoning content if given in gemini response

* test: update model name

* fix: fix ruff check

* test(test_spend_management_endpoints.py): update tests to be less sensitive to new keys / updates to usage object

* fix(vertex_and_google_ai_studio_gemini.py): fix translation
2025-04-19 09:20:52 -07:00
Krish Dholakia
fdfa1108a6
Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls (#10040)
* fix #9783: Retain schema field ordering for google gemini and vertex (#9828)

* test: update test

* refactor(groq.py): initial commit migrating groq to base_llm_http_handler

* fix(streaming_chunk_builder_utils.py): fix how tool content is combined

Fixes https://github.com/BerriAI/litellm/issues/10034

* fix(vertex_ai/common_utils.py): prevent infinite loop in helper function

* fix(groq/chat/transformation.py): handle groq streaming errors correctly

* fix(groq/chat/transformation.py): handle max_retries

---------

Co-authored-by: Adrian Lyjak <adrian@chatmeter.com>
2025-04-15 22:29:25 -07:00
Krish Dholakia
d3e7a137ad
Revert "fix #9783: Retain schema field ordering for google gemini and vertex …" (#10038)
This reverts commit e3729f9855.
2025-04-15 19:21:33 -07:00
Adrian Lyjak
e3729f9855
fix #9783: Retain schema field ordering for google gemini and vertex (#9828) 2025-04-15 19:12:02 -07:00
Krish Dholakia
6b5f093087
Revert "Fix case where only system messages are passed to Gemini (#9992)" (#10027)
This reverts commit 2afd922f8c.
2025-04-15 13:34:03 -07:00
Nolan Tremelling
2afd922f8c
Fix case where only system messages are passed to Gemini (#9992) 2025-04-15 13:30:49 -07:00
Krish Dholakia
78879c68a9
Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking (#9893)
* test: initial commit fixing gemini logprobs

Fixes https://github.com/BerriAI/litellm/issues/9888

* fix(vertex_and_google_ai_studio.py): Revert avglogprobs change

Fixes https://github.com/BerriAI/litellm/issues/8890

* build(model_prices_and_context_window.json): add gpt-4o-realtime-preview cost to model cost map

Fixes https://github.com/BerriAI/litellm/issues/9814

* test: add cost calculation unit testing

* test: fix test

* test: update test
2025-04-10 21:23:55 -07:00
Krish Dholakia
e1f7bcb47d
Fix VertexAI Credential Caching issue (#9756)
* refactor(vertex_llm_base.py): Prevent credential misrouting for projects

Fixes https://github.com/BerriAI/litellm/issues/7904

* fix: passing unit tests

* fix(vertex_llm_base.py): common auth logic across sync + async vertex ai calls

prevents credential caching issue across both flows

* test: fix test

* fix(vertex_llm_base.py): handle project id in default cause

* fix(factory.py): don't pass cache control if not set

bedrock invoke does not support this

* test: fix test

* fix(vertex_llm_base.py): add .exception message in load_auth

* fix: fix ruff error
2025-04-04 16:38:08 -07:00
sajda
4a4328b5bb
fix:Gemini Flash 2.0 implementation is not returning the logprobs (#9713)
* fix:Gemini Flash 2.0 implementation is not returning the logprobs

* fix: linting error by adding a helper method called _process_candidates
2025-04-03 11:53:41 -07:00
Krish Dholakia
46b3dbde8f
Revert "fix: Anthropic prompt caching on GCP Vertex AI (#9605)" (#9670)
This reverts commit a8673246dc.
2025-03-31 17:13:55 -07:00
Sam
a8673246dc
fix: Anthropic prompt caching on GCP Vertex AI (#9605)
* fix: Anthropic prompt caching on GCP Vertex AI

* test(vertex): anthropic prompt caching
2025-03-29 23:40:34 -07:00
Krish Dholakia
5ac61a7572
Add bedrock latency optimized inference support (#9623)
* fix(converse_transformation.py): add performanceConfig param support on bedrock

Closes https://github.com/BerriAI/litellm/issues/7606

* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks

* test(test_main.py): add e2e mock test for bedrock performance config

* build(model_prices_and_context_window.json): add versioned multimodal embedding

* refactor(multimodal_embeddings/): migrate to config pattern

* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls

Enables cost calculation for multimodal embeddings

* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls

ensures accurate cost tracking for vertexai multimodal embedding calls

* fix(embedding_handler.py): remove unused imports

* fix: fix linting errors

* fix: handle response api usage calculation

* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests

* test: mark flaky test

* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input

* docs(vertex.md): document sending text + image to vertex multimodal embeddings

* test: remove incorrect file

* fix(multimodal_embeddings/transformation.py): fix linting error

* style: remove unused import
2025-03-29 00:23:09 -07:00
Nicholas Grabar
09daeac188 Rebasing 2 2025-03-28 15:18:09 -07:00
Nicholas Grabar
06a45706b2 Rebase 3 2025-03-28 15:18:05 -07:00
Nicholas Grabar
1f2bbda11d Add recursion depth to convert_anyof_null_to_nullable, constants.py. Fix recursive_detector.py raise error state 2025-03-28 13:11:19 -07:00
NickGrab
b72fbdde74
Merge branch 'main' into litellm_8864-feature-vertex-anyOf-support 2025-03-28 10:25:04 -07:00
Ishaan Jaff
0aae9aa24a rename _is_model_gemini_spec_model 2025-03-26 14:28:26 -07:00
Ishaan Jaff
c38b41f65b test_get_supports_system_message 2025-03-26 14:26:08 -07:00
Ishaan Jaff
72f08bc6ea unit tests for VertexGeminiConfig 2025-03-26 14:21:35 -07:00
Nicholas Grabar
f68cc26f15 8864 Add support for anyOf union type while handling null fields 2025-03-25 22:37:28 -07:00
Krish Dholakia
92883560f0
fix vertex ai multimodal embedding translation (#9471)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 20s
Helm unit test / unit-test (push) Successful in 24s
* remove data:image/jpeg;base64, prefix from base64 image input

vertex_ai's multimodal embeddings endpoint expects a raw base64 string without `data:image/jpeg;base64,` prefix.

* Add Vertex Multimodal Embedding Test

* fix(test_vertex.py): add e2e tests on multimodal embeddings

* test: unit testing

* test: remove sklearn dep

* test: update test with fixed route

* test: fix test

---------

Co-authored-by: Jonarod <jonrodd@gmail.com>
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
2025-03-24 23:23:28 -07:00
Krish Dholakia
a619580bf8
Add vertexai topLogprobs support (#9518)
* Added support for top_logprobs in vertex gemini models

* Testing for top_logprobs feature in vertexai

* Update litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py

Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>

* refactor(tests/): refactor testing to be in correct repo

---------

Co-authored-by: Aditya Thaker <adityathaker28@gmail.com>
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
2025-03-24 22:42:38 -07:00
Krrish Dholakia
3ce3689282 test: migrate testing 2025-03-22 12:48:53 -07:00
sven
2d4e58cf23 add test code 2025-03-13 14:00:12 +09:00