Commit graph

2174 commits

Author SHA1 Message Date
Ishaan Jaff
01b16c36a0 Merge branch 'main' into litellm_anthropic_messages_improvements 2025-03-31 14:22:56 -07:00
Sam
190c31b73e fix: Anthropic prompt caching on GCP Vertex AI (#9605)
* fix: Anthropic prompt caching on GCP Vertex AI

* test(vertex): anthropic prompt caching
2025-03-29 23:40:34 -07:00
Krish Dholakia
70f993d3d7 Add gemini audio input support + handle special tokens in sagemaker response (#9640)
* fix(internal_user_endpoints.py): cleanup unused variables on beta endpoint

no team/org split on daily user endpoint

* build(model_prices_and_context_window.json): gemini-2.0-flash supports audio input

* feat(gemini/transformation.py): support passing audio input to gemini

* test: fix test

* fix(gemini/transformation.py): support audio input as a url

enables passing google cloud bucket urls

* fix(gemini/transformation.py): support explicitly passing format of file

* fix(gemini/transformation.py): expand support for inferred file types from url

* fix(sagemaker/completion/transformation.py): fix special token error when counting sagemaker tokens

* test: fix import
2025-03-29 19:23:09 -07:00
Krish Dholakia
d7b294dd0a build(pyproject.toml): add new dev dependencies - for type checking (#9631)
* build(pyproject.toml): add new dev dependencies - for type checking

* build: reformat files to fit black

* ci: reformat to fit black

* ci(test-litellm.yml): make tests run clear

* build(pyproject.toml): add ruff

* fix: fix ruff checks

* build(mypy/): fix mypy linting errors

* fix(hashicorp_secret_manager.py): fix passing cert for tls auth

* build(mypy/): resolve all mypy errors

* test: update test

* fix: fix black formatting

* build(pre-commit-config.yaml): use poetry run black

* fix(proxy_server.py): fix linting error

* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Krish Dholakia
308a2fb195 Add bedrock latency optimized inference support (#9623)
* fix(converse_transformation.py): add performanceConfig param support on bedrock

Closes https://github.com/BerriAI/litellm/issues/7606

* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks

* test(test_main.py): add e2e mock test for bedrock performance config

* build(model_prices_and_context_window.json): add versioned multimodal embedding

* refactor(multimodal_embeddings/): migrate to config pattern

* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls

Enables cost calculation for multimodal embeddings

* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls

ensures accurate cost tracking for vertexai multimodal embedding calls

* fix(embedding_handler.py): remove unused imports

* fix: fix linting errors

* fix: handle response api usage calculation

* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests

* test: mark flaky test

* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input

* docs(vertex.md): document sending text + image to vertex multimodal embeddings

* test: remove incorrect file

* fix(multimodal_embeddings/transformation.py): fix linting error

* style: remove unused import
2025-03-29 00:23:09 -07:00
Krish Dholakia
0d1951f019 fix(openrouter/chat/transformation.py): raise informative message for openrouter key error (#9626)
Related Issue: https://github.com/Aider-AI/aider/issues/3550#issuecomment-2763052355
2025-03-28 20:24:28 -07:00
NickGrab
70cdc9fc50 Merge pull request #9625 from BerriAI/litellm_mar_28_vertex_fix
Add support to Vertex AI transformation for anyOf union type with null fields
2025-03-28 16:09:29 -07:00
Krish Dholakia
5f8859eda8 Fix anthropic thinking + response_format (#9594)
* fix(anthropic/chat/transformation.py): Don't set tool choice on response_format conversion when thinking is enabled

Not allowed by Anthropic

Fixes https://github.com/BerriAI/litellm/issues/8901

* refactor: move test to base anthropic chat tests

ensures consistent behaviour across vertex/anthropic/bedrock

* fix(anthropic/chat/transformation.py): if thinking token is specified and max tokens is not - ensure max token to anthropic is higher than thinking tokens

* feat(converse_transformation.py): correctly handle thinking + response format on Bedrock Converse

Fixes https://github.com/BerriAI/litellm/issues/8901

* fix(converse_transformation.py): correctly handle adding max tokens

* test: handle service unavailable error
2025-03-28 15:57:40 -07:00
Nicholas Grabar
36458f5eb5 Add recursion depth to convert_anyof_null_to_nullable, constants.py. Fix recursive_detector.py raise error state 2025-03-28 13:11:19 -07:00
Krish Dholakia
8a5b0b6218 fix(proxy_server.py): get master key from environment, if not set in … (#9617)
* fix(proxy_server.py): get master key from environment, if not set in general settings or general settings not set at all

* test: mark flaky test

* test(test_proxy_server.py): mock prisma client

* ci: add new github workflow for testing just the mock tests

* fix: fix linting error

* ci(conftest.py): add conftest.py to isolate proxy tests

* build(pyproject.toml): add respx to dev dependencies

* build(pyproject.toml): add prisma to dev dependencies

* test: fix mock prompt management tests to use a mock anthropic key

* ci(test-litellm.yml): parallelize mock testing

make it run faster

* build(pyproject.toml): add hypercorn as dev dep

* build(pyproject.toml): separate proxy vs. core dev dependencies

make it easier for non-proxy contributors to run tests locally - e.g. no need to install hypercorn

* ci(test-litellm.yml): pin python version

* test(test_rerank.py): move test - cannot be mocked, requires aws credentials for e2e testing

* ci: add thank you message to ci

* test: add mock env var to test

* test: add autouse to tests

* test: test mock env vars for e2e tests
2025-03-28 12:32:04 -07:00
Ishaan Jaff
91564056a5 fix code quality check 2025-03-28 10:32:39 -07:00
NickGrab
0e464539b1 Merge branch 'main' into litellm_8864-feature-vertex-anyOf-support 2025-03-28 10:25:04 -07:00
Krrish Dholakia
9a44e77046 fix: fix python38 linting error 2025-03-28 09:38:32 -07:00
Krrish Dholakia
d21631d7f4 fix(common_utils.py): fix linting error 2025-03-27 23:31:58 -07:00
Krish Dholakia
3543b2a808 Support discovering gemini, anthropic, xai models by calling their /v1/model endpoint (#9530)
* fix: initial commit for adding provider model discovery to gemini

* feat(gemini/): add model discovery for gemini/ route

* docs(set_keys.md): update docs to show you can check available gemini models as well

* feat(anthropic/): add model discovery for anthropic api key

* feat(xai/): add model discovery for XAI

enables checking what models an xai key can call

* ci: bump ci config yml

* fix(topaz/common_utils.py): fix linting error

* fix: fix linting error for python38
2025-03-27 22:50:48 -07:00
Krish Dholakia
ff4419e5ee fix(mistral_chat_transformation.py): add missing comma (#9606) 2025-03-27 22:16:21 -07:00
Krish Dholakia
fe3a97eeaa Revert "Support max_completion_tokens on Mistral (#9589)" (#9604)
This reverts commit fef5d23dd5.
2025-03-27 19:14:26 -07:00
Chris Mancuso
3520ae318f Support max_completion_tokens on Mistral (#9589)
* Support max_completion_tokens on Mistral

* test fix
2025-03-27 17:27:19 -07:00
Krish Dholakia
d58fe5a9f9 Add OpenAI gpt-4o-transcribe support (#9517)
* refactor: introduce new transformation config for gpt-4o-transcribe models

* refactor: expose new transformation configs for audio transcription

* ci: fix config yml

* feat(openai/transcriptions): support provider config transformation on openai audio transcriptions

allows gpt-4o and whisper audio transformation to work as expected

* refactor: migrate fireworks ai + deepgram to new transform request pattern

* feat(openai/): working support for gpt-4o-audio-transcribe

* build(model_prices_and_context_window.json): add gpt-4o-transcribe to model cost map

* build(model_prices_and_context_window.json): specify what endpoints are supported for `/audio/transcriptions`

* fix(get_supported_openai_params.py): fix return

* refactor(deepgram/): migrate unit test to deepgram handler

* refactor: cleanup unused imports

* fix(get_supported_openai_params.py): fix linting error

* test: update test
2025-03-26 23:10:25 -07:00
Ishaan Jaff
cb8bb12953 fixes - anthropic messages interface 2025-03-26 17:45:47 -07:00
Ishaan Jaff
c25f61b7ca working anthropic API tests 2025-03-26 17:34:41 -07:00
Krish Dholakia
9c083e7d2c Support Gemini audio token cost tracking + fix openai audio input token cost tracking (#9535)
* fix(vertex_and_google_ai_studio_gemini.py): log gemini audio tokens in usage object

enables accurate cost tracking

* refactor(vertex_ai/cost_calculator.py): refactor 128k+ token cost calculation to only run if model info has it

Google has moved away from this for gemini-2.0 models

* refactor(vertex_ai/cost_calculator.py): migrate to usage object for more flexible data passthrough

* fix(llm_cost_calc/utils.py): support audio token cost tracking in generic cost per token

enables vertex ai cost tracking to work with audio tokens

* fix(llm_cost_calc/utils.py): default to total prompt tokens if text tokens field not set

* refactor(llm_cost_calc/utils.py): move openai cost tracking to generic cost per token

more consistent behaviour across providers

* test: add unit test for gemini audio token cost calculation

* ci: bump ci config

* test: fix test
2025-03-26 17:26:25 -07:00
Ishaan Jaff
6bae7c8889 fix anthropic_messages 2025-03-26 17:21:14 -07:00
Ishaan Jaff
c030c88b50 fix anthropic_messages implementation 2025-03-26 17:12:40 -07:00
Ishaan Jaff
3b93b0da15 define types for response form AnthropicMessagesResponse 2025-03-26 16:54:45 -07:00
Ishaan Jaff
147548f151 Merge pull request #9542 from BerriAI/litellm_fix_vertex_ai_ft_models
[Feature]: Support for Fine-Tuned Vertex AI LLMs
2025-03-26 16:19:41 -07:00
Ishaan Jaff
f77fcbcad3 rename _is_model_gemini_spec_model 2025-03-26 14:28:26 -07:00
Ishaan Jaff
826deea6fb test_gemini_fine_tuned_model_request_consistency 2025-03-26 14:18:11 -07:00
Ishaan Jaff
e291c10b4f _get_model_name_from_gemini_spec_model 2025-03-26 12:16:18 -07:00
Krish Dholakia
7873080223 Nova Canvas complete image generation tasks (#9177) (#9525)
* Nova Canvas complete image generation tasks (#9177)

* add initial support for Amazon Nova Canvas model

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* adjust name to AmazonNovaCanvas and map function variables to config

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* tighten model name check

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* fix quality mapping

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add premium quality in config

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* support all Amazon Nova Canvas tasks

* remove unused import

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add tests for image generation tasks and fix payload

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add missing util file

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* update model prices backup file

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* remove image tasks other than text->image

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add color guided generation task for Nova Canvas

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* fix merge

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add nova canvas image generation documentation

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

* add nova canvas unit tests

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

---------

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>

* ci(config.yml): bump ci config

* test: fix test

---------

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
Co-authored-by: omrishiv <327609+omrishiv@users.noreply.github.com>
2025-03-26 11:28:20 -07:00
Ishaan Jaff
654536b561 rename _is_model_gemini_spec_model 2025-03-26 11:14:51 -07:00
Ishaan Jaff
0767a3cc88 Merge branch 'main' into litellm_fix_vertex_ai_ft_models 2025-03-26 11:11:54 -07:00
Ishaan Jaff
166ec84c01 undo code changes 2025-03-26 10:57:08 -07:00
Ishaan Jaff
172c9d5689 _is_model_gemini_gemini_spec_model 2025-03-26 10:53:23 -07:00
Ishaan Jaff
59a773920e _transform_request_body 2025-03-26 00:05:45 -07:00
Krish Dholakia
1ab5d49132 Support litellm.api_base for vertex_ai + gemini/ across completion, embedding, image_generation (#9516)
* test(tests): add unit testing for litellm_proxy integration

* fix(cost_calculator.py): fix tracking cost in sdk when calling proxy

* fix(main.py): respect litellm.api_base on `vertex_ai/` and `gemini/` routes

* fix(main.py): consistently support custom api base across gemini + vertexai on embedding + completion

* feat(vertex_ai/): test

* fix: fix linting error

* test: set api base as None before starting loadtest
2025-03-25 23:46:20 -07:00
Nicholas Grabar
1d2df77a9d 8864 Add support for anyOf union type while handling null fields 2025-03-25 22:37:28 -07:00
Krish Dholakia
6a40d50bc5 fix vertex ai multimodal embedding translation (#9471)
* remove data:image/jpeg;base64, prefix from base64 image input

vertex_ai's multimodal embeddings endpoint expects a raw base64 string without `data:image/jpeg;base64,` prefix.

* Add Vertex Multimodal Embedding Test

* fix(test_vertex.py): add e2e tests on multimodal embeddings

* test: unit testing

* test: remove sklearn dep

* test: update test with fixed route

* test: fix test

---------

Co-authored-by: Jonarod <jonrodd@gmail.com>
Co-authored-by: Emerson Gomes <emerson.gomes@thalesgroup.com>
2025-03-24 23:23:28 -07:00
Krish Dholakia
71848566c5 Add vertexai topLogprobs support (#9518)
* Added support for top_logprobs in vertex gemini models

* Testing for top_logprobs feature in vertexai

* Update litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py

Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>

* refactor(tests/): refactor testing to be in correct repo

---------

Co-authored-by: Aditya Thaker <adityathaker28@gmail.com>
Co-authored-by: Tom Matthews <tomukmatthews@gmail.com>
2025-03-24 22:42:38 -07:00
Krrish Dholakia
017a9b882a fix(invoke_handler.py): remove hard code 2025-03-24 17:58:26 -07:00
Krrish Dholakia
851c993063 fix(vertex_ai/common_utils.py): fix handling constructed url with default vertex config 2025-03-22 11:32:01 -07:00
Krrish Dholakia
e99a22b9ac refactor(llm_passthrough_endpoints.py): refactor vertex passthrough to use common llm passthrough handler.py 2025-03-22 10:42:46 -07:00
Krrish Dholakia
b6b6d66882 test: mock sagemaker tests 2025-03-21 16:21:18 -07:00
Krrish Dholakia
f6981606f7 fix: fix linting error 2025-03-21 12:20:21 -07:00
Krrish Dholakia
b3b93dbd14 fix(anthropic/chat/transformation.py): correctly update response_format to tool call transformation
Fixes https://github.com/BerriAI/litellm/issues/9411
2025-03-21 10:20:21 -07:00
Ishaan Jaff
6feb83eb51 Merge pull request #9419 from BerriAI/litellm_streaming_o1_pro
[Feat] OpenAI o1-pro Responses API streaming support
2025-03-20 21:54:43 -07:00
Krish Dholakia
bc03378fef Merge pull request #9260 from Grizzly-jobs/fix/voyage-ai-token-usage-tracking
fix: VoyageAI `prompt_token` always empty
2025-03-20 14:00:51 -07:00
Ishaan Jaff
3088204ac2 fix code quality checks 2025-03-20 13:57:35 -07:00
Krish Dholakia
f5f92bf6ae Merge pull request #9366 from JamesGuthrie/jg/vertex-output-dimensionality
fix: VertexAI outputDimensionality configuration
2025-03-20 13:55:33 -07:00
Ishaan Jaff
6d4cf6581d MockResponsesAPIStreamingIterator 2025-03-20 12:30:09 -07:00