Add bedrock latency optimized inference support (#9623)

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 03:04:13 +00:00

* fix(converse_transformation.py): add performanceConfig param support on bedrock

Closes https://github.com/BerriAI/litellm/issues/7606

* fix(converse_transformation.py): refactor to use more flexible single getter for params which are separate config blocks

* test(test_main.py): add e2e mock test for bedrock performance config

* build(model_prices_and_context_window.json): add versioned multimodal embedding

* refactor(multimodal_embeddings/): migrate to config pattern

* feat(vertex_ai/multimodalembeddings): calculate usage for multimodal embedding calls

Enables cost calculation for multimodal embeddings

* feat(vertex_ai/multimodalembeddings): get usage object for embedding calls

ensures accurate cost tracking for vertexai multimodal embedding calls

* fix(embedding_handler.py): remove unused imports

* fix: fix linting errors

* fix: handle response api usage calculation

* test(test_vertex_ai_multimodal_embedding_transformation.py): update tests

* test: mark flaky test

* feat(vertex_ai/multimodal_embeddings/transformation.py): support text+image+video input

* docs(vertex.md): document sending text + image to vertex multimodal embeddings

* test: remove incorrect file

* fix(multimodal_embeddings/transformation.py): fix linting error

* style: remove unused import

This commit is contained in:

Krish Dholakia

2025-03-29 00:23:09 -07:00

• committed by

GitHub

parent 0742e6afd6

commit 5ac61a7572

No known key found for this signature in database

GPG key ID: B5690EEEBB952194

19 changed files with 806 additions and 245 deletions

									
										1

litellm/main.py
									
										View file
										
				@ -3723,6 +3723,7 @@ def embedding(  # noqa: PLR0915

				                    encoding=encoding,

				                    logging_obj=logging,

				                    optional_params=optional_params,

				                    litellm_params=litellm_params_dict,

				                    model_response=EmbeddingResponse(),

				                    vertex_project=vertex_ai_project,

				                    vertex_location=vertex_ai_location,

Rows
Columns

Add bedrock latency optimized inference support (#9623)

1 litellm/main.py Unescape Escape View file

1

litellm/main.py

View file