Implementation of `litellm.batch_completion`, `litellm.batch_completion_models`, and `litellm.batch_completion_models_all_responses`.

Doc: https://docs.litellm.ai/docs/completion/batching

The LiteLLM Python SDK provides three batching helpers (a usage sketch follows this list):

  1. `litellm.batch_completion`: send a batch of `litellm.completion` requests to a single model.
  2. `litellm.batch_completion_models`: send a request to multiple language models concurrently and return the first response received.
  3. `litellm.batch_completion_models_all_responses`: send a request to multiple language models concurrently and return a list of responses from all models that respond.
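
A minimal usage sketch of all three helpers, following the linked docs. The model names are illustrative only and assume the matching provider API keys (e.g. `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`) are set in the environment:

```python
import litellm

# 1. batch_completion: one model, many message lists.
#    Returns a list of responses, one per conversation, in order.
responses = litellm.batch_completion(
    model="gpt-3.5-turbo",  # example model
    messages=[
        [{"role": "user", "content": "What is the capital of France?"}],
        [{"role": "user", "content": "What is the capital of Japan?"}],
    ],
)
print([r.choices[0].message.content for r in responses])

# 2. batch_completion_models: fan the same request out to several models
#    and return the first response that comes back.
fastest = litellm.batch_completion_models(
    models=["gpt-3.5-turbo", "claude-3-haiku-20240307"],  # example models
    messages=[{"role": "user", "content": "Hey, how's it going?"}],
)
print(fastest.choices[0].message.content)

# 3. batch_completion_models_all_responses: fan the request out to several
#    models and wait for all of them, returning the responses as a list.
all_responses = litellm.batch_completion_models_all_responses(
    models=["gpt-3.5-turbo", "claude-3-haiku-20240307"],
    messages=[{"role": "user", "content": "Hey, how's it going?"}],
)
for r in all_responses:
    print(r.choices[0].message.content)
```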