# Batching Completion() Calls
LiteLLM allows you to:
* Send multiple completion calls to 1 model
* Send 1 completion call to N models
## Send multiple completion calls to 1 model
In the `batch_completion` method, you provide a list of `messages` where each sub-list is passed to its own `litellm.completion()` call, letting you process multiple prompts with a single function call.
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
### Example Code
```python
import litellm
import os
from litellm import batch_completion

os.environ['ANTHROPIC_API_KEY'] = ""

# each sub-list of messages becomes its own litellm.completion() call
responses = batch_completion(
    model="claude-2",
    messages=[
        [{"role": "user", "content": "good morning?"}],
        [{"role": "user", "content": "what's the time?"}]
    ]
)
```
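`batch_completion` returns one response per sub-list of messages, in the same order the prompts were passed. A minimal sketch of reading the replies back, assuming each response follows the OpenAI-style shape shown in the Output section below:
```python
# one response per prompt, in the same order as the input message lists
for i, response in enumerate(responses):
    print(f"prompt {i}: {response['choices'][0]['message']['content']}")
```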
## Send 1 completion call to N models
This sends the same request to the specified `models` in parallel and returns the first response received. Use this to reduce latency.
### Example Code
```python
import litellm
import os
from litellm import batch_completion_models
os.environ['ANTHROPIC_API_KEY'] = ""
os.environ['OPENAI_API_KEY'] = ""
os.environ['COHERE_API_KEY'] = ""

# returns the first response received from the listed models
response = batch_completion_models(
    models=["gpt-3.5-turbo", "claude-instant-1.2", "command-nightly"],
    messages=[{"role": "user", "content": "Hey, how's it going"}]
)
print(response)
```
### Output
Returns the first response received; in this run, `command-nightly` answered first.
```json
{
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " I'm doing well, thanks for asking! I'm an AI assistant created by Anthropic to be helpful, harmless, and honest.",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-23273eed-e351-41be-a492-bafcf5cf3274",
  "created": 1695154628.2076092,
  "model": "command-nightly",
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 14,
    "total_tokens": 20
  }
}
```
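Since the object follows the OpenAI response format, the winning model and its reply can be read directly. A minimal sketch, assuming the same dict-style access as above:
```python
# `model` identifies which of the N models answered first
print(response["model"])                             # "command-nightly"
print(response["choices"][0]["message"]["content"])  # the assistant's reply
```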