# Batching Completion() Calls
LiteLLM allows you to:
* Send multiple completion calls to 1 model
* Send 1 completion call to N models
## Send multiple completion calls to 1 model
In the batch_completion method, you provide a list of `messages` where each sub-list of messages is passed to `litellm.completion()`, allowing you to process multiple prompts efficiently in a single function call.
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
### Example Code
```python
import litellm
import os
from litellm import batch_completion

os.environ['ANTHROPIC_API_KEY'] = ""  # placeholder; set your provider API key

# example: two prompts sent to one model; each sub-list is its own completion call
responses = batch_completion(
    model="claude-2",
    messages=[
        [
            {"role": "user", "content": "good morning? "}
        ],
        [
            {"role": "user", "content": "what's the time? "}
        ]
    ]
)
```
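`batch_completion` returns the results as a list, one entry per message sub-list, in input order. A minimal sketch of reading them back, assuming the OpenAI-style response shape shown under Output below:

```python
# print the assistant reply from each completion, in input order
for response in responses:
    print(response["choices"][0]["message"]["content"])
```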
## Send 1 completion call to N models
This makes parallel calls to the specified `models` and returns the first response.

Use this to reduce latency.
### Example Code
```python
import litellm
import os
from litellm import batch_completion_models

os.environ['ANTHROPIC_API_KEY'] = ""
os.environ['OPENAI_API_KEY'] = ""
os.environ['COHERE_API_KEY'] = ""

# race the same prompt across N models; the first response to arrive wins
response = batch_completion_models(
    models=["gpt-3.5-turbo", "claude-instant-1.2", "command-nightly"],
    messages=[{"role": "user", "content": "Hey, how's it going"}]
)
print(response)
```
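Because the fastest model wins, the answering provider can vary between runs. A small sketch, assuming the response shape shown under Output below, for checking which model responded first:

```python
# `model` identifies which of the N models answered first
print(response["model"])  # e.g. "command-nightly"
```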
### Output
Returns the first response.
```json
{
  "object": "chat.completion",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": " I'm doing well, thanks for asking! I'm an AI assistant created by Anthropic to be helpful, harmless, and honest.",
        "role": "assistant",
        "logprobs": null
      }
    }
  ],
  "id": "chatcmpl-23273eed-e351-41be-a492-bafcf5cf3274",
  "created": 1695154628.2076092,
  "model": "command-nightly",
  "usage": {
    "prompt_tokens": 6,
    "completion_tokens": 14,
    "total_tokens": 20
  }
}
```
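The `model` field shows which of the requested models won the race (here `command-nightly`), and `usage` reports token counts for that single winning call.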