# Streaming Responses & Async Completion

- [Streaming Responses](#streaming-responses)
- [Async Completion](#async-completion)

## Streaming Responses

LiteLLM supports streaming the model response back by passing `stream=True` as an argument to the `completion` function.

### Usage

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# stream=True returns the response as an iterator of chunks
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])
```
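
The chunks mirror the OpenAI streaming format, with the newly generated text arriving incrementally in `delta`. As a minimal sketch, the full reply can be accumulated from the stream like this (assuming dict-style chunks where `delta` may carry a `content` key, as in the example above):

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# Accumulate streamed text; assumes each delta may carry a 'content' key
full_reply = ""
for chunk in completion(model="gpt-3.5-turbo", messages=messages, stream=True):
    delta = chunk['choices'][0]['delta']
    full_reply += delta.get('content') or ""

print(full_reply)
```
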
## Async Completion

LiteLLM provides an asynchronous version of the `completion` function called `acompletion`, for use in async code.

### Usage

```python
from litellm import acompletion
import asyncio

async def test_get_response():
    user_message = "Hello, how are you?"
    messages = [{"content": user_message, "role": "user"}]
    # await the non-blocking completion call
    response = await acompletion(model="gpt-3.5-turbo", messages=messages)
    return response

response = asyncio.run(test_get_response())
print(response)
```
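
`acompletion` can also be combined with streaming. As a hedged sketch (the helper name `stream_response` is illustrative, and the chunk shape is assumed to match the streaming example above), passing `stream=True` makes the awaited call return an async iterator:

```python
from litellm import acompletion
import asyncio

async def stream_response():
    messages = [{"content": "Hello, how are you?", "role": "user"}]
    # stream=True: the awaited call yields an async iterator of chunks
    response = await acompletion(model="gpt-3.5-turbo", messages=messages, stream=True)
    async for chunk in response:
        print(chunk['choices'][0]['delta'])

asyncio.run(stream_response())
```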