docs(stream.md): add streaming token usage info to docs

Closes https://github.com/BerriAI/litellm/issues/4904
This commit is contained in:
Krrish Dholakia 2024-07-26 10:51:17 -07:00
parent 9a6ed8cabb
commit b515d4f441


@@ -30,4 +30,48 @@ async def test_get_response():
response = asyncio.run(test_get_response())
print(response)
```
## Streaming Token Usage

Supported across all providers. Works the same as OpenAI.

`stream_options={"include_usage": True}`

If set, an additional chunk will be streamed before the `data: [DONE]` message. The `usage` field on this chunk shows the token usage statistics for the entire request, and the `choices` field will always be an empty array. All other chunks will also include a `usage` field, but with a null value.
### SDK
```python
from litellm import completion
import os

os.environ["OPENAI_API_KEY"] = ""

messages = [{"role": "user", "content": "Hey, how's it going?"}]

response = completion(
    model="gpt-3.5-turbo",
    messages=messages,
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in response:
    print(chunk['choices'][0]['delta'])
```
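To pull the request-wide totals out of the stream, check each chunk's `usage` field and keep the non-null one. A minimal sketch below uses hypothetical chunk dictionaries in the OpenAI streaming shape (the field names match the description above; the token counts are made up for illustration):

```python
# Hypothetical chunks in the OpenAI streaming shape: "usage" is null on every
# chunk except the extra final chunk, whose "choices" array is empty.
chunks = [
    {"choices": [{"delta": {"content": "Hello"}}], "usage": None},
    {"choices": [{"delta": {"content": "!"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11}},
]

text = ""
usage = None
for chunk in chunks:
    # Content deltas live in non-empty "choices"; the usage-only chunk has none.
    if chunk["choices"]:
        text += chunk["choices"][0]["delta"].get("content", "")
    # The single non-null "usage" covers the entire request.
    if chunk["usage"] is not None:
        usage = chunk["usage"]

print(text)   # assembled completion text
print(usage)  # token usage statistics for the whole request
```

The same loop shape works on the real `response` iterator above, since each streamed chunk exposes `choices` and `usage`.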
### PROXY

```bash
curl http://0.0.0.0:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true,
    "stream_options": {"include_usage": true}
  }'
```
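The proxy streams server-sent events, so each line arrives as `data: {json}` and the stream ends with `data: [DONE]`. A small sketch of parsing that wire format, using hypothetical SSE lines (the payloads and token counts are illustrative, not captured output):

```python
import json

# Hypothetical raw SSE lines as streamed with "stream_options": {"include_usage": true}.
sse_lines = [
    'data: {"choices": [{"delta": {"content": "Hi"}}], "usage": null}',
    'data: {"choices": [], "usage": {"prompt_tokens": 20, "completion_tokens": 1, "total_tokens": 21}}',
    "data: [DONE]",
]

usage = None
for line in sse_lines:
    payload = line[len("data: "):]
    if payload == "[DONE]":  # sentinel that terminates the stream
        break
    chunk = json.loads(payload)
    if chunk["usage"] is not None:
        usage = chunk["usage"]

print(usage["total_tokens"])
```

In a real client the same parsing would run over the HTTP response body line by line instead of a fixed list.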