mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-05 02:17:31 +00:00
Implementats usage accumulation to StreamingResponseOrchestrator.
The most important part was to pass `stream_options = { "include_usage":
true }` to the chat_completion call. This means I will have to record
all responses tests again because request hash will change :)
Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
|
||
|---|---|---|
| .. | ||
| apis | ||
| cli | ||
| core | ||
| distributions | ||
| models | ||
| providers | ||
| strong_typing | ||
| testing | ||
| ui | ||
| __init__.py | ||
| env.py | ||
| log.py | ||
| schema_utils.py | ||