Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 18:00:36 +00:00)
Implements usage accumulation in StreamingResponseOrchestrator.
The key change is passing `stream_options = { "include_usage": true }` to the
chat_completion call. This means all responses tests will have to be
re-recorded, because the request hash will change :)
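The accumulation logic can be sketched as follows. This is a minimal, self-contained sketch (not the repository's actual implementation), assuming OpenAI-style streaming where, with `stream_options={"include_usage": True}`, a final chunk carries the `usage` totals while earlier chunks carry only content deltas; the `Chunk` and `Usage` classes here are stand-ins for the real response types.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Usage:
    # Mirrors the shape of OpenAI usage objects (hypothetical stand-in).
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0


@dataclass
class Chunk:
    # A streaming chunk: text delta, plus usage on the final chunk only.
    delta: str
    usage: Optional[Usage] = None


def accumulate_usage(chunks: List[Chunk]) -> Tuple[str, Usage]:
    """Join text deltas and sum usage reported anywhere in the stream."""
    text_parts: List[str] = []
    total = Usage()
    for chunk in chunks:
        text_parts.append(chunk.delta)
        if chunk.usage is not None:
            total.prompt_tokens += chunk.usage.prompt_tokens
            total.completion_tokens += chunk.usage.completion_tokens
            total.total_tokens += chunk.usage.total_tokens
    return "".join(text_parts), total


# Simulated stream: usage arrives only on the trailing chunk, as it does
# when include_usage is requested.
stream = [Chunk("Hello"), Chunk(" world"), Chunk("", Usage(12, 2, 14))]
text, usage = accumulate_usage(stream)
print(text, usage.total_tokens)
```

Summing (rather than overwriting) the usage fields also handles orchestrators that make multiple inner chat-completion calls, e.g. across tool-call turns, where each call's stream contributes its own usage chunk.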
Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI