mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-24 16:57:21 +00:00
Implements usage accumulation in StreamingResponseOrchestrator.
The most important part was passing `stream_options = { "include_usage":
true }` to the chat_completion call. Because this changes the request
hash, I will have to re-record all responses tests :)
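As a rough illustration of what usage accumulation can look like, here is a minimal sketch. It assumes that, with `include_usage` enabled, the stream carries a usage record on a final chunk and none on the others. `Usage`, `Chunk`, and `accumulate_usage` are hypothetical stand-ins for this example, not the actual StreamingResponseOrchestrator code.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Usage:
    # Hypothetical token counters, named after the chat completion usage fields.
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0


@dataclass
class Chunk:
    # Hypothetical stand-in for one streamed chat completion chunk.
    delta_text: str = ""
    usage: Optional[Usage] = None  # assumed to be set only on the final chunk


# Request option named in the commit message; without it no usage is streamed.
stream_options = {"include_usage": True}


def accumulate_usage(total: Usage, chunks: Iterable[Chunk]) -> str:
    """Consume one stream, add any reported usage into `total`, return the text."""
    parts = []
    for chunk in chunks:
        parts.append(chunk.delta_text)
        if chunk.usage is not None:
            total.prompt_tokens += chunk.usage.prompt_tokens
            total.completion_tokens += chunk.usage.completion_tokens
            total.total_tokens += chunk.usage.total_tokens
    return "".join(parts)
```

Passing the same `Usage` instance to every call lets a response that needs several chat_completion rounds (for example, around tool calls) report one combined total.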
Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
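
A usage assertion of the kind mentioned above might look like the following sketch. The helper name `assert_usage_is_consistent` is hypothetical, and the `input_tokens` / `output_tokens` / `total_tokens` keys are an assumption based on the OpenAI Responses usage shape; the real tests may check different fields.

```python
def assert_usage_is_consistent(usage: dict) -> None:
    """Check that a usage record is present, positive, and sums correctly."""
    assert usage is not None, "response should include usage"
    for key in ("input_tokens", "output_tokens", "total_tokens"):
        assert usage[key] > 0, f"{key} should be positive"
    # The total is expected to equal the sum of input and output tokens.
    assert usage["total_tokens"] == usage["input_tokens"] + usage["output_tokens"]
```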
| Name |
|---|
| .. |
| fixtures |
| recordings |
| __init__.py |
| helpers.py |
| streaming_assertions.py |
| test_basic_responses.py |
| test_conversation_responses.py |
| test_extra_body_shields.py |
| test_file_search.py |
| test_tool_responses.py |