Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-10-24 16:57:21 +00:00
Implement usage accumulation in StreamingResponseOrchestrator.

The most important part was to pass `stream_options = { "include_usage": true }` to the chat_completion call. This means I will have to re-record all responses tests, because the request hash will change :)

Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
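
The usage accumulation described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual StreamingResponseOrchestrator code: the `Usage` class, the `accumulate_usage` helper, and the sample chunks are hypothetical. It assumes the OpenAI-style behavior where, with `include_usage` enabled, content chunks carry `usage: null` and a final chunk with empty `choices` carries the token counts.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Usage:
    """Hypothetical running total of token usage across streamed calls."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0

    def add(self, usage: Optional[dict]) -> None:
        # Content chunks report no usage (None); skip them.
        if not usage:
            return
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)


def accumulate_usage(chunks: Iterable[dict], total: Usage) -> Usage:
    """Fold the usage reported by one streamed completion into `total`."""
    for chunk in chunks:
        total.add(chunk.get("usage"))
    return total


# Request parameters that ask the provider to report usage while streaming.
request_params = {"stream": True, "stream_options": {"include_usage": True}}

# Made-up chunks standing in for two streamed completions within one response
# (for example, one before and one after a tool call).
first_call = [
    {"choices": [{"delta": {"content": "Hel"}}], "usage": None},
    {"choices": [{"delta": {"content": "lo"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 10, "completion_tokens": 2, "total_tokens": 12}},
]
second_call = [
    {"choices": [{"delta": {"content": "Done"}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 15, "completion_tokens": 5, "total_tokens": 20}},
]

total = Usage()
accumulate_usage(first_call, total)
accumulate_usage(second_call, total)
print(total)
```

Summing across calls matters because a single response may involve more than one underlying chat completion, and each one reports its own usage in its final chunk.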
||
---|---|---|
fixtures | ||
recordings | ||
__init__.py | ||
helpers.py | ||
streaming_assertions.py | ||
test_basic_responses.py | ||
test_conversation_responses.py | ||
test_extra_body_shields.py | ||
test_file_search.py | ||
test_tool_responses.py |