Mirror of https://github.com/meta-llama/llama-stack.git
Implements usage accumulation in `StreamingResponseOrchestrator`. The most important part was to pass `stream_options = { "include_usage": true }` to the `chat_completion` call. This means I will have to re-record all responses tests, because the request hash will change :)

Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
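For context, here is a minimal sketch of the accumulation pattern this describes, written against the plain OpenAI Python client rather than the actual `StreamingResponseOrchestrator` code. The `UsageTotals` dataclass, the `stream_with_usage` helper, and the model name are illustrative assumptions; only `stream_options={"include_usage": True}` and the usage-bearing final chunk are part of the real OpenAI-compatible streaming API.

```python
# Hypothetical sketch: accumulate token usage from a streaming chat completion.
# Not the llama-stack implementation; names below are illustrative.
import asyncio
from dataclasses import dataclass

from openai import AsyncOpenAI


@dataclass
class UsageTotals:
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0

    def add(self, usage) -> None:
        # Accumulate rather than overwrite, so several chat_completion calls
        # feeding one response can be summed into a single usage total.
        self.prompt_tokens += usage.prompt_tokens
        self.completion_tokens += usage.completion_tokens
        self.total_tokens += usage.total_tokens


async def stream_with_usage(client: AsyncOpenAI, messages: list[dict]) -> UsageTotals:
    totals = UsageTotals()
    stream = await client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=messages,
        stream=True,
        # The key change described above: ask the server to report usage.
        stream_options={"include_usage": True},
    )
    async for chunk in stream:
        for choice in chunk.choices:
            if choice.delta.content:
                print(choice.delta.content, end="", flush=True)
        # With include_usage, the final chunk carries a `usage` field
        # (and an empty `choices` list); all other chunks have usage=None.
        if chunk.usage is not None:
            totals.add(chunk.usage)
    return totals


if __name__ == "__main__":
    totals = asyncio.run(
        stream_with_usage(AsyncOpenAI(), [{"role": "user", "content": "Hello!"}])
    )
    print(f"\nusage: {totals}")
```

Because the extra `stream_options` parameter is part of the request body, any recorded request hashes change, which is why the test recordings have to be regenerated.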