feat(responses): implement usage tracking in streaming responses (#3771)

Implements usage accumulation in StreamingResponseOrchestrator.

The most important part was passing `stream_options = { "include_usage":
true }` to the chat_completion call. This means I will have to re-record
all responses tests, because the request hash will change :)
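
A minimal sketch of the accumulation idea, not the actual orchestrator code. It assumes that, with `include_usage` enabled, text chunks arrive with no usage and a final chunk carries the token counts. `Usage`, `Chunk`, and `accumulate_usage` are hypothetical stand-ins, and summing across chunks is an assumption that would also cover a response spanning several chat_completion calls.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Usage:
    # Hypothetical stand-in for the usage payload on a streamed chunk.
    prompt_tokens: int = 0
    completion_tokens: int = 0
    total_tokens: int = 0


@dataclass
class Chunk:
    # Hypothetical stand-in for one streamed chat completion chunk.
    text: str = ""
    usage: Optional[Usage] = None


def accumulate_usage(chunks: Iterable[Chunk]) -> Tuple[str, Usage]:
    """Join streamed text and sum usage from every chunk that reports it."""
    total = Usage()
    parts = []
    for chunk in chunks:
        parts.append(chunk.text)
        if chunk.usage is not None:
            total.prompt_tokens += chunk.usage.prompt_tokens
            total.completion_tokens += chunk.usage.completion_tokens
            total.total_tokens += chunk.usage.total_tokens
    return "".join(parts), total


# Simulated stream: two text chunks, then a usage-only final chunk.
stream = [
    Chunk(text="Hel"),
    Chunk(text="lo"),
    Chunk(usage=Usage(prompt_tokens=12, completion_tokens=2, total_tokens=14)),
]
text, usage = accumulate_usage(stream)
```

Without the `stream_options` flag, no chunk would carry usage and the accumulated totals would stay at zero.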

Test changes:
- Add usage assertions to streaming and non-streaming tests
- Update test recordings with actual usage data from OpenAI
Ashwin Bharambe 2025-10-10 12:27:03 -07:00 committed by GitHub
parent e7d21e1ee3
commit 1394403360
21 changed files with 15099 additions and 612 deletions


@@ -167,6 +167,9 @@ async def test_create_openai_response_with_string_input(openai_responses_impl, m
tools=None,
stream=True,
temperature=0.1,
stream_options={
"include_usage": True,
},
)
# Should have content part events for text streaming