llama-stack-mirror/llama_stack/apis/agents
Ashwin Bharambe · c92a1c99f0
feat(responses): add usage types to inference and responses APIs
Add OpenAI-compatible usage tracking types:
- OpenAIChatCompletionUsage with prompt/completion token counts
- OpenAIResponseUsage with input/output token counts
- Token detail types for cached_tokens and reasoning_tokens
- Add usage field to chat completion and response objects

This enables reporting token consumption for both streaming and
non-streaming responses, matching OpenAI's usage reporting format.
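The types named in the commit message can be sketched as plain Python dataclasses. This is an illustrative sketch only: the actual llama-stack definitions (in `openai_responses.py` and the inference API) are Pydantic models and may differ in field and class names; the field names below follow OpenAI's public usage-reporting format, and the `*_details` wrapper names are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OpenAIChatCompletionUsage:
    """Chat-completion usage: prompt/completion token counts (sketch)."""
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


@dataclass
class InputTokensDetails:
    """Hypothetical detail type for cached prompt tokens."""
    cached_tokens: int = 0


@dataclass
class OutputTokensDetails:
    """Hypothetical detail type for reasoning tokens."""
    reasoning_tokens: int = 0


@dataclass
class OpenAIResponseUsage:
    """Responses-API usage: input/output token counts plus optional details (sketch)."""
    input_tokens: int
    output_tokens: int
    total_tokens: int
    input_tokens_details: Optional[InputTokensDetails] = None
    output_tokens_details: Optional[OutputTokensDetails] = None


# Example: a response that consumed 12 input and 34 output tokens,
# 8 of which were reasoning tokens.
usage = OpenAIResponseUsage(
    input_tokens=12,
    output_tokens=34,
    total_tokens=46,
    output_tokens_details=OutputTokensDetails(reasoning_tokens=8),
)
```

In both streaming and non-streaming modes the `usage` field would be attached to the chat completion or response object, so clients read token consumption the same way they do with OpenAI.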

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-09 21:12:29 -07:00
__init__.py chore: remove nested imports (#2515) 2025-06-26 08:01:05 +05:30
agents.py docs: API docstrings cleanup for better documentation rendering (#3661) 2025-10-06 10:46:33 -07:00
openai_responses.py feat(responses): add usage types to inference and responses APIs 2025-10-09 21:12:29 -07:00