Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-12-14 09:42:38 +00:00
feat: Add max_output_tokens to Response API
Responses and Completions both support a max_output_tokens field, but it is currently missing from the create request and the response object in the Responses API. This PR adds it.

Fixes: #3562

Signed-off-by: Abhishek Bongale <abhishekbongale@outlook.com>
This commit is contained in:
parent
92219fd8fb
commit
bb58da22a1
14 changed files with 127 additions and 20 deletions
@@ -825,6 +825,7 @@ class Agents(Protocol):
                 "List of shields to apply during response generation. Shields provide safety and content moderation."
             ),
         ] = None,
+        max_output_tokens: int | None = None,
     ) -> OpenAIResponseObject | AsyncIterator[OpenAIResponseObjectStream]:
         """Create a new OpenAI response.
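For context, here is a minimal sketch of how a client could pass the new parameter once this change lands, assuming a llama-stack server exposing the OpenAI-compatible Responses endpoint; the base_url, api_key, and model name below are illustrative assumptions, not taken from this commit:

from openai import OpenAI

# Point the standard OpenAI client at a locally running llama-stack server.
# base_url and api_key are placeholder assumptions for illustration.
client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

response = client.responses.create(
    model="meta-llama/Llama-3.2-3B-Instruct",  # illustrative model id
    input="Summarize the Responses API in one paragraph.",
    # New in this commit: cap the number of tokens the model may generate.
    max_output_tokens=128,
)
print(response.output_text)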