fix: responses <> chat completion input conversion (#3645)

# What does this PR do?

closes #3268
closes #3498

When resuming from a previous response ID, we currently attempt to convert the stored responses input back to chat completion messages. This is not always possible, e.g. for tool calls, where some data is lost once a chat completion message is converted to the responses input format.

This PR stores the chat completion messages that correspond to the _last_ call to chat completion. These are sufficient to resume from in the next responses API call, where we load the saved messages and skip conversion entirely.

Separate issue to optimize storage: https://github.com/llamastack/llama-stack/issues/3646

## Test Plan

existing CI tests
2025-10-04 04:04:14 +00:00 · 2025-10-02 16:01:08 -07:00 · 2025-10-02 16:01:08 -07:00 · cf422da825
commit cf422da825
parent 2e544ecd8a
7 changed files with 202 additions and 58 deletions
--- a/llama_stack/apis/agents/openai_responses.py
+++ b/llama_stack/apis/agents/openai_responses.py
@@ -888,6 +888,10 @@ class OpenAIResponseObjectWithInput(OpenAIResponseObject):

    input: list[OpenAIResponseInput]

+    def to_response_object(self) -> OpenAIResponseObject:
+        """Convert to OpenAIResponseObject by excluding input field."""
+        return OpenAIResponseObject(**{k: v for k, v in self.model_dump().items() if k != "input"})
+

@json_schema_type
 class ListOpenAIResponseObject(BaseModel):