feat(responses)!: implement support for OpenAI compatible prompts in Responses API (#3965)

# What does this PR do?

This PR provides the actual implementation of OpenAI-compatible prompts in the Responses API. It is the follow-up PR with the implementation, after #3942 was introduced. The need for this functionality was raised in #3514.

> Note: https://github.com/llamastack/llama-stack/pull/3514 is divided into three separate PRs. The current PR is the third of three.

Closes #3321

## Test Plan

Manual testing, plus the CI workflow with added unit tests.

Comprehensive manual testing with the new implementation:

**Test Prompts with Images with text on them in Responses API:**

I used this image for testing purposes: [iphone 17 image](https://github.com/user-attachments/assets/9e2ee821-e394-4bbd-b1c8-d48a3fa315de)

1. Upload an image:
```
curl -X POST http://localhost:8321/v1/files \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/Users/ianmiller/iphone.jpeg" \
  -F "purpose=assistants"
```
`{"object":"file","id":"file-d6d375f238e14f21952cc40246bc8504","bytes":556241,"created_at":1761750049,"expires_at":1793286049,"filename":"iphone.jpeg","purpose":"assistants"}`

2. Create prompt:
```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "You are a product analysis expert. Analyze the following product:\n\nProduct Name: {{product_name}}\nDescription: {{description}}\n\nImage: {{product_photo}}\n\nProvide a detailed analysis including quality assessment, target audience, and pricing recommendations.",
    "variables": ["product_name", "description", "product_photo"]
  }'
```
`{"prompt":"You are a product analysis expert. Analyze the following product:\n\nProduct Name: {{product_name}}\nDescription: {{description}}\n\nImage: {{product_photo}}\n\nProvide a detailed analysis including quality assessment, target audience, and pricing recommendations.","version":1,"prompt_id":"pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62","variables":["product_name","description","product_photo"],"is_default":false}`

3. Create response:
```
curl -X POST http://localhost:8321/v1/responses \
  -H "Accept: application/json, text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Please analyze this product",
    "model": "openai/gpt-4o",
    "store": true,
    "prompt": {
      "id": "pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62",
      "version": "1",
      "variables": {
        "product_name": {
          "type": "input_text",
          "text": "iPhone 17 Pro Max"
        },
        "product_photo": {
          "type": "input_image",
          "file_id": "file-d6d375f238e14f21952cc40246bc8504",
          "detail": "high"
        }
      }
    }
  }'
```
`{"created_at":1761750427,"error":null,"id":"resp_f897f914-e3b8-4783-8223-3ed0d32fcbc6","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"### Product Analysis: iPhone 17 Pro Max\n\n**Quality Assessment:**\n\n- **Display & Design:**\n - The 6.9-inch display is large, ideal for streaming and productivity.\n - Anti-reflective technology and 120Hz refresh rate enhance viewing experience, providing smoother visuals and reducing glare.\n - Titanium frame suggests a premium build, offering durability and a sleek appearance.\n\n- **Performance:**\n - The Apple A19 Pro chip promises significant performance improvements, likely leading to faster processing and efficient multitasking.\n - 12GB RAM is substantial for a smartphone, ensuring smooth operation for demanding apps and games.\n\n- **Camera System:**\n - The triple 48MP camera setup (wide, ultra-wide, telephoto) is designed for versatile photography needs, capturing high-resolution photos and videos.\n - The 24MP front camera will appeal to selfie enthusiasts and content creators needing quality front-facing shots.\n\n- **Connectivity:**\n - Wi-Fi 7 support indicates future-proof wireless capabilities, providing faster and more reliable internet connectivity.\n\n**Target Audience:**\n\n- **Tech Enthusiasts:** Individuals interested in cutting-edge technology and performance.\n- **Content Creators:** Users who need a robust camera system for photo and video production.\n- **Luxury Consumers:** Those who prefer premium materials and top-of-the-line specs.\n- **Professionals:** Users who require efficient multitasking and productivity features.\n\n**Pricing Recommendations:**\n\n- Given the premium specifications, a higher price point is expected. Consider pricing competitively within the high-end smartphone market while justifying cost through unique features like the titanium frame and advanced connectivity options.\n- Positioning around the $1,200 to $1,500 range would align with expectations for top-tier devices, catering to its target audience while ensuring profitability.\n\nOverall, the iPhone 17 Pro Max showcases a blend of innovative features and premium design, aimed at users seeking high performance and superior aesthetics.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_66f4d844-4d9e-4102-80fc-eb75b34b6dbd","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62","variables":{"product_name":{"text":"iPhone 17 Pro Max","type":"input_text"},"product_photo":{"detail":"high","type":"input_image","file_id":"file-d6d375f238e14f21952cc40246bc8504","image_url":null}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":830,"output_tokens":394,"total_tokens":1224,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}`

**Test Prompts with PDF files in Responses API:**

I used this PDF file for testing purposes: [invoicesample.pdf](https://github.com/user-attachments/files/22958943/invoicesample.pdf)

1. Upload PDF:
```
curl -X POST http://localhost:8321/v1/files \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/Users/ianmiller/invoicesample.pdf" \
  -F "purpose=assistants"
```
`{"object":"file","id":"file-7fbb1043a4bb468cab60ffe4b8631d8e","bytes":149568,"created_at":1761750730,"expires_at":1793286730,"filename":"invoicesample.pdf","purpose":"assistants"}`

2. Create prompt:
```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "You are an accounting and financial analysis expert. Analyze the following invoice document:\n\nInvoice Document: {{invoice_doc}}\n\nProvide a comprehensive analysis",
    "variables": ["invoice_doc"]
  }'
```
`{"prompt":"You are an accounting and financial analysis expert. Analyze the following invoice document:\n\nInvoice Document: {{invoice_doc}}\n\nProvide a comprehensive analysis","version":1,"prompt_id":"pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc","variables":["invoice_doc"],"is_default":false}`

3. Create response:
```
curl -X POST http://localhost:8321/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Please provide a detailed analysis of this invoice",
    "model": "openai/gpt-4o",
    "store": true,
    "prompt": {
      "id": "pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc",
      "version": "1",
      "variables": {
        "invoice_doc": {
          "type": "input_file",
          "file_id": "file-7fbb1043a4bb468cab60ffe4b8631d8e",
          "filename": "invoicesample.pdf"
        }
      }
    }
  }'
```
`{"created_at":1761750881,"error":null,"id":"resp_da866913-db06-4702-8000-174daed9dbbb","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"Here's a detailed analysis of the invoice provided:\n\n### Seller Information\n- **Business Name:** The invoice features a logo with \"Sunny Farm\" indicating the business identity.\n- **Address:** 123 Somewhere St, Melbourne VIC 3000\n- **Contact Information:** Phone number (03) 1234 5678\n\n### Buyer Information\n- **Name:** Denny Gunawan\n- **Address:** 221 Queen St, Melbourne VIC 3000\n\n### Transaction Details\n- **Invoice Number:** #20130304\n- **Date of Transaction:** Not explicitly mentioned, likely inferred from the invoice number or needs clarification.\n\n### Items Purchased\n1. **Apple**\n - Price: $5.00/kg\n - Quantity: 1 kg\n - Subtotal: $5.00\n\n2. **Orange**\n - Price: $1.99/kg\n - Quantity: 2 kg\n - Subtotal: $3.98\n\n3. **Watermelon**\n - Price: $1.69/kg\n - Quantity: 3 kg\n - Subtotal: $5.07\n\n4. **Mango**\n - Price: $9.56/kg\n - Quantity: 2 kg\n - Subtotal: $19.12\n\n5. **Peach**\n - Price: $2.99/kg\n - Quantity: 1 kg\n - Subtotal: $2.99\n\n### Financial Summary\n- **Subtotal for Items:** $36.00\n- **GST (Goods and Services Tax):** 10% of $36.00, which amounts to $3.60\n- **Total Amount Due:** $39.60\n\n### Notes\n- The invoice includes a placeholder text: \"Lorem ipsum dolor sit amet...\" which is typically used as filler text. This might indicate a section intended for terms, conditions, or additional notes that haven’t been completed.\n\n### Visual and Design Elements\n- The invoice uses a simple and clear layout, featuring the business logo prominently and stating essential information such as contact and transaction details in a structured manner.\n- There is a \"Thank You\" note at the bottom, which adds a professional and courteous touch.\n\n### Considerations\n- Ensure the date of the transaction is clear if there are any future references needed.\n- Replace filler text with relevant terms and conditions or any special instructions pertaining to the transaction.\n\nThis invoice appears standard, representing a small business transaction with clearly itemized products and applicable taxes.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_39f3b39e-4684-4444-8e4d-e7395f88c9dc","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc","variables":{"invoice_doc":{"type":"input_file","file_data":null,"file_id":"file-7fbb1043a4bb468cab60ffe4b8631d8e","file_url":null,"filename":"invoicesample.pdf"}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":529,"output_tokens":513,"total_tokens":1042,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}`

**Test simple text Prompt in Responses API:**

1. Create prompt:
```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.",
    "variables": ["name", "company", "role", "tone"]
  }'
```
`{"prompt":"Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.","version":1,"prompt_id":"pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef","variables":["name","company","role","tone"],"is_default":false}`

2. Create response:
```
curl -X POST http://localhost:8321/v1/responses \
  -H "Accept: application/json, text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What is the capital of Ireland?",
    "model": "openai/gpt-4o",
    "store": true,
    "prompt": {
      "id": "pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef",
      "version": "1",
      "variables": {
        "name": {
          "type": "input_text",
          "text": "Alice"
        },
        "company": {
          "type": "input_text",
          "text": "Dummy Company"
        },
        "role": {
          "type": "input_text",
          "text": "Geography expert"
        },
        "tone": {
          "type": "input_text",
          "text": "professional and helpful"
        }
      }
    }
  }'
```
`{"created_at":1761751097,"error":null,"id":"resp_1b037b95-d9ae-4ad0-8e76-d953897ecaef","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"The capital of Ireland is Dublin.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_8e7c72b6-2aa2-4da6-8e57-da4e12fa3ce2","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef","variables":{"name":{"text":"Alice","type":"input_text"},"company":{"text":"Dummy Company","type":"input_text"},"role":{"text":"Geography expert","type":"input_text"},"tone":{"text":"professional and helpful","type":"input_text"}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":47,"output_tokens":7,"total_tokens":54,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}`
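
The three manual tests send the same request shape to `/v1/responses`, differing only in the variable types. A minimal sketch of building that JSON body in Python, assuming only the field names visible in the curl commands; the helper names (`text_var`, `image_var`, `file_var`, `build_responses_request`) are hypothetical and are not part of llama-stack.

```python
import json
from typing import Any


def text_var(text: str) -> dict[str, Any]:
    """Build an input_text variable value."""
    return {"type": "input_text", "text": text}


def image_var(file_id: str, detail: str = "high") -> dict[str, Any]:
    """Build an input_image variable value that points at an uploaded file."""
    return {"type": "input_image", "file_id": file_id, "detail": detail}


def file_var(file_id: str, filename: str) -> dict[str, Any]:
    """Build an input_file variable value that points at an uploaded file."""
    return {"type": "input_file", "file_id": file_id, "filename": filename}


def build_responses_request(
    model: str,
    user_input: str,
    prompt_id: str,
    version: str,
    variables: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Assemble the JSON body for POST /v1/responses with a stored prompt."""
    return {
        "input": user_input,
        "model": model,
        "store": True,
        "prompt": {"id": prompt_id, "version": version, "variables": variables},
    }


# Values taken from the simple text test in the description.
body = build_responses_request(
    model="openai/gpt-4o",
    user_input="What is the capital of Ireland?",
    prompt_id="pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef",
    version="1",
    variables={
        "name": text_var("Alice"),
        "company": text_var("Dummy Company"),
        "role": text_var("Geography expert"),
        "tone": text_var("professional and helpful"),
    },
)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to `curl -d` or to any HTTP client; the sketch itself makes no network call.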
2025-12-03 09:53:45 +00:00 · 2025-11-19 19:48:11 +00:00 · 2025-11-19 19:48:11 +00:00 · 0757d5a917
commit 0757d5a917
parent 8852666982
10 changed files with 770 additions and 17 deletions
--- a/src/llama_stack/providers/inline/agents/meta_reference/__init__.py
+++ b/src/llama_stack/providers/inline/agents/meta_reference/__init__.py
@@ -27,8 +27,10 @@ async def get_provider_impl(
        deps[Api.tool_runtime],
        deps[Api.tool_groups],
        deps[Api.conversations],
-        policy,
+        deps[Api.prompts],
+        deps[Api.files],
        telemetry_enabled,
+        policy,
    )
    await impl.initialize()
    return impl
--- a/src/llama_stack/providers/inline/agents/meta_reference/agents.py
+++ b/src/llama_stack/providers/inline/agents/meta_reference/agents.py
@@ -12,6 +12,7 @@ from llama_stack.providers.utils.responses.responses_store import ResponsesStore
 from llama_stack_api import (
    Agents,
    Conversations,
+    Files,
    Inference,
    ListOpenAIResponseInputItem,
    ListOpenAIResponseObject,
@@ -22,6 +23,7 @@ from llama_stack_api import (
    OpenAIResponsePrompt,
    OpenAIResponseText,
    Order,
+    Prompts,
    ResponseGuardrail,
    Safety,
    ToolGroups,
@@ -45,6 +47,8 @@ class MetaReferenceAgentsImpl(Agents):
        tool_runtime_api: ToolRuntime,
        tool_groups_api: ToolGroups,
        conversations_api: Conversations,
+        prompts_api: Prompts,
+        files_api: Files,
        policy: list[AccessRule],
        telemetry_enabled: bool = False,
    ):
@@ -56,7 +60,8 @@ class MetaReferenceAgentsImpl(Agents):
        self.tool_groups_api = tool_groups_api
        self.conversations_api = conversations_api
        self.telemetry_enabled = telemetry_enabled
-
+        self.prompts_api = prompts_api
+        self.files_api = files_api
        self.in_memory_store = InmemoryKVStoreImpl()
        self.openai_responses_impl: OpenAIResponsesImpl | None = None
        self.policy = policy
@@ -73,6 +78,8 @@ class MetaReferenceAgentsImpl(Agents):
            vector_io_api=self.vector_io_api,
            safety_api=self.safety_api,
            conversations_api=self.conversations_api,
+            prompts_api=self.prompts_api,
+            files_api=self.files_api,
        )

    async def shutdown(self) -> None:
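
As shown in the two hunks above, `get_provider_impl` lists `telemetry_enabled` before `policy`, while the constructor signature lists `policy` before `telemetry_enabled`. If the full call is positional, the order of those two arguments matters. A minimal sketch, using a hypothetical stand-in class (`AgentsImplSketch`, not `MetaReferenceAgentsImpl`) with keyword-only parameters, of wiring that does not depend on argument order:

```python
class AgentsImplSketch:
    """Hypothetical stand-in with the tail of the constructor signature from the diff."""

    def __init__(
        self,
        *,  # everything after this marker must be passed by keyword
        conversations_api: object,
        prompts_api: object,
        files_api: object,
        policy: list,
        telemetry_enabled: bool = False,
    ) -> None:
        self.conversations_api = conversations_api
        self.prompts_api = prompts_api
        self.files_api = files_api
        self.policy = policy
        self.telemetry_enabled = telemetry_enabled


# Keyword arguments may appear in any order; each value reaches its own parameter.
impl = AgentsImplSketch(
    telemetry_enabled=True,
    policy=["example-rule"],
    files_api="files-api",
    prompts_api="prompts-api",
    conversations_api="conversations-api",
)
print(impl.policy, impl.telemetry_enabled)
```

With the `*` marker, a positional call fails with a `TypeError` instead of silently binding a value to the wrong parameter.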
--- a/src/llama_stack/providers/inline/agents/meta_reference/responses/openai_responses.py
+++ b/src/llama_stack/providers/inline/agents/meta_reference/responses/openai_responses.py
@@ -4,6 +4,7 @@
 # This source code is licensed under the terms described in the LICENSE file in
 # the root directory of this source tree.

+import re
 import time
 import uuid
 from collections.abc import AsyncIterator
@@ -18,13 +19,17 @@ from llama_stack.providers.utils.responses.responses_store import (
 from llama_stack_api import (
    ConversationItem,
    Conversations,
+    Files,
    Inference,
    InvalidConversationIdError,
    ListOpenAIResponseInputItem,
    ListOpenAIResponseObject,
+    OpenAIChatCompletionContentPartParam,
    OpenAIDeleteResponseObject,
    OpenAIMessageParam,
    OpenAIResponseInput,
+    OpenAIResponseInputMessageContentFile,
+    OpenAIResponseInputMessageContentImage,
    OpenAIResponseInputMessageContentText,
    OpenAIResponseInputTool,
    OpenAIResponseMessage,
@@ -34,7 +39,9 @@ from llama_stack_api import (
    OpenAIResponseText,
    OpenAIResponseTextFormat,
    OpenAISystemMessageParam,
+    OpenAIUserMessageParam,
    Order,
+    Prompts,
    ResponseGuardrailSpec,
    Safety,
    ToolGroups,
@@ -46,6 +53,7 @@ from .streaming import StreamingResponseOrchestrator
 from .tool_executor import ToolExecutor
 from .types import ChatCompletionContext, ToolContext
 from .utils import (
+    convert_response_content_to_chat_content,
    convert_response_input_to_chat_messages,
    convert_response_text_to_chat_response_format,
    extract_guardrail_ids,
@@ -69,6 +77,8 @@ class OpenAIResponsesImpl:
        vector_io_api: VectorIO,  # VectorIO
        safety_api: Safety | None,
        conversations_api: Conversations,
+        prompts_api: Prompts,
+        files_api: Files,
    ):
        self.inference_api = inference_api
        self.tool_groups_api = tool_groups_api
@@ -82,6 +92,8 @@ class OpenAIResponsesImpl:
            tool_runtime_api=tool_runtime_api,
            vector_io_api=vector_io_api,
        )
+        self.prompts_api = prompts_api
+        self.files_api = files_api

    async def _prepend_previous_response(
        self,
@@ -122,11 +134,13 @@ class OpenAIResponsesImpl:
                # Use stored messages directly and convert only new input
                message_adapter = TypeAdapter(list[OpenAIMessageParam])
                messages = message_adapter.validate_python(previous_response.messages)
-                new_messages = await convert_response_input_to_chat_messages(input, previous_messages=messages)
+                new_messages = await convert_response_input_to_chat_messages(
+                    input, previous_messages=messages, files_api=self.files_api
+                )
                messages.extend(new_messages)
            else:
                # Backward compatibility: reconstruct from inputs
-                messages = await convert_response_input_to_chat_messages(all_input)
+                messages = await convert_response_input_to_chat_messages(all_input, files_api=self.files_api)

            tool_context.recover_tools_from_previous_response(previous_response)
        elif conversation is not None:
@@ -138,7 +152,7 @@ class OpenAIResponsesImpl:
            all_input = input
            if not conversation_items.data:
                # First turn - just convert the new input
-                messages = await convert_response_input_to_chat_messages(input)
+                messages = await convert_response_input_to_chat_messages(input, files_api=self.files_api)
            else:
                if not stored_messages:
                    all_input = conversation_items.data
@@ -154,14 +168,82 @@ class OpenAIResponsesImpl:
                    all_input = input

                messages = stored_messages or []
-                new_messages = await convert_response_input_to_chat_messages(all_input, previous_messages=messages)
+                new_messages = await convert_response_input_to_chat_messages(
+                    all_input, previous_messages=messages, files_api=self.files_api
+                )
                messages.extend(new_messages)
        else:
            all_input = input
-            messages = await convert_response_input_to_chat_messages(all_input)
+            messages = await convert_response_input_to_chat_messages(all_input, files_api=self.files_api)

        return all_input, messages, tool_context

+    async def _prepend_prompt(
+        self,
+        messages: list[OpenAIMessageParam],
+        openai_response_prompt: OpenAIResponsePrompt | None,
+    ) -> None:
+        """Prepend prompt template to messages, resolving text/image/file variables.
+
+        :param messages: List of OpenAIMessageParam objects
+        :param openai_response_prompt: (Optional) OpenAIResponsePrompt object with variables
+        :returns: None; ``messages`` is modified in place
+        """
+        if not openai_response_prompt or not openai_response_prompt.id:
+            return
+
+        prompt_version = int(openai_response_prompt.version) if openai_response_prompt.version else None
+        cur_prompt = await self.prompts_api.get_prompt(openai_response_prompt.id, prompt_version)
+
+        if not cur_prompt or not cur_prompt.prompt:
+            return
+
+        cur_prompt_text = cur_prompt.prompt
+        cur_prompt_variables = cur_prompt.variables
+
+        if not openai_response_prompt.variables:
+            messages.insert(0, OpenAISystemMessageParam(content=cur_prompt_text))
+            return
+
+        # Validate that all provided variables exist in the prompt
+        for name in openai_response_prompt.variables.keys():
+            if name not in cur_prompt_variables:
+                raise ValueError(f"Variable {name} not found in prompt {openai_response_prompt.id}")
+
+        # Separate text and media variables
+        text_substitutions = {}
+        media_content_parts: list[OpenAIChatCompletionContentPartParam] = []
+
+        for name, value in openai_response_prompt.variables.items():
+            # Text variable found
+            if isinstance(value, OpenAIResponseInputMessageContentText):
+                text_substitutions[name] = value.text
+
+            # Media variable found
+            elif isinstance(value, OpenAIResponseInputMessageContentImage | OpenAIResponseInputMessageContentFile):
+                converted_parts = await convert_response_content_to_chat_content([value], files_api=self.files_api)
+                if isinstance(converted_parts, list):
+                    media_content_parts.extend(converted_parts)
+
+                # Eg: {{product_photo}} becomes "[Image: product_photo]"
+                # This gives the model textual context about what media exists in the prompt
+                var_type = value.type.replace("input_", "").replace("_", " ").title()
+                text_substitutions[name] = f"[{var_type}: {name}]"
+
+        def replace_variable(match: re.Match[str]) -> str:
+            var_name = match.group(1).strip()
+            return str(text_substitutions.get(var_name, match.group(0)))
+
+        pattern = r"\{\{\s*(\w+)\s*\}\}"
+        processed_prompt_text = re.sub(pattern, replace_variable, cur_prompt_text)
+
+        # Insert system message with resolved text
+        messages.insert(0, OpenAISystemMessageParam(content=processed_prompt_text))
+
+        # If we have media, add a new user message, because user messages can carry images and files
+        if media_content_parts:
+            messages.append(OpenAIUserMessageParam(content=media_content_parts))
+
    async def get_openai_response(
        self,
        response_id: str,
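
The variable substitution in `_prepend_prompt` can be tried in isolation. A minimal sketch, assuming plain dicts in place of the `OpenAIResponseInputMessageContent*` models; `render_prompt` and `placeholder_for` are hypothetical names, while the regular expression and the placeholder format are copied from the hunk above. The check that every supplied variable is declared on the prompt is omitted here.

```python
import re

# Same pattern as in the diff: {{name}} with optional whitespace inside the braces.
VARIABLE_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def placeholder_for(var_type: str, name: str) -> str:
    """Turn e.g. ("input_image", "product_photo") into "[Image: product_photo]"."""
    label = var_type.replace("input_", "").replace("_", " ").title()
    return f"[{label}: {name}]"


def render_prompt(template: str, variables: dict[str, dict[str, str]]) -> str:
    """Substitute text variables and replace media variables with a textual marker."""
    substitutions: dict[str, str] = {}
    for name, value in variables.items():
        if value["type"] == "input_text":
            substitutions[name] = value["text"]
        else:
            # Images and files are sent separately; the text only names them.
            substitutions[name] = placeholder_for(value["type"], name)

    def replace_variable(match: re.Match[str]) -> str:
        # Variables without a supplied value keep their original {{...}} form.
        return substitutions.get(match.group(1), match.group(0))

    return VARIABLE_PATTERN.sub(replace_variable, template)


rendered = render_prompt(
    "Product Name: {{product_name}}\nDescription: {{description}}\nImage: {{ product_photo }}",
    {
        "product_name": {"type": "input_text", "text": "iPhone 17 Pro Max"},
        "product_photo": {"type": "input_image", "file_id": "file-d6d375f238e14f21952cc40246bc8504"},
    },
)
print(rendered)
```

In this run `{{description}}` is left as-is because no value was supplied for it, which mirrors the image test in the description, where only `product_name` and `product_photo` are passed.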
@@ -297,6 +379,7 @@ class OpenAIResponsesImpl:
            input=input,
            conversation=conversation,
            model=model,
+            prompt=prompt,
            instructions=instructions,
            previous_response_id=previous_response_id,
            store=store,
@@ -350,6 +433,7 @@ class OpenAIResponsesImpl:
        instructions: str | None = None,
        previous_response_id: str | None = None,
        conversation: str | None = None,
+        prompt: OpenAIResponsePrompt | None = None,
        store: bool | None = True,
        temperature: float | None = None,
        text: OpenAIResponseText | None = None,
@@ -372,6 +456,9 @@ class OpenAIResponsesImpl:
        if instructions:
            messages.insert(0, OpenAISystemMessageParam(content=instructions))

+        # Prepend reusable prompt (if provided)
+        await self._prepend_prompt(messages, prompt)
+
        # Structured outputs
        response_format = await convert_response_text_to_chat_response_format(text)

@@ -394,6 +481,7 @@ class OpenAIResponsesImpl:
            ctx=ctx,
            response_id=response_id,
            created_at=created_at,
+            prompt=prompt,
            text=text,
            max_infer_iters=max_infer_iters,
            parallel_tool_calls=parallel_tool_calls,
--- a/src/llama_stack/providers/inline/agents/meta_reference/responses/utils.py
+++ b/src/llama_stack/providers/inline/agents/meta_reference/responses/utils.py
@@ -5,11 +5,14 @@
 # the root directory of this source tree.

 import asyncio
+import base64
+import mimetypes
 import re
 import uuid
 from collections.abc import Sequence

 from llama_stack_api import (
+    Files,
    OpenAIAssistantMessageParam,
    OpenAIChatCompletionContentPartImageParam,
    OpenAIChatCompletionContentPartParam,
@@ -18,6 +21,8 @@ from llama_stack_api import (
    OpenAIChatCompletionToolCallFunction,
    OpenAIChoice,
    OpenAIDeveloperMessageParam,
+    OpenAIFile,
+    OpenAIFileFile,
    OpenAIImageURL,
    OpenAIJSONSchema,
    OpenAIMessageParam,
@@ -29,6 +34,7 @@ from llama_stack_api import (
    OpenAIResponseInput,
    OpenAIResponseInputFunctionToolCallOutput,
    OpenAIResponseInputMessageContent,
+    OpenAIResponseInputMessageContentFile,
    OpenAIResponseInputMessageContentImage,
    OpenAIResponseInputMessageContentText,
    OpenAIResponseInputTool,
@@ -37,9 +43,11 @@ from llama_stack_api import (
    OpenAIResponseMessage,
    OpenAIResponseOutputMessageContent,
    OpenAIResponseOutputMessageContentOutputText,
+    OpenAIResponseOutputMessageFileSearchToolCall,
    OpenAIResponseOutputMessageFunctionToolCall,
    OpenAIResponseOutputMessageMCPCall,
    OpenAIResponseOutputMessageMCPListTools,
+    OpenAIResponseOutputMessageWebSearchToolCall,
    OpenAIResponseText,
    OpenAISystemMessageParam,
    OpenAIToolMessageParam,
@@ -49,6 +57,46 @@ from llama_stack_api import (
 )


+async def extract_bytes_from_file(file_id: str, files_api: Files) -> bytes:
+    """
+    Extract raw bytes from file using the Files API.
+
+    :param file_id: The file identifier (e.g., "file-abc123")
+    :param files_api: Files API instance
+    :returns: Raw file content as bytes
+    :raises: ValueError if file cannot be retrieved
+    """
+    try:
+        response = await files_api.openai_retrieve_file_content(file_id)
+        return bytes(response.body)
+    except Exception as e:
+        raise ValueError(f"Failed to retrieve file content for file_id '{file_id}': {str(e)}") from e
+
+
+def generate_base64_ascii_text_from_bytes(raw_bytes: bytes) -> str:
+    """
+    Converts raw binary bytes into a safe ASCII text representation for URLs
+
+    :param raw_bytes: the actual bytes that represents file content
+    :returns: string of utf-8 characters
+    """
+    return base64.b64encode(raw_bytes).decode("utf-8")
+
+
+def construct_data_url(ascii_text: str, mime_type: str | None) -> str:
+    """
+    Construct a data URL that embeds base64-encoded data
+
+    :param ascii_text: base64-encoded content as ASCII text
+    :param mime_type: MIME type of file
+    :returns: data URL string (e.g. data:image/png;base64,iVBORw0KGgo...)
+    """
+    if not mime_type:
+        mime_type = "application/octet-stream"
+
+    return f"data:{mime_type};base64,{ascii_text}"
+
+
 async def convert_chat_choice_to_response_message(
    choice: OpenAIChoice,
    citation_files: dict[str, str] | None = None,
@@ -78,11 +126,15 @@ async def convert_chat_choice_to_response_message(

 async def convert_response_content_to_chat_content(
    content: str | Sequence[OpenAIResponseInputMessageContent | OpenAIResponseOutputMessageContent],
+    files_api: Files | None,
 ) -> str | list[OpenAIChatCompletionContentPartParam]:
    """
    Convert the content parts from an OpenAI Response API request into OpenAI Chat Completion content parts.

    The content schemas of each API look similar, but are not exactly the same.
+
+    :param content: The content to convert
+    :param files_api: Files API for resolving file_id to raw file content (required if content contains files/images)
    """
    if isinstance(content, str):
        return content
@@ -95,9 +147,68 @@ async def convert_response_content_to_chat_content(
        elif isinstance(content_part, OpenAIResponseOutputMessageContentOutputText):
            converted_parts.append(OpenAIChatCompletionContentPartTextParam(text=content_part.text))
        elif isinstance(content_part, OpenAIResponseInputMessageContentImage):
+            detail = content_part.detail
+            image_mime_type = None
            if content_part.image_url:
-                image_url = OpenAIImageURL(url=content_part.image_url, detail=content_part.detail)
+                image_url = OpenAIImageURL(url=content_part.image_url, detail=detail)
                converted_parts.append(OpenAIChatCompletionContentPartImageParam(image_url=image_url))
+            elif content_part.file_id:
+                if files_api is None:
+                    raise ValueError("file_ids are not supported by this implementation of the Stack")
+                image_file_response = await files_api.openai_retrieve_file(content_part.file_id)
+                if image_file_response.filename:
+                    image_mime_type, _ = mimetypes.guess_type(image_file_response.filename)
+                raw_image_bytes = await extract_bytes_from_file(content_part.file_id, files_api)
+                ascii_text = generate_base64_ascii_text_from_bytes(raw_image_bytes)
+                image_data_url = construct_data_url(ascii_text, image_mime_type)
+                image_url = OpenAIImageURL(url=image_data_url, detail=detail)
+                converted_parts.append(OpenAIChatCompletionContentPartImageParam(image_url=image_url))
+            else:
+                raise ValueError(
+                    f"Image content must have either 'image_url' or 'file_id'. "
+                    f"Got image_url={content_part.image_url}, file_id={content_part.file_id}"
+                )
+        elif isinstance(content_part, OpenAIResponseInputMessageContentFile):
+            resolved_file_data = None
+            file_data = content_part.file_data
+            file_id = content_part.file_id
+            file_url = content_part.file_url
+            filename = content_part.filename
+            file_mime_type = None
+            if not any([file_data, file_id, file_url]):
+                raise ValueError(
+                    f"File content must have at least one of 'file_data', 'file_id', or 'file_url'. "
+                    f"Got file_data={file_data}, file_id={file_id}, file_url={file_url}"
+                )
+            if file_id:
+                if files_api is None:
+                    raise ValueError("file_ids are not supported by this implementation of the Stack")
+
+                file_response = await files_api.openai_retrieve_file(file_id)
+                if not filename:
+                    filename = file_response.filename
+                file_mime_type, _ = mimetypes.guess_type(file_response.filename)
+                raw_file_bytes = await extract_bytes_from_file(file_id, files_api)
+                ascii_text = generate_base64_ascii_text_from_bytes(raw_file_bytes)
+                resolved_file_data = construct_data_url(ascii_text, file_mime_type)
+            elif file_data:
+                if file_data.startswith("data:"):
+                    resolved_file_data = file_data
+                else:
+                    # Raw base64 data, wrap in data URL format
+                    if filename:
+                        file_mime_type, _ = mimetypes.guess_type(filename)
+                    resolved_file_data = construct_data_url(file_data, file_mime_type)
+            elif file_url:
+                resolved_file_data = file_url
+            converted_parts.append(
+                OpenAIFile(
+                    file=OpenAIFileFile(
+                        file_data=resolved_file_data,
+                        filename=filename,
+                    )
+                )
+            )
        elif isinstance(content_part, str):
            converted_parts.append(OpenAIChatCompletionContentPartTextParam(text=content_part))
        else:
@@ -110,12 +221,14 @@ async def convert_response_input_to_chat_messages(
 async def convert_response_input_to_chat_messages(
    input: str | list[OpenAIResponseInput],
    previous_messages: list[OpenAIMessageParam] | None = None,
+    files_api: Files | None = None,
 ) -> list[OpenAIMessageParam]:
    """
    Convert the input from an OpenAI Response API request into OpenAI Chat Completion messages.

    :param input: The input to convert
    :param previous_messages: Optional previous messages to check for function_call references
+    :param files_api: Files API for resolving file_id to raw file content (optional, required for file/image content)
    """
    messages: list[OpenAIMessageParam] = []
    if isinstance(input, list):
@@ -169,6 +282,12 @@ async def convert_response_input_to_chat_messages(
            elif isinstance(input_item, OpenAIResponseOutputMessageMCPListTools):
                # the tool list will be handled separately
                pass
+            elif isinstance(
+                input_item,
+                OpenAIResponseOutputMessageWebSearchToolCall | OpenAIResponseOutputMessageFileSearchToolCall,
+            ):
+                # these tool calls are tracked internally but not converted to chat messages
+                pass
            elif isinstance(input_item, OpenAIResponseMCPApprovalRequest) or isinstance(
                input_item, OpenAIResponseMCPApprovalResponse
            ):
@@ -176,7 +295,7 @@ async def convert_response_input_to_chat_messages(
                pass
            elif isinstance(input_item, OpenAIResponseMessage):
                # Narrow type to OpenAIResponseMessage which has content and role attributes
-                content = await convert_response_content_to_chat_content(input_item.content)
+                content = await convert_response_content_to_chat_content(input_item.content, files_api)
                message_type = await get_message_type_by_role(input_item.role)
                if message_type is None:
                    raise ValueError(
--- a/src/llama_stack/providers/registry/agents.py
+++ b/src/llama_stack/providers/registry/agents.py
@@ -34,6 +34,8 @@ def available_providers() -> list[ProviderSpec]:
                Api.tool_runtime,
                Api.tool_groups,
                Api.conversations,
+                Api.prompts,
+                Api.files,
            ],
            optional_api_dependencies=[
                Api.safety,