Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 09:53:45 +00:00)
feat(responses)!: implement support for OpenAI compatible prompts in Responses API (#3965)
# What does this PR do?

This PR provides the actual implementation of OpenAI-compatible prompts in the Responses API. It is the follow-up to #3942, which introduced the API surface. The need for this functionality was originally raised in #3514.

> Note: https://github.com/llamastack/llama-stack/pull/3514 was split into three separate PRs. This is the third of the three.

Closes #3321

## Test Plan

Manual testing, plus the CI workflow with added unit tests.

Comprehensive manual testing of the new implementation:

**Test prompts with images (with text on them) in the Responses API:**

I used this image for testing purposes: [iphone 17 image](https://github.com/user-attachments/assets/9e2ee821-e394-4bbd-b1c8-d48a3fa315de)

1. Upload an image:

```
curl -X POST http://localhost:8321/v1/files \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/Users/ianmiller/iphone.jpeg" \
  -F "purpose=assistants"
```

Response:

```json
{"object":"file","id":"file-d6d375f238e14f21952cc40246bc8504","bytes":556241,"created_at":1761750049,"expires_at":1793286049,"filename":"iphone.jpeg","purpose":"assistants"}
```

2. Create a prompt:

```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "You are a product analysis expert. Analyze the following product:\n\nProduct Name: {{product_name}}\nDescription: {{description}}\n\nImage: {{product_photo}}\n\nProvide a detailed analysis including quality assessment, target audience, and pricing recommendations.",
    "variables": ["product_name", "description", "product_photo"]
  }'
```

Response:

```json
{"prompt":"You are a product analysis expert. Analyze the following product:\n\nProduct Name: {{product_name}}\nDescription: {{description}}\n\nImage: {{product_photo}}\n\nProvide a detailed analysis including quality assessment, target audience, and pricing recommendations.","version":1,"prompt_id":"pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62","variables":["product_name","description","product_photo"],"is_default":false}
```
3. Create a response:

```
curl -X POST http://localhost:8321/v1/responses \
  -H "Accept: application/json, text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Please analyze this product",
    "model": "openai/gpt-4o",
    "store": true,
    "prompt": {
      "id": "pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62",
      "version": "1",
      "variables": {
        "product_name": {
          "type": "input_text",
          "text": "iPhone 17 Pro Max"
        },
        "product_photo": {
          "type": "input_image",
          "file_id": "file-d6d375f238e14f21952cc40246bc8504",
          "detail": "high"
        }
      }
    }
  }'
```

Response:

```json
{"created_at":1761750427,"error":null,"id":"resp_f897f914-e3b8-4783-8223-3ed0d32fcbc6","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"### Product Analysis: iPhone 17 Pro Max\n\n**Quality Assessment:**\n\n- **Display & Design:**\n - The 6.9-inch display is large, ideal for streaming and productivity.\n - Anti-reflective technology and 120Hz refresh rate enhance viewing experience, providing smoother visuals and reducing glare.\n - Titanium frame suggests a premium build, offering durability and a sleek appearance.\n\n- **Performance:**\n - The Apple A19 Pro chip promises significant performance improvements, likely leading to faster processing and efficient multitasking.\n - 12GB RAM is substantial for a smartphone, ensuring smooth operation for demanding apps and games.\n\n- **Camera System:**\n - The triple 48MP camera setup (wide, ultra-wide, telephoto) is designed for versatile photography needs, capturing high-resolution photos and videos.\n - The 24MP front camera will appeal to selfie enthusiasts and content creators needing quality front-facing shots.\n\n- **Connectivity:**\n - Wi-Fi 7 support indicates future-proof wireless capabilities, providing faster and more reliable internet connectivity.\n\n**Target Audience:**\n\n- **Tech Enthusiasts:** Individuals interested in cutting-edge technology and performance.\n- **Content Creators:** Users who need a robust camera system for photo and video production.\n- **Luxury Consumers:** Those who prefer premium materials and top-of-the-line specs.\n- **Professionals:** Users who require efficient multitasking and productivity features.\n\n**Pricing Recommendations:**\n\n- Given the premium specifications, a higher price point is expected. Consider pricing competitively within the high-end smartphone market while justifying cost through unique features like the titanium frame and advanced connectivity options.\n- Positioning around the $1,200 to $1,500 range would align with expectations for top-tier devices, catering to its target audience while ensuring profitability.\n\nOverall, the iPhone 17 Pro Max showcases a blend of innovative features and premium design, aimed at users seeking high performance and superior aesthetics.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_66f4d844-4d9e-4102-80fc-eb75b34b6dbd","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62","variables":{"product_name":{"text":"iPhone 17 Pro Max","type":"input_text"},"product_photo":{"detail":"high","type":"input_image","file_id":"file-d6d375f238e14f21952cc40246bc8504","image_url":null}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":830,"output_tokens":394,"total_tokens":1224,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}
```

**Test prompts with PDF files in the Responses API:**

I used this PDF file for testing purposes: [invoicesample.pdf](https://github.com/user-attachments/files/22958943/invoicesample.pdf)

1. Upload a PDF:

```
curl -X POST http://localhost:8321/v1/files \
  -H "Content-Type: multipart/form-data" \
  -F "file=@/Users/ianmiller/invoicesample.pdf" \
  -F "purpose=assistants"
```

Response:

```json
{"object":"file","id":"file-7fbb1043a4bb468cab60ffe4b8631d8e","bytes":149568,"created_at":1761750730,"expires_at":1793286730,"filename":"invoicesample.pdf","purpose":"assistants"}
```

2. Create a prompt:

```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "You are an accounting and financial analysis expert. Analyze the following invoice document:\n\nInvoice Document: {{invoice_doc}}\n\nProvide a comprehensive analysis",
    "variables": ["invoice_doc"]
  }'
```

Response:

```json
{"prompt":"You are an accounting and financial analysis expert. Analyze the following invoice document:\n\nInvoice Document: {{invoice_doc}}\n\nProvide a comprehensive analysis","version":1,"prompt_id":"pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc","variables":["invoice_doc"],"is_default":false}
```
3. Create a response:

```
curl -X POST http://localhost:8321/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "input": "Please provide a detailed analysis of this invoice",
    "model": "openai/gpt-4o",
    "store": true,
    "prompt": {
      "id": "pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc",
      "version": "1",
      "variables": {
        "invoice_doc": {
          "type": "input_file",
          "file_id": "file-7fbb1043a4bb468cab60ffe4b8631d8e",
          "filename": "invoicesample.pdf"
        }
      }
    }
  }'
```

Response:

```json
{"created_at":1761750881,"error":null,"id":"resp_da866913-db06-4702-8000-174daed9dbbb","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"Here's a detailed analysis of the invoice provided:\n\n### Seller Information\n- **Business Name:** The invoice features a logo with \"Sunny Farm\" indicating the business identity.\n- **Address:** 123 Somewhere St, Melbourne VIC 3000\n- **Contact Information:** Phone number (03) 1234 5678\n\n### Buyer Information\n- **Name:** Denny Gunawan\n- **Address:** 221 Queen St, Melbourne VIC 3000\n\n### Transaction Details\n- **Invoice Number:** #20130304\n- **Date of Transaction:** Not explicitly mentioned, likely inferred from the invoice number or needs clarification.\n\n### Items Purchased\n1. **Apple**\n - Price: $5.00/kg\n - Quantity: 1 kg\n - Subtotal: $5.00\n\n2. **Orange**\n - Price: $1.99/kg\n - Quantity: 2 kg\n - Subtotal: $3.98\n\n3. **Watermelon**\n - Price: $1.69/kg\n - Quantity: 3 kg\n - Subtotal: $5.07\n\n4. **Mango**\n - Price: $9.56/kg\n - Quantity: 2 kg\n - Subtotal: $19.12\n\n5. **Peach**\n - Price: $2.99/kg\n - Quantity: 1 kg\n - Subtotal: $2.99\n\n### Financial Summary\n- **Subtotal for Items:** $36.00\n- **GST (Goods and Services Tax):** 10% of $36.00, which amounts to $3.60\n- **Total Amount Due:** $39.60\n\n### Notes\n- The invoice includes a placeholder text: \"Lorem ipsum dolor sit amet...\" which is typically used as filler text. This might indicate a section intended for terms, conditions, or additional notes that haven’t been completed.\n\n### Visual and Design Elements\n- The invoice uses a simple and clear layout, featuring the business logo prominently and stating essential information such as contact and transaction details in a structured manner.\n- There is a \"Thank You\" note at the bottom, which adds a professional and courteous touch.\n\n### Considerations\n- Ensure the date of the transaction is clear if there are any future references needed.\n- Replace filler text with relevant terms and conditions or any special instructions pertaining to the transaction.\n\nThis invoice appears standard, representing a small business transaction with clearly itemized products and applicable taxes.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_39f3b39e-4684-4444-8e4d-e7395f88c9dc","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_72e2a184a86f32a568b6afb5455dca5c16bf3cc3f80092dc","variables":{"invoice_doc":{"type":"input_file","file_data":null,"file_id":"file-7fbb1043a4bb468cab60ffe4b8631d8e","file_url":null,"filename":"invoicesample.pdf"}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":529,"output_tokens":513,"total_tokens":1042,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}
```
**Test a simple text prompt in the Responses API:**

1. Create a prompt:

```
curl -X POST http://localhost:8321/v1/prompts \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.",
    "variables": ["name", "company", "role", "tone"]
  }'
```

Response:

```json
{"prompt":"Hello {{name}}! You are working at {{company}}. Your role is {{role}} at {{company}}. Remember, {{name}}, to be {{tone}}.","version":1,"prompt_id":"pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef","variables":["name","company","role","tone"],"is_default":false}
```

2. Create a response:

```
curl -X POST http://localhost:8321/v1/responses \
  -H "Accept: application/json, text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What is the capital of Ireland?",
    "model": "openai/gpt-4o",
    "store": true,
    "prompt": {
      "id": "pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef",
      "version": "1",
      "variables": {
        "name": {"type": "input_text", "text": "Alice"},
        "company": {"type": "input_text", "text": "Dummy Company"},
        "role": {"type": "input_text", "text": "Geography expert"},
        "tone": {"type": "input_text", "text": "professional and helpful"}
      }
    }
  }'
```

Response:

```json
{"created_at":1761751097,"error":null,"id":"resp_1b037b95-d9ae-4ad0-8e76-d953897ecaef","model":"openai/gpt-4o","object":"response","output":[{"content":[{"text":"The capital of Ireland is Dublin.","type":"output_text","annotations":[]}],"role":"assistant","type":"message","id":"msg_8e7c72b6-2aa2-4da6-8e57-da4e12fa3ce2","status":"completed"}],"parallel_tool_calls":false,"previous_response_id":null,"prompt":{"id":"pmpt_f340a3164a4f65d975c774ffe38ea42d15e7ce4a835919ef","variables":{"name":{"text":"Alice","type":"input_text"},"company":{"text":"Dummy Company","type":"input_text"},"role":{"text":"Geography expert","type":"input_text"},"tone":{"text":"professional and helpful","type":"input_text"}},"version":"1"},"status":"completed","temperature":null,"text":{"format":{"type":"text"}},"top_p":null,"tools":[],"truncation":null,"usage":{"input_tokens":47,"output_tokens":7,"total_tokens":54,"input_tokens_details":{"cached_tokens":0},"output_tokens_details":{"reasoning_tokens":0}},"instructions":null}
```
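The same flow can be driven from the OpenAI-compatible Python client. Below is a minimal sketch (not part of the tested changes) assuming a Llama Stack server at `http://localhost:8321/v1`, the `openai` package installed, and the prompt/file IDs created in the image test above; the API key and file path are placeholders.

```python
# Minimal sketch: Responses API with a reusable prompt, via the openai client.
# Assumptions: Llama Stack on localhost:8321 (no real API key needed) and the
# prompt ID created in the test plan above.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8321/v1", api_key="none")

# Upload the image once; its file_id is then referenced from a prompt variable.
with open("iphone.jpeg", "rb") as f:
    uploaded = client.files.create(file=f, purpose="assistants")

response = client.responses.create(
    model="openai/gpt-4o",
    input="Please analyze this product",
    prompt={
        "id": "pmpt_7be2208cb82cdbc35356354dae1f335d1e9b7baeca21ea62",
        "version": "1",
        "variables": {
            "product_name": {"type": "input_text", "text": "iPhone 17 Pro Max"},
            "product_photo": {
                "type": "input_image",
                "file_id": uploaded.id,
                "detail": "high",
            },
        },
    },
)
print(response.output[0].content[0].text)
```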
Commit 0757d5a917 (parent 8852666982)
10 changed files with 770 additions and 17 deletions
```diff
@@ -27,8 +27,10 @@ async def get_provider_impl(
         deps[Api.tool_runtime],
         deps[Api.tool_groups],
         deps[Api.conversations],
-        policy,
+        deps[Api.prompts],
+        deps[Api.files],
         telemetry_enabled,
+        policy,
     )
     await impl.initialize()
     return impl
```
```diff
@@ -12,6 +12,7 @@ from llama_stack.providers.utils.responses.responses_store import ResponsesStore
 from llama_stack_api import (
     Agents,
     Conversations,
+    Files,
     Inference,
     ListOpenAIResponseInputItem,
     ListOpenAIResponseObject,
@@ -22,6 +23,7 @@ from llama_stack_api import (
     OpenAIResponsePrompt,
     OpenAIResponseText,
     Order,
+    Prompts,
     ResponseGuardrail,
     Safety,
     ToolGroups,
@@ -45,6 +47,8 @@ class MetaReferenceAgentsImpl(Agents):
         tool_runtime_api: ToolRuntime,
         tool_groups_api: ToolGroups,
         conversations_api: Conversations,
+        prompts_api: Prompts,
+        files_api: Files,
         policy: list[AccessRule],
         telemetry_enabled: bool = False,
     ):
@@ -56,7 +60,8 @@ class MetaReferenceAgentsImpl(Agents):
         self.tool_groups_api = tool_groups_api
         self.conversations_api = conversations_api
         self.telemetry_enabled = telemetry_enabled
-
+        self.prompts_api = prompts_api
+        self.files_api = files_api
         self.in_memory_store = InmemoryKVStoreImpl()
         self.openai_responses_impl: OpenAIResponsesImpl | None = None
         self.policy = policy
@@ -73,6 +78,8 @@ class MetaReferenceAgentsImpl(Agents):
             vector_io_api=self.vector_io_api,
             safety_api=self.safety_api,
             conversations_api=self.conversations_api,
+            prompts_api=self.prompts_api,
+            files_api=self.files_api,
         )

     async def shutdown(self) -> None:
```
```diff
@@ -4,6 +4,7 @@
 # This source code is licensed under the terms described in the LICENSE file in
 # the root directory of this source tree.

+import re
 import time
 import uuid
 from collections.abc import AsyncIterator
@@ -18,13 +19,17 @@ from llama_stack.providers.utils.responses.responses_store import (
 from llama_stack_api import (
     ConversationItem,
     Conversations,
+    Files,
     Inference,
     InvalidConversationIdError,
     ListOpenAIResponseInputItem,
     ListOpenAIResponseObject,
+    OpenAIChatCompletionContentPartParam,
     OpenAIDeleteResponseObject,
     OpenAIMessageParam,
     OpenAIResponseInput,
+    OpenAIResponseInputMessageContentFile,
+    OpenAIResponseInputMessageContentImage,
     OpenAIResponseInputMessageContentText,
     OpenAIResponseInputTool,
     OpenAIResponseMessage,
@@ -34,7 +39,9 @@ from llama_stack_api import (
     OpenAIResponseText,
     OpenAIResponseTextFormat,
     OpenAISystemMessageParam,
+    OpenAIUserMessageParam,
     Order,
+    Prompts,
     ResponseGuardrailSpec,
     Safety,
     ToolGroups,
@@ -46,6 +53,7 @@ from .streaming import StreamingResponseOrchestrator
 from .tool_executor import ToolExecutor
 from .types import ChatCompletionContext, ToolContext
 from .utils import (
+    convert_response_content_to_chat_content,
     convert_response_input_to_chat_messages,
     convert_response_text_to_chat_response_format,
     extract_guardrail_ids,
@@ -69,6 +77,8 @@ class OpenAIResponsesImpl:
         vector_io_api: VectorIO,  # VectorIO
         safety_api: Safety | None,
         conversations_api: Conversations,
+        prompts_api: Prompts,
+        files_api: Files,
     ):
         self.inference_api = inference_api
         self.tool_groups_api = tool_groups_api
@@ -82,6 +92,8 @@ class OpenAIResponsesImpl:
             tool_runtime_api=tool_runtime_api,
             vector_io_api=vector_io_api,
         )
+        self.prompts_api = prompts_api
+        self.files_api = files_api

     async def _prepend_previous_response(
         self,
@@ -122,11 +134,13 @@ class OpenAIResponsesImpl:
                 # Use stored messages directly and convert only new input
                 message_adapter = TypeAdapter(list[OpenAIMessageParam])
                 messages = message_adapter.validate_python(previous_response.messages)
-                new_messages = await convert_response_input_to_chat_messages(input, previous_messages=messages)
+                new_messages = await convert_response_input_to_chat_messages(
+                    input, previous_messages=messages, files_api=self.files_api
+                )
                 messages.extend(new_messages)
             else:
                 # Backward compatibility: reconstruct from inputs
-                messages = await convert_response_input_to_chat_messages(all_input)
+                messages = await convert_response_input_to_chat_messages(all_input, files_api=self.files_api)

             tool_context.recover_tools_from_previous_response(previous_response)
         elif conversation is not None:
@@ -138,7 +152,7 @@ class OpenAIResponsesImpl:
             all_input = input
             if not conversation_items.data:
                 # First turn - just convert the new input
-                messages = await convert_response_input_to_chat_messages(input)
+                messages = await convert_response_input_to_chat_messages(input, files_api=self.files_api)
             else:
                 if not stored_messages:
                     all_input = conversation_items.data
```
```diff
@@ -154,14 +168,82 @@ class OpenAIResponsesImpl:
                     all_input = input

                 messages = stored_messages or []
-                new_messages = await convert_response_input_to_chat_messages(all_input, previous_messages=messages)
+                new_messages = await convert_response_input_to_chat_messages(
+                    all_input, previous_messages=messages, files_api=self.files_api
+                )
                 messages.extend(new_messages)
         else:
             all_input = input
-            messages = await convert_response_input_to_chat_messages(all_input)
+            messages = await convert_response_input_to_chat_messages(all_input, files_api=self.files_api)

         return all_input, messages, tool_context

+    async def _prepend_prompt(
+        self,
+        messages: list[OpenAIMessageParam],
+        openai_response_prompt: OpenAIResponsePrompt | None,
+    ) -> None:
+        """Prepend prompt template to messages, resolving text/image/file variables.
+
+        :param messages: List of OpenAIMessageParam objects
+        :param openai_response_prompt: (Optional) OpenAIResponsePrompt object with variables
+        :returns: None; messages is modified in place
+        """
+        if not openai_response_prompt or not openai_response_prompt.id:
+            return
+
+        prompt_version = int(openai_response_prompt.version) if openai_response_prompt.version else None
+        cur_prompt = await self.prompts_api.get_prompt(openai_response_prompt.id, prompt_version)
+
+        if not cur_prompt or not cur_prompt.prompt:
+            return
+
+        cur_prompt_text = cur_prompt.prompt
+        cur_prompt_variables = cur_prompt.variables
+
+        if not openai_response_prompt.variables:
+            messages.insert(0, OpenAISystemMessageParam(content=cur_prompt_text))
+            return
+
+        # Validate that all provided variables exist in the prompt
+        for name in openai_response_prompt.variables.keys():
+            if name not in cur_prompt_variables:
+                raise ValueError(f"Variable {name} not found in prompt {openai_response_prompt.id}")
+
+        # Separate text and media variables
+        text_substitutions = {}
+        media_content_parts: list[OpenAIChatCompletionContentPartParam] = []
+
+        for name, value in openai_response_prompt.variables.items():
+            # Text variable found
+            if isinstance(value, OpenAIResponseInputMessageContentText):
+                text_substitutions[name] = value.text
+
+            # Media variable found
+            elif isinstance(value, OpenAIResponseInputMessageContentImage | OpenAIResponseInputMessageContentFile):
+                converted_parts = await convert_response_content_to_chat_content([value], files_api=self.files_api)
+                if isinstance(converted_parts, list):
+                    media_content_parts.extend(converted_parts)
+
+                # Eg: {{product_photo}} becomes "[Image: product_photo]"
+                # This gives the model textual context about what media exists in the prompt
+                var_type = value.type.replace("input_", "").replace("_", " ").title()
+                text_substitutions[name] = f"[{var_type}: {name}]"
+
+        def replace_variable(match: re.Match[str]) -> str:
+            var_name = match.group(1).strip()
+            return str(text_substitutions.get(var_name, match.group(0)))
+
+        pattern = r"\{\{\s*(\w+)\s*\}\}"
+        processed_prompt_text = re.sub(pattern, replace_variable, cur_prompt_text)
+
+        # Insert system message with resolved text
+        messages.insert(0, OpenAISystemMessageParam(content=processed_prompt_text))
+
+        # If we have media, append a new user message, since images and files are ingested as user content
+        if media_content_parts:
+            messages.append(OpenAIUserMessageParam(content=media_content_parts))
+
     async def get_openai_response(
         self,
         response_id: str,
```
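To make the substitution behavior above concrete: text variables are inlined into the template, while media variables are replaced with a textual placeholder such as `[Image: product_photo]` and the media itself travels separately as a user-message content part. Here is a small standalone sketch of the same regex-based resolution; the template and values are illustrative, not taken from the repo:

```python
import re

# Illustrative template and resolved variables (mirroring the test plan above).
template = "Analyze this product:\n\nProduct Name: {{product_name}}\nImage: {{product_photo}}"
text_substitutions = {
    "product_name": "iPhone 17 Pro Max",
    "product_photo": "[Image: product_photo]",  # media variables become placeholders
}

def replace_variable(match: re.Match[str]) -> str:
    var_name = match.group(1).strip()
    # Unknown variables are left untouched, exactly like the implementation above.
    return str(text_substitutions.get(var_name, match.group(0)))

pattern = r"\{\{\s*(\w+)\s*\}\}"
print(re.sub(pattern, replace_variable, template))
# Analyze this product:
#
# Product Name: iPhone 17 Pro Max
# Image: [Image: product_photo]
```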
```diff
@@ -297,6 +379,7 @@ class OpenAIResponsesImpl:
             input=input,
             conversation=conversation,
             model=model,
+            prompt=prompt,
             instructions=instructions,
             previous_response_id=previous_response_id,
             store=store,
@@ -350,6 +433,7 @@ class OpenAIResponsesImpl:
         instructions: str | None = None,
         previous_response_id: str | None = None,
         conversation: str | None = None,
+        prompt: OpenAIResponsePrompt | None = None,
         store: bool | None = True,
         temperature: float | None = None,
         text: OpenAIResponseText | None = None,
@@ -372,6 +456,9 @@ class OpenAIResponsesImpl:
         if instructions:
             messages.insert(0, OpenAISystemMessageParam(content=instructions))

+        # Prepend reusable prompt (if provided)
+        await self._prepend_prompt(messages, prompt)
+
         # Structured outputs
         response_format = await convert_response_text_to_chat_response_format(text)

@@ -394,6 +481,7 @@ class OpenAIResponsesImpl:
             ctx=ctx,
             response_id=response_id,
             created_at=created_at,
+            prompt=prompt,
             text=text,
             max_infer_iters=max_infer_iters,
             parallel_tool_calls=parallel_tool_calls,
```
```diff
@@ -5,11 +5,14 @@
 # the root directory of this source tree.

 import asyncio
+import base64
+import mimetypes
 import re
 import uuid
 from collections.abc import Sequence

 from llama_stack_api import (
+    Files,
     OpenAIAssistantMessageParam,
     OpenAIChatCompletionContentPartImageParam,
     OpenAIChatCompletionContentPartParam,
@@ -18,6 +21,8 @@ from llama_stack_api import (
     OpenAIChatCompletionToolCallFunction,
     OpenAIChoice,
     OpenAIDeveloperMessageParam,
+    OpenAIFile,
+    OpenAIFileFile,
     OpenAIImageURL,
     OpenAIJSONSchema,
     OpenAIMessageParam,
@@ -29,6 +34,7 @@ from llama_stack_api import (
     OpenAIResponseInput,
     OpenAIResponseInputFunctionToolCallOutput,
     OpenAIResponseInputMessageContent,
+    OpenAIResponseInputMessageContentFile,
     OpenAIResponseInputMessageContentImage,
     OpenAIResponseInputMessageContentText,
     OpenAIResponseInputTool,
@@ -37,9 +43,11 @@ from llama_stack_api import (
     OpenAIResponseMessage,
     OpenAIResponseOutputMessageContent,
     OpenAIResponseOutputMessageContentOutputText,
+    OpenAIResponseOutputMessageFileSearchToolCall,
     OpenAIResponseOutputMessageFunctionToolCall,
     OpenAIResponseOutputMessageMCPCall,
     OpenAIResponseOutputMessageMCPListTools,
+    OpenAIResponseOutputMessageWebSearchToolCall,
     OpenAIResponseText,
     OpenAISystemMessageParam,
     OpenAIToolMessageParam,
@@ -49,6 +57,46 @@ from llama_stack_api import (
 )


+async def extract_bytes_from_file(file_id: str, files_api: Files) -> bytes:
+    """
+    Extract raw bytes from a file using the Files API.
+
+    :param file_id: The file identifier (e.g., "file-abc123")
+    :param files_api: Files API instance
+    :returns: Raw file content as bytes
+    :raises: ValueError if the file cannot be retrieved
+    """
+    try:
+        response = await files_api.openai_retrieve_file_content(file_id)
+        return bytes(response.body)
+    except Exception as e:
+        raise ValueError(f"Failed to retrieve file content for file_id '{file_id}': {str(e)}") from e
+
+
+def generate_base64_ascii_text_from_bytes(raw_bytes: bytes) -> str:
+    """
+    Convert raw binary bytes into a safe ASCII text representation for URLs.
+
+    :param raw_bytes: the actual bytes representing the file content
+    :returns: base64-encoded ASCII string
+    """
+    return base64.b64encode(raw_bytes).decode("utf-8")
+
+
+def construct_data_url(ascii_text: str, mime_type: str | None) -> str:
+    """
+    Construct a data URL with the encoded data inline.
+
+    :param ascii_text: base64-encoded ASCII content
+    :param mime_type: MIME type of the file
+    :returns: data URL string (e.g. data:image/png;base64,iVBORw0KGgo...)
+    """
+    if not mime_type:
+        mime_type = "application/octet-stream"
+
+    return f"data:{mime_type};base64,{ascii_text}"
+
+
 async def convert_chat_choice_to_response_message(
     choice: OpenAIChoice,
     citation_files: dict[str, str] | None = None,
```
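Taken together, these three helpers form a small pipeline: fetch raw bytes via the Files API, base64-encode them, and wrap the result in a data URL. A self-contained sketch of the encoding half, using only the standard library (the sample content and filename are illustrative):

```python
import base64
import mimetypes

def construct_data_url(ascii_text: str, mime_type: str | None) -> str:
    # Fall back to a generic binary MIME type, as the helper above does.
    return f"data:{mime_type or 'application/octet-stream'};base64,{ascii_text}"

raw_bytes = b"<h1>Hello, World!</h1>"  # stands in for real file content
ascii_text = base64.b64encode(raw_bytes).decode("utf-8")

mime_type, _ = mimetypes.guess_type("hello.html")
print(construct_data_url(ascii_text, mime_type))
# data:text/html;base64,PGgxPkhlbGxvLCBXb3JsZCE8L2gxPg==
```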
```diff
@@ -78,11 +126,15 @@ async def convert_chat_choice_to_response_message(

 async def convert_response_content_to_chat_content(
     content: str | Sequence[OpenAIResponseInputMessageContent | OpenAIResponseOutputMessageContent],
+    files_api: Files | None,
 ) -> str | list[OpenAIChatCompletionContentPartParam]:
     """
     Convert the content parts from an OpenAI Response API request into OpenAI Chat Completion content parts.

     The content schemas of each API look similar, but are not exactly the same.
+
+    :param content: The content to convert
+    :param files_api: Files API for resolving file_id to raw file content (required if content contains files/images)
     """
     if isinstance(content, str):
         return content
```
```diff
@@ -95,9 +147,68 @@ async def convert_response_content_to_chat_content(
         elif isinstance(content_part, OpenAIResponseOutputMessageContentOutputText):
             converted_parts.append(OpenAIChatCompletionContentPartTextParam(text=content_part.text))
         elif isinstance(content_part, OpenAIResponseInputMessageContentImage):
+            detail = content_part.detail
+            image_mime_type = None
             if content_part.image_url:
-                image_url = OpenAIImageURL(url=content_part.image_url, detail=content_part.detail)
+                image_url = OpenAIImageURL(url=content_part.image_url, detail=detail)
                 converted_parts.append(OpenAIChatCompletionContentPartImageParam(image_url=image_url))
+            elif content_part.file_id:
+                if files_api is None:
+                    raise ValueError("file_ids are not supported by this implementation of the Stack")
+                image_file_response = await files_api.openai_retrieve_file(content_part.file_id)
+                if image_file_response.filename:
+                    image_mime_type, _ = mimetypes.guess_type(image_file_response.filename)
+                raw_image_bytes = await extract_bytes_from_file(content_part.file_id, files_api)
+                ascii_text = generate_base64_ascii_text_from_bytes(raw_image_bytes)
+                image_data_url = construct_data_url(ascii_text, image_mime_type)
+                image_url = OpenAIImageURL(url=image_data_url, detail=detail)
+                converted_parts.append(OpenAIChatCompletionContentPartImageParam(image_url=image_url))
+            else:
+                raise ValueError(
+                    f"Image content must have either 'image_url' or 'file_id'. "
+                    f"Got image_url={content_part.image_url}, file_id={content_part.file_id}"
+                )
+        elif isinstance(content_part, OpenAIResponseInputMessageContentFile):
+            resolved_file_data = None
+            file_data = content_part.file_data
+            file_id = content_part.file_id
+            file_url = content_part.file_url
+            filename = content_part.filename
+            file_mime_type = None
+            if not any([file_data, file_id, file_url]):
+                raise ValueError(
+                    f"File content must have at least one of 'file_data', 'file_id', or 'file_url'. "
+                    f"Got file_data={file_data}, file_id={file_id}, file_url={file_url}"
+                )
+            if file_id:
+                if files_api is None:
+                    raise ValueError("file_ids are not supported by this implementation of the Stack")
+
+                file_response = await files_api.openai_retrieve_file(file_id)
+                if not filename:
+                    filename = file_response.filename
+                file_mime_type, _ = mimetypes.guess_type(file_response.filename)
+                raw_file_bytes = await extract_bytes_from_file(file_id, files_api)
+                ascii_text = generate_base64_ascii_text_from_bytes(raw_file_bytes)
+                resolved_file_data = construct_data_url(ascii_text, file_mime_type)
+            elif file_data:
+                if file_data.startswith("data:"):
+                    resolved_file_data = file_data
+                else:
+                    # Raw base64 data, wrap in data URL format
+                    if filename:
+                        file_mime_type, _ = mimetypes.guess_type(filename)
+                    resolved_file_data = construct_data_url(file_data, file_mime_type)
+            elif file_url:
+                resolved_file_data = file_url
+            converted_parts.append(
+                OpenAIFile(
+                    file=OpenAIFileFile(
+                        file_data=resolved_file_data,
+                        filename=filename,
+                    )
+                )
+            )
         elif isinstance(content_part, str):
             converted_parts.append(OpenAIChatCompletionContentPartTextParam(text=content_part))
         else:
```
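Put together, a file referenced by `file_id` ends up as a chat-completion `file` content part whose `file_data` is a data URL. A minimal sketch of that shape, using stand-in dataclasses for illustration (the real `OpenAIFile`/`OpenAIFileFile` models come from `llama_stack_api` and are Pydantic models, not these stubs):

```python
import base64
from dataclasses import dataclass

# Stand-ins for the llama_stack_api models used above (illustrative only).
@dataclass
class OpenAIFileFile:
    file_data: str | None = None
    filename: str | None = None

@dataclass
class OpenAIFile:
    file: OpenAIFileFile

def file_part_from_bytes(raw_bytes: bytes, filename: str, mime_type: str) -> OpenAIFile:
    # Same shape the converter above produces for a file_id-backed part.
    ascii_text = base64.b64encode(raw_bytes).decode("utf-8")
    data_url = f"data:{mime_type};base64,{ascii_text}"
    return OpenAIFile(file=OpenAIFileFile(file_data=data_url, filename=filename))

part = file_part_from_bytes(b"%PDF-1.4 ...", "invoicesample.pdf", "application/pdf")
print(part.file.filename, part.file.file_data[:40])
```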
```diff
@@ -110,12 +221,14 @@ async def convert_response_content_to_chat_content(
 async def convert_response_input_to_chat_messages(
     input: str | list[OpenAIResponseInput],
     previous_messages: list[OpenAIMessageParam] | None = None,
+    files_api: Files | None = None,
 ) -> list[OpenAIMessageParam]:
     """
     Convert the input from an OpenAI Response API request into OpenAI Chat Completion messages.

     :param input: The input to convert
     :param previous_messages: Optional previous messages to check for function_call references
+    :param files_api: Files API for resolving file_id to raw file content (optional, required for file/image content)
     """
     messages: list[OpenAIMessageParam] = []
     if isinstance(input, list):
```
```diff
@@ -169,6 +282,12 @@ async def convert_response_input_to_chat_messages(
             elif isinstance(input_item, OpenAIResponseOutputMessageMCPListTools):
                 # the tool list will be handled separately
                 pass
+            elif isinstance(
+                input_item,
+                OpenAIResponseOutputMessageWebSearchToolCall | OpenAIResponseOutputMessageFileSearchToolCall,
+            ):
+                # these tool calls are tracked internally but not converted to chat messages
+                pass
             elif isinstance(input_item, OpenAIResponseMCPApprovalRequest) or isinstance(
                 input_item, OpenAIResponseMCPApprovalResponse
             ):
```
```diff
@@ -176,7 +295,7 @@ async def convert_response_input_to_chat_messages(
                 pass
             elif isinstance(input_item, OpenAIResponseMessage):
                 # Narrow type to OpenAIResponseMessage which has content and role attributes
-                content = await convert_response_content_to_chat_content(input_item.content)
+                content = await convert_response_content_to_chat_content(input_item.content, files_api)
                 message_type = await get_message_type_by_role(input_item.role)
                 if message_type is None:
                     raise ValueError(
```
```diff
@@ -34,6 +34,8 @@ def available_providers() -> list[ProviderSpec]:
                 Api.tool_runtime,
                 Api.tool_groups,
                 Api.conversations,
+                Api.prompts,
+                Api.files,
             ],
             optional_api_dependencies=[
                 Api.safety,
```