llama-stack-mirror/llama_stack/providers/inline
ehhuang ac7c35fbe6
fix: don't pass default response format in Responses (#3614)
# What does this PR do?
Fireworks doesn't allow repsonse_format with tool use. The default
response format is 'text' anyway, so we can safely omit.


## Test Plan
Below script failed without the change, runs after.

```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.

This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""

import json

from openai import OpenAI

# Connect to the llama stack server
base_url = "http://localhost:8321/v1"
client = OpenAI(base_url=base_url, api_key="fake")

# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"

tools = [
    {
        "type": "mcp",
        "server_label": "k8s",
        "server_url": mcp_server_url,
    }
]

# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()

response = client.responses.create(
    # model="meta-llama/Llama-3.2-3B-Instruct",  # Using the vllm model
    model="fireworks/accounts/fireworks/models/llama4-scout-instruct-basic",
    # model="openai/gpt-4o",
    input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format UNDER ALL CIRCUMSTANCES.",
    tools=tools,
    stream=False,
)

print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)

# Print the output
for i, output in enumerate(response.output):
    print(f"\n[Output {i + 1}] Type: {output.type}")
    if output.type == "mcp_list_tools":
        print(f"  Server: {output.server_label}")
        print(f"  Tools available: {[t.name for t in output.tools]}")
    elif output.type == "mcp_call":
        print(f"  Tool called: {output.name}")
        print(f"  Arguments: {output.arguments}")
        print(f"  Result: {output.output}")
        if output.error:
            print(f"  Error: {output.error}")
    elif output.type == "message":
        print(f"  Role: {output.role}")
        print(f"  Content: {output.content}")

print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
2025-09-30 14:52:24 -07:00
..
agents fix: don't pass default response format in Responses (#3614) 2025-09-30 14:52:24 -07:00
batches feat(batches, completions): add /v1/completions support to /v1/batches (#3309) 2025-09-05 11:59:57 -07:00
datasetio chore(misc): make tests and starter faster (#3042) 2025-08-05 14:55:05 -07:00
eval feat: update eval runner to use openai endpoints (#3588) 2025-09-29 13:13:53 -07:00
files/localfs fix(expires_after): make sure multipart/form-data is properly parsed (#3612) 2025-09-30 16:14:03 -04:00
inference chore(api): remove batch inference (#3261) 2025-09-26 14:35:34 -07:00
ios/inference chore: removed executorch submodule (#1265) 2025-02-25 21:57:21 -08:00
post_training chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061) 2025-08-20 07:15:35 -04:00
safety feat: use /v1/chat/completions for safety model inference (#3591) 2025-09-30 11:01:44 -07:00
scoring feat: create HTTP DELETE API endpoints to unregister ScoringFn and Benchmark resources in Llama Stack (#3371) 2025-09-15 12:43:38 -07:00
telemetry chore: remove extra logging (#3574) 2025-09-27 11:22:54 -07:00
tool_runtime chore: Updating documentation, adding exception handling for Vector Stores in RAG Tool, more tests on migration, and migrate off of inference_api for context_retriever for RAG (#3367) 2025-09-11 14:20:11 +02:00
vector_io refactor: use generic WeightedInMemoryAggregator for hybrid search in SQLiteVecIndex (#3303) 2025-09-02 10:38:35 -07:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00