llama-stack-mirror/tests/unit/providers
ehhuang ac7c35fbe6
fix: don't pass default response format in Responses (#3614)
# What does this PR do?
Fireworks doesn't allow repsonse_format with tool use. The default
response format is 'text' anyway, so we can safely omit.


## Test Plan
Below script failed without the change, runs after.

```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.

This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""

import json

from openai import OpenAI

# Connect to the llama stack server
base_url = "http://localhost:8321/v1"
client = OpenAI(base_url=base_url, api_key="fake")

# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"

tools = [
    {
        "type": "mcp",
        "server_label": "k8s",
        "server_url": mcp_server_url,
    }
]

# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()

response = client.responses.create(
    # model="meta-llama/Llama-3.2-3B-Instruct",  # Using the vllm model
    model="fireworks/accounts/fireworks/models/llama4-scout-instruct-basic",
    # model="openai/gpt-4o",
    input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format UNDER ALL CIRCUMSTANCES.",
    tools=tools,
    stream=False,
)

print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)

# Print the output
for i, output in enumerate(response.output):
    print(f"\n[Output {i + 1}] Type: {output.type}")
    if output.type == "mcp_list_tools":
        print(f"  Server: {output.server_label}")
        print(f"  Tools available: {[t.name for t in output.tools]}")
    elif output.type == "mcp_call":
        print(f"  Tool called: {output.name}")
        print(f"  Arguments: {output.arguments}")
        print(f"  Result: {output.output}")
        if output.error:
            print(f"  Error: {output.error}")
    elif output.type == "message":
        print(f"  Role: {output.role}")
        print(f"  Content: {output.content}")

print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
2025-09-30 14:52:24 -07:00
..
agent fix: adding mime type of application/json support (#3452) 2025-09-29 11:27:31 -07:00
agents fix: don't pass default response format in Responses (#3614) 2025-09-30 14:52:24 -07:00
batches feat(batches, completions): add /v1/completions support to /v1/batches (#3309) 2025-09-05 11:59:57 -07:00
files feat(files): fix expires_after API shape (#3604) 2025-09-29 21:29:15 -07:00
inference feat: add static embedding metadata to dynamic model listings for providers using OpenAIMixin (#3547) 2025-09-25 17:17:00 -04:00
inline fix: mcp tool with array type should include items (#3602) 2025-09-29 23:11:41 -07:00
nvidia feat: add static embedding metadata to dynamic model listings for providers using OpenAIMixin (#3547) 2025-09-25 17:17:00 -04:00
utils fix(expires_after): make sure multipart/form-data is properly parsed (#3612) 2025-09-30 16:14:03 -04:00
vector_io chore(api): remove deprecated embeddings impls (#3301) 2025-09-29 14:45:09 -04:00
test_bedrock.py fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386) 2025-09-11 11:41:53 +02:00
test_configs.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00