This is a sweeping change to clean up some gunk around our "Tool"
definitions.
First, we had two types `Tool` and `ToolDef`. The first of these was a
"Resource" type for the registry but we had stopped registering tools
inside the Registry long back (and only registered ToolGroups.) The
latter was for specifying tools for the Agents API. This PR removes the
former and adds an optional `toolgroup_id` field to the latter.
Secondly, as pointed out by @bbrowning in
https://github.com/llamastack/llama-stack/pull/3003#issuecomment-3245270132,
we were doing a lossy conversion from a full JSON schema from the MCP
tool specification into our ToolDefinition to send it to the model.
There is no necessity to do this -- we ourselves aren't doing any
execution at all but merely passing it to the chat completions API which
supports this. By doing this (and by doing it poorly), we encountered
limitations like not supporting array items, or not resolving $refs,
etc.
To fix this, we replaced the `parameters` field by `{ input_schema,
output_schema }` which can be full blown JSON schemas.
Finally, there were some types in our llama-related chat format
conversion which needed some cleanup. We are taking this opportunity to
clean those up.
This PR is a substantial breaking change to the API. However, given our
window for introducing breaking changes, this suits us just fine. I will
be landing a concurrent `llama-stack-client` change as well since API
shapes are changing.
# What does this PR do?
Fixes error:
```
[ERROR] Error executing endpoint route='/v1/openai/v1/responses'
method='post': Error code: 400 - {'error': {'message': "Invalid schema for function 'pods_exec': In context=('properties', 'command'), array
schema missing items.", 'type': 'invalid_request_error', 'param': 'tools[7].function.parameters', 'code': 'invalid_function_parameters'}}
```
From script:
```
#!/usr/bin/env python3
"""
Script to test Responses API with kubernetes-mcp-server.
This script:
1. Connects to the llama stack server
2. Uses the Responses API with MCP tools
3. Asks for the list of Kubernetes namespaces using the kubernetes-mcp-server
"""
import json
from openai import OpenAI
# Connect to the llama stack server
base_url = "http://localhost:8321/v1/openai/v1"
client = OpenAI(base_url=base_url, api_key="fake")
# Define the MCP tool pointing to the kubernetes-mcp-server
# The kubernetes-mcp-server is running on port 3000 with SSE endpoint at /sse
mcp_server_url = "http://localhost:3000/sse"
tools = [
{
"type": "mcp",
"server_label": "k8s",
"server_url": mcp_server_url,
}
]
# Create a response request asking for k8s namespaces
print("Sending request to list Kubernetes namespaces...")
print(f"Using MCP server at: {mcp_server_url}")
print("Available tools will be listed automatically by the MCP server.")
print()
response = client.responses.create(
# model="meta-llama/Llama-3.2-3B-Instruct", # Using the vllm model
model="openai/gpt-4o",
input="what are all the Kubernetes namespaces? Use tool call to `namespaces_list`. make sure to adhere to the tool calling format.",
tools=tools,
stream=False,
)
print("\n" + "=" * 80)
print("RESPONSE OUTPUT:")
print("=" * 80)
# Print the output
for i, output in enumerate(response.output):
print(f"\n[Output {i + 1}] Type: {output.type}")
if output.type == "mcp_list_tools":
print(f" Server: {output.server_label}")
print(f" Tools available: {[t.name for t in output.tools]}")
elif output.type == "mcp_call":
print(f" Tool called: {output.name}")
print(f" Arguments: {output.arguments}")
print(f" Result: {output.output}")
if output.error:
print(f" Error: {output.error}")
elif output.type == "message":
print(f" Role: {output.role}")
print(f" Content: {output.content}")
print("\n" + "=" * 80)
print("FINAL RESPONSE TEXT:")
print("=" * 80)
print(response.output_text)
```
## Test Plan
new unit tests
script now runs successfully