llama-stack/docs/_static/llama-stack-spec.yaml
Francisco Arceo 8e7ab146f8
feat: Adding support for customizing chunk context in RAG insertion and querying (#2134)
# What does this PR do?
This PR allows users to customize the template used for chunks when
they are inserted into the context. Additionally, it enables metadata
injection into the context of an LLM for RAG. It makes the naive
assumption that each chunk should include the metadata, which is
redundant when multiple chunks are returned from the same document.
Removing that duplication would require much more significant changes,
so this is a reasonable first step that unblocks users requesting this
enhancement in
https://github.com/meta-llama/llama-stack/issues/1767.

In the future, this can be extended to support citations.


List of Changes:
- `llama_stack/apis/tools/rag_tool.py`
    - Added `chunk_template` field in `RAGQueryConfig`.
    - Added `field_validator` to validate the `chunk_template` field in
      `RAGQueryConfig`.
    - Ensured the `chunk_template` field includes placeholders `{index}` and
      `{chunk.content}`.
    - Updated the `query` method to use the `chunk_template` for formatting
      chunk text content.
- `llama_stack/providers/inline/tool_runtime/rag/memory.py`
    - Modified the `insert` method to pass `doc.metadata` for chunk
      creation.
    - Enhanced the `query` method to format results using `chunk_template`
      and exclude unnecessary metadata fields like `token_count`.
- `llama_stack/providers/utils/memory/vector_store.py`
    - Updated `make_overlapped_chunks` to include metadata serialization and
      token count for both content and metadata.
    - Added error handling for metadata serialization issues.
- `pyproject.toml`
    - Added `pydantic.field_validator` as a recognized `classmethod`
      decorator in the linting configuration.
- `tests/integration/tool_runtime/test_rag_tool.py`
    - Refactored test assertions to separate `assert_valid_chunk_response`
      and `assert_valid_text_response`.
    - Added integration tests to validate `chunk_template` functionality
      with and without metadata inclusion.
    - Included a test case to ensure `chunk_template` validation errors are
      raised appropriately.
- `tests/unit/rag/test_vector_store.py`
    - Added unit tests for `make_overlapped_chunks`, verifying chunk
      creation with overlapping tokens and metadata integrity.
    - Added tests to handle metadata serialization errors, ensuring proper
      exception handling.
- `docs/_static/llama-stack-spec.html`
    - Added a new `chunk_template` field of type `string` with a default
      template for formatting retrieved chunks in RAGQueryConfig.
    - Updated the `required` fields to include `chunk_template`.
- `docs/_static/llama-stack-spec.yaml`
    - Introduced `chunk_template` field with a default value for
      RAGQueryConfig.
    - Updated the required configuration list to include `chunk_template`.
- `docs/source/building_applications/rag.md`
    - Documented the `chunk_template` configuration, explaining how to
      customize metadata formatting in RAG queries.
    - Added examples demonstrating the usage of the `chunk_template` field
      in RAG tool queries.
    - Highlighted default values for `RAG` agent configurations.
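
To make the `chunk_template` changes listed above concrete, here is a minimal sketch of how such a template could be validated and rendered. It is not the code from this PR: the `Chunk` dataclass, the helper names, the `{metadata}` placeholder, the example template string, and the 1-based index are all assumptions made for illustration. Only the two required placeholders, `{index}` and `{chunk.content}`, come from the description above.

```python
from dataclasses import dataclass, field

# Placeholders the PR description says every template must contain.
REQUIRED_PLACEHOLDERS = ("{index}", "{chunk.content}")


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved chunk (not the real class)."""

    content: str
    metadata: dict = field(default_factory=dict)


def validate_chunk_template(template: str) -> str:
    """Reject templates that omit a required placeholder."""
    missing = [p for p in REQUIRED_PLACEHOLDERS if p not in template]
    if missing:
        raise ValueError(f"chunk_template is missing placeholders: {missing}")
    return template


def render_chunks(chunks, template: str) -> str:
    """Format each chunk with the template and concatenate the results."""
    validate_chunk_template(template)
    return "".join(
        # Assumption: indices start at 1 and metadata is exposed as {metadata}.
        template.format(index=i, chunk=chunk, metadata=chunk.metadata)
        for i, chunk in enumerate(chunks, start=1)
    )


# Illustrative template; the real default value is not shown in this excerpt.
template = "Result {index}\nContent: {chunk.content}\nMetadata: {metadata}\n"
chunks = [Chunk("Do great work.", {"author": "Paul Graham"})]
rendered = render_chunks(chunks, template)
print(rendered)
```

Because `str.format` resolves attribute access, `{chunk.content}` works without any custom parsing; a template that drops either required placeholder fails fast with a `ValueError` before any chunk is rendered.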

# Resolves https://github.com/meta-llama/llama-stack/issues/1767

## Test Plan
Updated both `test_vector_store.py` and `test_rag_tool.py` and tested
end-to-end with a script.
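
The behaviour those unit tests target (overlapping windows, per-chunk metadata, and a clear error when metadata cannot be serialized) can be sketched roughly as follows. This is an illustration under stated assumptions, not the `make_overlapped_chunks` implementation: it splits on whitespace instead of using a model tokenizer, and the function name, the returned dict layout, and the `metadata_token_count` key are hypothetical.

```python
import json


def make_overlapped_chunks_sketch(document_id, text, window_len, overlap_len, metadata):
    """Split text into overlapping windows and attach metadata to each chunk."""
    if overlap_len >= window_len:
        raise ValueError("overlap_len must be smaller than window_len")

    # Serialize metadata once; surface serialization problems as a ValueError.
    try:
        metadata_string = json.dumps(metadata)
    except (TypeError, ValueError) as exc:
        raise ValueError("Failed to serialize metadata to string") from exc

    # Assumption: whitespace tokens stand in for real tokenizer output.
    tokens = text.split()
    metadata_tokens = metadata_string.split()

    step = window_len - overlap_len
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start : start + window_len]
        chunk_metadata = dict(metadata)
        chunk_metadata["document_id"] = document_id
        chunk_metadata["token_count"] = len(window)
        chunk_metadata["metadata_token_count"] = len(metadata_tokens)
        chunks.append({"content": " ".join(window), "metadata": chunk_metadata})
    return chunks


chunks = make_overlapped_chunks_sketch(
    "document_1",
    "a b c d e f",
    window_len=4,
    overlap_len=2,
    metadata={"author": "Paul Graham"},
)
for chunk in chunks:
    print(chunk["content"], chunk["metadata"]["token_count"])
```

Every chunk carries a full copy of the document metadata, which is the redundancy the description calls out for chunks that come from the same document.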

I also tested the quickstart to enable this and specified this metadata:
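```python
# Import path as used by the Llama Stack quickstart client script.
from llama_stack_client import RAGDocument

# `source` is assumed to be defined earlier in the quickstart script.
document = RAGDocument(
    document_id="document_1",
    content=source,
    mime_type="text/html",
    metadata={"author": "Paul Graham", "title": "How to do great work"},
)
```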
This produced the output below:

![Screenshot 2025-05-13 at 10 53 43 PM](https://github.com/user-attachments/assets/bb199d04-501e-4217-9c44-4699d43d5519)

This highlights the usefulness of the additional metadata. Notice how
the metadata is redundant for different chunks of the same document. I
think we can update that in a subsequent PR.

# Documentation
I've added a brief note to the documentation describing this for users
and updated the API documentation.

---------

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
2025-05-14 21:56:20 -04:00


openapi: 3.1.0
info:
title: Llama Stack Specification
version: v1
description: >-
This is the specification of the Llama Stack that provides
a set of endpoints and their corresponding interfaces that are
tailored to
best leverage Llama Models.
servers:
- url: http://any-hosted-llama-stack.com
paths:
/v1/datasetio/append-rows/{dataset_id}:
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- DatasetIO
description: ''
parameters:
- name: dataset_id
in: path
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/AppendRowsRequest'
required: true
/v1/inference/batch-chat-completion:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/BatchChatCompletionResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inference
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/BatchChatCompletionRequest'
required: true
/v1/inference/batch-completion:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/BatchCompletionResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inference
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/BatchCompletionRequest'
required: true
/v1/post-training/job/cancel:
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- PostTraining (Coming Soon)
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CancelTrainingJobRequest'
required: true
/v1/inference/chat-completion:
post:
responses:
'200':
description: >-
If stream=False, returns a ChatCompletionResponse with the full completion.
If stream=True, returns an SSE event stream of ChatCompletionResponseStreamChunk
content:
application/json:
schema:
$ref: '#/components/schemas/ChatCompletionResponse'
text/event-stream:
schema:
$ref: '#/components/schemas/ChatCompletionResponseStreamChunk'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- BatchInference (Coming Soon)
description: >-
Generate a chat completion for the given messages using the specified model.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/ChatCompletionRequest'
required: true
/v1/inference/completion:
post:
responses:
'200':
description: >-
If stream=False, returns a CompletionResponse with the full completion.
If stream=True, returns an SSE event stream of CompletionResponseStreamChunk
content:
application/json:
schema:
$ref: '#/components/schemas/CompletionResponse'
text/event-stream:
schema:
$ref: '#/components/schemas/CompletionResponseStreamChunk'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- BatchInference (Coming Soon)
description: >-
Generate a completion for the given content using the specified model.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CompletionRequest'
required: true
/v1/agents:
get:
responses:
'200':
description: A PaginatedResponse.
content:
application/json:
schema:
$ref: '#/components/schemas/PaginatedResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: List all agents.
parameters:
- name: start_index
in: query
description: The index to start the pagination from.
required: false
schema:
type: integer
- name: limit
in: query
description: The number of agents to return.
required: false
schema:
type: integer
post:
responses:
'200':
description: >-
An AgentCreateResponse with the agent ID.
content:
application/json:
schema:
$ref: '#/components/schemas/AgentCreateResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: >-
Create an agent with the given configuration.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CreateAgentRequest'
required: true
/v1/agents/{agent_id}/session:
post:
responses:
'200':
description: An AgentSessionCreateResponse.
content:
application/json:
schema:
$ref: '#/components/schemas/AgentSessionCreateResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Create a new session for an agent.
parameters:
- name: agent_id
in: path
description: >-
The ID of the agent to create the session for.
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CreateAgentSessionRequest'
required: true
/v1/agents/{agent_id}/session/{session_id}/turn:
post:
responses:
'200':
description: >-
If stream=False, returns a Turn object. If stream=True, returns an SSE
event stream of AgentTurnResponseStreamChunk
content:
application/json:
schema:
$ref: '#/components/schemas/Turn'
text/event-stream:
schema:
$ref: '#/components/schemas/AgentTurnResponseStreamChunk'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Create a new turn for an agent.
parameters:
- name: agent_id
in: path
description: >-
The ID of the agent to create the turn for.
required: true
schema:
type: string
- name: session_id
in: path
description: >-
The ID of the session to create the turn for.
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CreateAgentTurnRequest'
required: true
/v1/openai/v1/responses:
post:
responses:
'200':
description: >-
An OpenAIResponseObject, or an SSE event stream of OpenAIResponseObjectStream
when the response is streamed.
content:
application/json:
schema:
$ref: '#/components/schemas/OpenAIResponseObject'
text/event-stream:
schema:
$ref: '#/components/schemas/OpenAIResponseObjectStream'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Create a new OpenAI response.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CreateOpenaiResponseRequest'
required: true
/v1/files:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListBucketResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: List all buckets.
parameters:
- name: bucket
in: query
required: true
schema:
type: string
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/FileUploadResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: >-
Create a new upload session for a file identified by a bucket and key.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/CreateUploadSessionRequest'
required: true
/v1/agents/{agent_id}:
get:
responses:
'200':
description: An Agent of the agent.
content:
application/json:
schema:
$ref: '#/components/schemas/Agent'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Describe an agent by its ID.
parameters:
- name: agent_id
in: path
description: ID of the agent.
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: >-
Delete an agent by its ID and its associated sessions and turns.
parameters:
- name: agent_id
in: path
description: The ID of the agent to delete.
required: true
schema:
type: string
/v1/agents/{agent_id}/session/{session_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Session'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Retrieve an agent session by its ID.
parameters:
- name: session_id
in: path
description: The ID of the session to get.
required: true
schema:
type: string
- name: agent_id
in: path
description: >-
The ID of the agent to get the session for.
required: true
schema:
type: string
- name: turn_ids
in: query
description: >-
(Optional) List of turn IDs to filter the session by.
required: false
schema:
type: array
items:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: >-
Delete an agent session by its ID and its associated turns.
parameters:
- name: session_id
in: path
description: The ID of the session to delete.
required: true
schema:
type: string
- name: agent_id
in: path
description: >-
The ID of the agent to delete the session for.
required: true
schema:
type: string
/v1/files/{bucket}/{key}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/FileResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: >-
Get a file info identified by a bucket and key.
parameters:
- name: bucket
in: path
description: 'Bucket name (valid chars: a-zA-Z0-9_-)'
required: true
schema:
type: string
- name: key
in: path
description: >-
Key under which the file is stored (valid chars: a-zA-Z0-9_-/.)
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: >-
Delete a file identified by a bucket and key.
parameters:
- name: bucket
in: path
description: 'Bucket name (valid chars: a-zA-Z0-9_-)'
required: true
schema:
type: string
- name: key
in: path
description: >-
Key under which the file is stored (valid chars: a-zA-Z0-9_-/.)
required: true
schema:
type: string
/v1/inference/embeddings:
post:
responses:
'200':
description: >-
An array of embeddings, one for each content. Each embedding is a list
of floats. The dimensionality of the embedding is model-specific; you
can check model metadata using /models/{model_id}
content:
application/json:
schema:
$ref: '#/components/schemas/EmbeddingsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inference
description: >-
Generate embeddings for content pieces using the specified model.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/EmbeddingsRequest'
required: true
/v1/eval/benchmarks/{benchmark_id}/evaluations:
post:
responses:
'200':
description: >-
EvaluateResponse object containing generations and scores
content:
application/json:
schema:
$ref: '#/components/schemas/EvaluateResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Eval
description: Evaluate a list of rows on a benchmark.
parameters:
- name: benchmark_id
in: path
description: >-
The ID of the benchmark to run the evaluation on.
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/EvaluateRowsRequest'
required: true
/v1/agents/{agent_id}/session/{session_id}/turn/{turn_id}/step/{step_id}:
get:
responses:
'200':
description: An AgentStepResponse.
content:
application/json:
schema:
$ref: '#/components/schemas/AgentStepResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Retrieve an agent step by its ID.
parameters:
- name: agent_id
in: path
description: The ID of the agent to get the step for.
required: true
schema:
type: string
- name: session_id
in: path
description: >-
The ID of the session to get the step for.
required: true
schema:
type: string
- name: turn_id
in: path
description: The ID of the turn to get the step for.
required: true
schema:
type: string
- name: step_id
in: path
description: The ID of the step to get.
required: true
schema:
type: string
/v1/agents/{agent_id}/session/{session_id}/turn/{turn_id}:
get:
responses:
'200':
description: A Turn.
content:
application/json:
schema:
$ref: '#/components/schemas/Turn'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Retrieve an agent turn by its ID.
parameters:
- name: agent_id
in: path
description: The ID of the agent to get the turn for.
required: true
schema:
type: string
- name: session_id
in: path
description: >-
The ID of the session to get the turn for.
required: true
schema:
type: string
- name: turn_id
in: path
description: The ID of the turn to get.
required: true
schema:
type: string
/v1/eval/benchmarks/{benchmark_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Benchmark'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Benchmarks
description: ''
parameters:
- name: benchmark_id
in: path
required: true
schema:
type: string
/v1/datasets/{dataset_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Dataset'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Datasets
description: ''
parameters:
- name: dataset_id
in: path
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Datasets
description: ''
parameters:
- name: dataset_id
in: path
required: true
schema:
type: string
/v1/models/{model_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Model'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Models
description: ''
parameters:
- name: model_id
in: path
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Models
description: ''
parameters:
- name: model_id
in: path
required: true
schema:
type: string
/v1/openai/v1/responses/{id}:
get:
responses:
'200':
description: An OpenAIResponseObject.
content:
application/json:
schema:
$ref: '#/components/schemas/OpenAIResponseObject'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: Retrieve an OpenAI response by its ID.
parameters:
- name: id
in: path
description: >-
The ID of the OpenAI response to retrieve.
required: true
schema:
type: string
/v1/scoring-functions/{scoring_fn_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ScoringFn'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ScoringFunctions
description: ''
parameters:
- name: scoring_fn_id
in: path
required: true
schema:
type: string
/v1/shields/{identifier}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Shield'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Shields
description: ''
parameters:
- name: identifier
in: path
required: true
schema:
type: string
/v1/telemetry/traces/{trace_id}/spans/{span_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Span'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters:
- name: trace_id
in: path
required: true
schema:
type: string
- name: span_id
in: path
required: true
schema:
type: string
/v1/telemetry/spans/{span_id}/tree:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/QuerySpanTreeResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters:
- name: span_id
in: path
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/GetSpanTreeRequest'
required: true
/v1/tools/{tool_name}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Tool'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolGroups
description: ''
parameters:
- name: tool_name
in: path
required: true
schema:
type: string
/v1/toolgroups/{toolgroup_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ToolGroup'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolGroups
description: ''
parameters:
- name: toolgroup_id
in: path
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolGroups
description: Unregister a tool group
parameters:
- name: toolgroup_id
in: path
required: true
schema:
type: string
/v1/telemetry/traces/{trace_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Trace'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters:
- name: trace_id
in: path
required: true
schema:
type: string
/v1/post-training/job/artifacts:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/PostTrainingJobArtifactsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- PostTraining (Coming Soon)
description: ''
parameters:
- name: job_uuid
in: query
required: true
schema:
type: string
/v1/post-training/job/status:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/PostTrainingJobStatusResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- PostTraining (Coming Soon)
description: ''
parameters:
- name: job_uuid
in: query
required: true
schema:
type: string
/v1/post-training/jobs:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListPostTrainingJobsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- PostTraining (Coming Soon)
description: ''
parameters: []
/v1/files/session:{upload_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/FileUploadResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: >-
Returns information about an existing upload session
parameters:
- name: upload_id
in: path
description: ID of the upload session
required: true
schema:
type: string
post:
responses:
'200':
description: OK
content:
application/json:
schema:
oneOf:
- $ref: '#/components/schemas/FileResponse'
- type: 'null'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: >-
Upload file content to an existing upload session. On the server, request
body will have the raw bytes that are uploaded.
parameters:
- name: upload_id
in: path
description: ID of the upload session
required: true
schema:
type: string
requestBody:
content:
application/octet-stream:
schema:
type: string
format: binary
required: true
/v1/vector-dbs/{vector_db_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/VectorDB'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- VectorDBs
description: ''
parameters:
- name: vector_db_id
in: path
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- VectorDBs
description: ''
parameters:
- name: vector_db_id
in: path
required: true
schema:
type: string
/v1/health:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/HealthInfo'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inspect
description: ''
parameters: []
/v1/tool-runtime/rag-tool/insert:
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolRuntime
description: >-
Index documents so they can be used by the RAG system
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/InsertRequest'
required: true
/v1/vector-io/insert:
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- VectorIO
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/InsertChunksRequest'
required: true
/v1/providers/{provider_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ProviderInfo'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Providers
description: ''
parameters:
- name: provider_id
in: path
required: true
schema:
type: string
/v1/tool-runtime/invoke:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ToolInvocationResult'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolRuntime
description: Run a tool with the given arguments
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/InvokeToolRequest'
required: true
/v1/datasetio/iterrows/{dataset_id}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/PaginatedResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- DatasetIO
description: >-
Get a paginated list of rows from a dataset.
Uses offset-based pagination where:
- start_index: The starting index (0-based). If None, starts from beginning.
- limit: Number of items to return. If None or -1, returns all items.
The response includes:
- data: List of items for the current page
- has_more: Whether there are more items available after this set
parameters:
- name: dataset_id
in: path
description: >-
The ID of the dataset to get the rows from.
required: true
schema:
type: string
- name: start_index
in: query
description: >-
Index into dataset for the first row to get. Get all rows if None.
required: false
schema:
type: integer
- name: limit
in: query
description: The number of rows to get.
required: false
schema:
type: integer
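# Illustrative request for the iterrows endpoint above; a sketch only. The
# dataset ID "my-dataset" and the row contents are hypothetical.
#   GET /v1/datasetio/iterrows/my-dataset?start_index=0&limit=10
# Per the description, the PaginatedResponse carries `data` (rows of the
# current page) and `has_more` (whether more rows follow), for example:
#   {"data": [{"question": "..."}], "has_more": true}
# A client would typically advance start_index by the number of rows received
# and repeat the request until has_more is false.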
/v1/eval/benchmarks/{benchmark_id}/jobs/{job_id}:
get:
responses:
'200':
description: The status of the evaluation job.
content:
application/json:
schema:
$ref: '#/components/schemas/Job'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Eval
description: Get the status of a job.
parameters:
- name: benchmark_id
in: path
description: >-
The ID of the benchmark to run the evaluation on.
required: true
schema:
type: string
- name: job_id
in: path
description: The ID of the job to get the status of.
required: true
schema:
type: string
delete:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Eval
description: Cancel a job.
parameters:
- name: benchmark_id
in: path
description: >-
The ID of the benchmark to run the evaluation on.
required: true
schema:
type: string
- name: job_id
in: path
description: The ID of the job to cancel.
required: true
schema:
type: string
/v1/eval/benchmarks/{benchmark_id}/jobs/{job_id}/result:
get:
responses:
'200':
description: The result of the job.
content:
application/json:
schema:
$ref: '#/components/schemas/EvaluateResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Eval
description: Get the result of a job.
parameters:
- name: benchmark_id
in: path
description: >-
The ID of the benchmark to run the evaluation on.
required: true
schema:
type: string
- name: job_id
in: path
description: The ID of the job to get the result of.
required: true
schema:
type: string
/v1/agents/{agent_id}/sessions:
get:
responses:
'200':
description: A PaginatedResponse.
content:
application/json:
schema:
$ref: '#/components/schemas/PaginatedResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: List all session(s) of a given agent.
parameters:
- name: agent_id
in: path
description: >-
The ID of the agent to list sessions for.
required: true
schema:
type: string
- name: start_index
in: query
description: The index to start the pagination from.
required: false
schema:
type: integer
- name: limit
in: query
description: The number of sessions to return.
required: false
schema:
type: integer
/v1/eval/benchmarks:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListBenchmarksResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Benchmarks
description: ''
parameters: []
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Benchmarks
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterBenchmarkRequest'
required: true
/v1/datasets:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListDatasetsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Datasets
description: ''
parameters: []
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Dataset'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Datasets
description: Register a new dataset.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterDatasetRequest'
required: true
/v1/files/{bucket}:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListFileResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Files
description: List all files in a bucket.
parameters:
- name: bucket
in: path
description: 'Bucket name (valid chars: a-zA-Z0-9_-)'
required: true
schema:
type: string
/v1/models:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListModelsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Models
description: ''
parameters: []
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Model'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Models
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterModelRequest'
required: true
/v1/providers:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListProvidersResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Providers
description: ''
parameters: []
/v1/inspect/routes:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListRoutesResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inspect
description: ''
parameters: []
/v1/tool-runtime/list-tools:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListToolDefsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolRuntime
description: ''
parameters:
- name: tool_group_id
in: query
required: false
schema:
type: string
- name: mcp_endpoint
in: query
required: false
schema:
$ref: '#/components/schemas/URL'
/v1/scoring-functions:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListScoringFunctionsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ScoringFunctions
description: ''
parameters: []
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ScoringFunctions
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterScoringFunctionRequest'
required: true
/v1/shields:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListShieldsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Shields
description: ''
parameters: []
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/Shield'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Shields
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterShieldRequest'
required: true
/v1/toolgroups:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListToolGroupsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolGroups
description: List tool groups with optional provider
parameters: []
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolGroups
description: Register a tool group
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterToolGroupRequest'
required: true
/v1/tools:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListToolsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolGroups
description: List tools with optional tool group
parameters:
- name: toolgroup_id
in: query
required: false
schema:
type: string
/v1/vector-dbs:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ListVectorDBsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- VectorDBs
description: ''
parameters: []
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/VectorDB'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- VectorDBs
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RegisterVectorDbRequest'
required: true
/v1/telemetry/events:
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/LogEventRequest'
required: true
/v1/openai/v1/chat/completions:
post:
responses:
'200':
description: >-
Response from an OpenAI-compatible chat completion request. **OR** Chunk
from a streaming response to an OpenAI-compatible chat completion request.
content:
application/json:
schema:
oneOf:
- $ref: '#/components/schemas/OpenAIChatCompletion'
- $ref: '#/components/schemas/OpenAIChatCompletionChunk'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inference
description: >-
Generate an OpenAI-compatible chat completion for the given messages using
the specified model.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/OpenaiChatCompletionRequest'
required: true
/v1/openai/v1/completions:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/OpenAICompletion'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inference
description: >-
Generate an OpenAI-compatible completion for the given prompt using the specified
model.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/OpenaiCompletionRequest'
required: true
/v1/openai/v1/models:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/OpenAIListModelsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Models
description: ''
parameters: []
/v1/post-training/preference-optimize:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/PostTrainingJob'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- PostTraining (Coming Soon)
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/PreferenceOptimizeRequest'
required: true
/v1/tool-runtime/rag-tool/query:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/RAGQueryResult'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- ToolRuntime
description: >-
Query the RAG system for context; typically invoked by the agent
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/QueryRequest'
required: true
/v1/vector-io/query:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/QueryChunksResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- VectorIO
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/QueryChunksRequest'
required: true
/v1/telemetry/metrics/{metric_name}:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/QueryMetricsResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters:
- name: metric_name
in: path
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/QueryMetricsRequest'
required: true
/v1/telemetry/spans:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/QuerySpansResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/QuerySpansRequest'
required: true
/v1/telemetry/traces:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/QueryTracesResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/QueryTracesRequest'
required: true
/v1/agents/{agent_id}/session/{session_id}/turn/{turn_id}/resume:
post:
responses:
'200':
description: >-
A Turn object if stream is False, otherwise an AsyncIterator of AgentTurnResponseStreamChunk
objects.
content:
application/json:
schema:
$ref: '#/components/schemas/Turn'
text/event-stream:
schema:
$ref: '#/components/schemas/AgentTurnResponseStreamChunk'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Agents
description: >-
Resume an agent turn with executed tool call responses.
When a Turn has the status `awaiting_input` due to pending input from client-side
tool calls, this endpoint can be used to submit the outputs from the
tool calls once they are ready.
parameters:
- name: agent_id
in: path
description: The ID of the agent to resume.
required: true
schema:
type: string
- name: session_id
in: path
description: The ID of the session to resume.
required: true
schema:
type: string
- name: turn_id
in: path
description: The ID of the turn to resume.
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/ResumeAgentTurnRequest'
required: true
/v1/eval/benchmarks/{benchmark_id}/jobs:
post:
responses:
'200':
description: >-
The job that was created to run the evaluation.
content:
application/json:
schema:
$ref: '#/components/schemas/Job'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Eval
description: Run an evaluation on a benchmark.
parameters:
- name: benchmark_id
in: path
description: >-
The ID of the benchmark to run the evaluation on.
required: true
schema:
type: string
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RunEvalRequest'
required: true
/v1/safety/run-shield:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/RunShieldResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Safety
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/RunShieldRequest'
required: true
/v1/telemetry/spans/export:
post:
responses:
'200':
description: OK
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Telemetry
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/SaveSpansToDatasetRequest'
required: true
/v1/scoring/score:
post:
responses:
'200':
description: >-
ScoreResponse object containing rows and aggregated results
content:
application/json:
schema:
$ref: '#/components/schemas/ScoreResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Scoring
description: Score a list of rows.
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/ScoreRequest'
required: true
/v1/scoring/score-batch:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/ScoreBatchResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Scoring
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/ScoreBatchRequest'
required: true
/v1/post-training/supervised-fine-tune:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/PostTrainingJob'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- PostTraining (Coming Soon)
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/SupervisedFineTuneRequest'
required: true
/v1/synthetic-data-generation/generate:
post:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/SyntheticDataGenerationResponse'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- SyntheticDataGeneration (Coming Soon)
description: ''
parameters: []
requestBody:
content:
application/json:
schema:
$ref: '#/components/schemas/SyntheticDataGenerateRequest'
required: true
/v1/version:
get:
responses:
'200':
description: OK
content:
application/json:
schema:
$ref: '#/components/schemas/VersionInfo'
'400':
$ref: '#/components/responses/BadRequest400'
'429':
$ref: >-
#/components/responses/TooManyRequests429
'500':
$ref: >-
#/components/responses/InternalServerError500
default:
$ref: '#/components/responses/DefaultError'
tags:
- Inspect
description: ''
parameters: []
jsonSchemaDialect: >-
https://json-schema.org/draft/2020-12/schema
components:
schemas:
Error:
type: object
properties:
status:
type: integer
description: HTTP status code
title:
type: string
description: >-
Error title, a short summary of the error which is invariant for an error
type
detail:
type: string
description: >-
Error detail, a longer human-readable description of the error
instance:
type: string
description: >-
(Optional) A URL which can be used to retrieve more information about
the specific occurrence of the error
additionalProperties: false
required:
- status
- title
- detail
title: Error
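# Illustrative Error payload; a sketch only, with hypothetical values:
#   {"status": 400, "title": "Bad Request", "detail": "limit must be an integer"}
# `status`, `title`, and `detail` are required; `instance` is optional, and no
# additional properties are allowed.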
description: >-
Error response from the API. Roughly follows RFC 7807.
AppendRowsRequest:
type: object
properties:
rows:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- rows
title: AppendRowsRequest
CompletionMessage:
type: object
properties:
role:
type: string
const: assistant
default: assistant
description: >-
Must be "assistant" to identify this as the model's response
content:
$ref: '#/components/schemas/InterleavedContent'
description: The content of the model's response
stop_reason:
type: string
enum:
- end_of_turn
- end_of_message
- out_of_tokens
description: >-
Reason why the model stopped generating. Options are: - `StopReason.end_of_turn`:
The model finished generating the entire response. - `StopReason.end_of_message`:
The model finished generating but generated a partial response -- usually,
a tool call. The user may call the tool and continue the conversation
with the tool's response. - `StopReason.out_of_tokens`: The model ran
out of token budget.
tool_calls:
type: array
items:
$ref: '#/components/schemas/ToolCall'
description: >-
List of tool calls. Each tool call is a ToolCall object.
additionalProperties: false
required:
- role
- content
- stop_reason
title: CompletionMessage
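# Illustrative CompletionMessage payload; a sketch only, with hypothetical text:
#   {"role": "assistant", "content": "Paris.", "stop_reason": "end_of_turn"}
# `role`, `content`, and `stop_reason` are required; `tool_calls` is optional.
# `content` is shown here in the plain-string form of InterleavedContent.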
description: >-
A message containing the model's (assistant) response in a chat conversation.
GrammarResponseFormat:
type: object
properties:
type:
type: string
enum:
- json_schema
- grammar
description: >-
Must be "grammar" to identify this format type
const: grammar
default: grammar
bnf:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: >-
The BNF grammar specification the response should conform to
additionalProperties: false
required:
- type
- bnf
title: GrammarResponseFormat
description: >-
Configuration for grammar-guided response generation.
GreedySamplingStrategy:
type: object
properties:
type:
type: string
const: greedy
default: greedy
additionalProperties: false
required:
- type
title: GreedySamplingStrategy
ImageContentItem:
type: object
properties:
type:
type: string
const: image
default: image
description: >-
Discriminator type of the content item. Always "image"
image:
type: object
properties:
url:
$ref: '#/components/schemas/URL'
description: >-
A URL of the image or data URL in the format of data:image/{type};base64,{data}.
Note that URL could have length limits.
data:
type: string
contentEncoding: base64
description: base64 encoded image data as string
additionalProperties: false
description: >-
Image as a base64-encoded string or a URL
additionalProperties: false
required:
- type
- image
title: ImageContentItem
description: An image content item
InterleavedContent:
oneOf:
- type: string
- $ref: '#/components/schemas/InterleavedContentItem'
- type: array
items:
$ref: '#/components/schemas/InterleavedContentItem'
InterleavedContentItem:
oneOf:
- $ref: '#/components/schemas/ImageContentItem'
- $ref: '#/components/schemas/TextContentItem'
discriminator:
propertyName: type
mapping:
image: '#/components/schemas/ImageContentItem'
text: '#/components/schemas/TextContentItem'
JsonSchemaResponseFormat:
type: object
properties:
type:
type: string
enum:
- json_schema
- grammar
description: >-
Must be "json_schema" to identify this format type
const: json_schema
default: json_schema
json_schema:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: >-
The JSON schema the response should conform to. In a Python SDK, this
is often a `pydantic` model.
additionalProperties: false
required:
- type
- json_schema
title: JsonSchemaResponseFormat
description: >-
Configuration for JSON schema-guided response generation.
Message:
oneOf:
- $ref: '#/components/schemas/UserMessage'
- $ref: '#/components/schemas/SystemMessage'
- $ref: '#/components/schemas/ToolResponseMessage'
- $ref: '#/components/schemas/CompletionMessage'
discriminator:
propertyName: role
mapping:
user: '#/components/schemas/UserMessage'
system: '#/components/schemas/SystemMessage'
tool: '#/components/schemas/ToolResponseMessage'
assistant: '#/components/schemas/CompletionMessage'
ResponseFormat:
oneOf:
- $ref: '#/components/schemas/JsonSchemaResponseFormat'
- $ref: '#/components/schemas/GrammarResponseFormat'
discriminator:
propertyName: type
mapping:
json_schema: '#/components/schemas/JsonSchemaResponseFormat'
grammar: '#/components/schemas/GrammarResponseFormat'
SamplingParams:
type: object
properties:
strategy:
$ref: '#/components/schemas/SamplingStrategy'
description: The sampling strategy.
max_tokens:
type: integer
default: 0
description: >-
The maximum number of tokens that can be generated in the completion.
The token count of your prompt plus max_tokens cannot exceed the model's
context length.
repetition_penalty:
type: number
default: 1.0
description: >-
Number between -2.0 and 2.0. Positive values penalize new tokens based
on whether they appear in the text so far, increasing the model's likelihood
to talk about new topics.
stop:
type: array
items:
type: string
description: >-
Up to 4 sequences where the API will stop generating further tokens. The
returned text will not contain the stop sequence.
additionalProperties: false
required:
- strategy
title: SamplingParams
description: Sampling parameters.
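# Illustrative SamplingParams payload; a sketch only, with hypothetical values:
#   {"strategy": {"type": "greedy"}, "max_tokens": 256, "stop": ["\n\n"]}
# Only `strategy` is required. Its `type` field selects the variant through the
# SamplingStrategy discriminator (greedy, top_p, or top_k).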
SamplingStrategy:
oneOf:
- $ref: '#/components/schemas/GreedySamplingStrategy'
- $ref: '#/components/schemas/TopPSamplingStrategy'
- $ref: '#/components/schemas/TopKSamplingStrategy'
discriminator:
propertyName: type
mapping:
greedy: '#/components/schemas/GreedySamplingStrategy'
top_p: '#/components/schemas/TopPSamplingStrategy'
top_k: '#/components/schemas/TopKSamplingStrategy'
SystemMessage:
type: object
properties:
role:
type: string
const: system
default: system
description: >-
Must be "system" to identify this as a system message
content:
$ref: '#/components/schemas/InterleavedContent'
description: >-
The content of the "system prompt". If multiple system messages are provided,
they are concatenated. The underlying Llama Stack code may also add other
system messages (for example, for formatting tool definitions).
additionalProperties: false
required:
- role
- content
title: SystemMessage
description: >-
A system message providing instructions or context to the model.
TextContentItem:
type: object
properties:
type:
type: string
const: text
default: text
description: >-
Discriminator type of the content item. Always "text"
text:
type: string
description: Text content
additionalProperties: false
required:
- type
- text
title: TextContentItem
description: A text content item
ToolCall:
type: object
properties:
call_id:
type: string
tool_name:
oneOf:
- type: string
enum:
- brave_search
- wolfram_alpha
- photogen
- code_interpreter
title: BuiltinTool
- type: string
arguments:
oneOf:
- type: string
- type: object
additionalProperties:
oneOf:
- type: string
- type: integer
- type: number
- type: boolean
- type: 'null'
- type: array
items:
oneOf:
- type: string
- type: integer
- type: number
- type: boolean
- type: 'null'
- type: object
additionalProperties:
oneOf:
- type: string
- type: integer
- type: number
- type: boolean
- type: 'null'
arguments_json:
type: string
additionalProperties: false
required:
- call_id
- tool_name
- arguments
title: ToolCall
ToolConfig:
type: object
properties:
tool_choice:
oneOf:
- type: string
enum:
- auto
- required
- none
title: ToolChoice
description: >-
Whether tool use is required or automatic. This is a hint to the model
which may not be followed, depending on the instruction-following
capabilities of the model.
- type: string
default: auto
description: >-
(Optional) Whether tool use is automatic, required, or none. Can also
specify a tool name to use a specific tool. Defaults to ToolChoice.auto.
tool_prompt_format:
type: string
enum:
- json
- function_tag
- python_list
description: >-
(Optional) Instructs the model how to format tool calls. By default, Llama
Stack will attempt to use a format that is best adapted to the model.
- `ToolPromptFormat.json`: The tool calls are formatted as a JSON object.
- `ToolPromptFormat.function_tag`: The tool calls are enclosed in a <function=function_name>
tag. - `ToolPromptFormat.python_list`: The tool calls are output as Python
syntax -- a list of function calls.
system_message_behavior:
type: string
enum:
- append
- replace
description: >-
(Optional) Config for how to override the default system prompt. - `SystemMessageBehavior.append`:
Appends the provided system message to the default system prompt. - `SystemMessageBehavior.replace`:
Replaces the default system prompt with the provided system message. The
system message can include the string '{{function_definitions}}' to indicate
where the function definitions should be inserted.
default: append
additionalProperties: false
title: ToolConfig
description: Configuration for tool use.
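# Illustrative example, not part of the generated schema: a ToolConfig that
# lets the model decide whether to call a tool and replaces the default system
# prompt. Per the system_message_behavior description above, the replacement
# system message (supplied separately in the request) may contain the string
# '{{function_definitions}}' to mark where tool definitions are inserted.
#
#   tool_choice: auto
#   tool_prompt_format: json
#   system_message_behavior: replace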
ToolDefinition:
type: object
properties:
tool_name:
oneOf:
- type: string
enum:
- brave_search
- wolfram_alpha
- photogen
- code_interpreter
title: BuiltinTool
- type: string
description:
type: string
parameters:
type: object
additionalProperties:
$ref: '#/components/schemas/ToolParamDefinition'
additionalProperties: false
required:
- tool_name
title: ToolDefinition
ToolParamDefinition:
type: object
properties:
param_type:
type: string
description:
type: string
required:
type: boolean
default: true
default:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- param_type
title: ToolParamDefinition
ToolResponseMessage:
type: object
properties:
role:
type: string
const: tool
default: tool
description: >-
Must be "tool" to identify this as a tool response
call_id:
type: string
description: >-
Unique identifier for the tool call this response is for
content:
$ref: '#/components/schemas/InterleavedContent'
description: The response content from the tool
additionalProperties: false
required:
- role
- call_id
- content
title: ToolResponseMessage
description: >-
A message representing the result of a tool invocation.
TopKSamplingStrategy:
type: object
properties:
type:
type: string
const: top_k
default: top_k
top_k:
type: integer
additionalProperties: false
required:
- type
- top_k
title: TopKSamplingStrategy
TopPSamplingStrategy:
type: object
properties:
type:
type: string
const: top_p
default: top_p
temperature:
type: number
top_p:
type: number
default: 0.95
additionalProperties: false
required:
- type
title: TopPSamplingStrategy
URL:
type: object
properties:
uri:
type: string
additionalProperties: false
required:
- uri
title: URL
UserMessage:
type: object
properties:
role:
type: string
const: user
default: user
description: >-
Must be "user" to identify this as a user message
content:
$ref: '#/components/schemas/InterleavedContent'
description: >-
The content of the message, which can include text and other media
context:
$ref: '#/components/schemas/InterleavedContent'
description: >-
(Optional) This field is used internally by Llama Stack to pass RAG context.
This field may be removed from the API in the future.
additionalProperties: false
required:
- role
- content
title: UserMessage
description: >-
A message from the user in a chat conversation.
BatchChatCompletionRequest:
type: object
properties:
model_id:
type: string
messages_batch:
type: array
items:
type: array
items:
$ref: '#/components/schemas/Message'
sampling_params:
$ref: '#/components/schemas/SamplingParams'
tools:
type: array
items:
$ref: '#/components/schemas/ToolDefinition'
tool_config:
$ref: '#/components/schemas/ToolConfig'
response_format:
$ref: '#/components/schemas/ResponseFormat'
logprobs:
type: object
properties:
top_k:
type: integer
default: 0
description: >-
How many tokens (for each position) to return log probabilities for.
additionalProperties: false
title: LogProbConfig
additionalProperties: false
required:
- model_id
- messages_batch
title: BatchChatCompletionRequest
BatchChatCompletionResponse:
type: object
properties:
batch:
type: array
items:
$ref: '#/components/schemas/ChatCompletionResponse'
additionalProperties: false
required:
- batch
title: BatchChatCompletionResponse
ChatCompletionResponse:
type: object
properties:
metrics:
type: array
items:
$ref: '#/components/schemas/MetricInResponse'
completion_message:
$ref: '#/components/schemas/CompletionMessage'
description: The complete response message
logprobs:
type: array
items:
$ref: '#/components/schemas/TokenLogProbs'
description: >-
Optional log probabilities for generated tokens
additionalProperties: false
required:
- completion_message
title: ChatCompletionResponse
description: Response from a chat completion request.
MetricInResponse:
type: object
properties:
metric:
type: string
value:
oneOf:
- type: integer
- type: number
unit:
type: string
additionalProperties: false
required:
- metric
- value
title: MetricInResponse
TokenLogProbs:
type: object
properties:
logprobs_by_token:
type: object
additionalProperties:
type: number
description: >-
Dictionary mapping tokens to their log probabilities
additionalProperties: false
required:
- logprobs_by_token
title: TokenLogProbs
description: Log probabilities for generated tokens.
BatchCompletionRequest:
type: object
properties:
model_id:
type: string
content_batch:
type: array
items:
$ref: '#/components/schemas/InterleavedContent'
sampling_params:
$ref: '#/components/schemas/SamplingParams'
response_format:
$ref: '#/components/schemas/ResponseFormat'
logprobs:
type: object
properties:
top_k:
type: integer
default: 0
description: >-
How many tokens (for each position) to return log probabilities for.
additionalProperties: false
title: LogProbConfig
additionalProperties: false
required:
- model_id
- content_batch
title: BatchCompletionRequest
BatchCompletionResponse:
type: object
properties:
batch:
type: array
items:
$ref: '#/components/schemas/CompletionResponse'
additionalProperties: false
required:
- batch
title: BatchCompletionResponse
CompletionResponse:
type: object
properties:
metrics:
type: array
items:
$ref: '#/components/schemas/MetricInResponse'
content:
type: string
description: The generated completion text
stop_reason:
type: string
enum:
- end_of_turn
- end_of_message
- out_of_tokens
description: Reason why generation stopped
logprobs:
type: array
items:
$ref: '#/components/schemas/TokenLogProbs'
description: >-
Optional log probabilities for generated tokens
additionalProperties: false
required:
- content
- stop_reason
title: CompletionResponse
description: Response from a completion request.
CancelTrainingJobRequest:
type: object
properties:
job_uuid:
type: string
additionalProperties: false
required:
- job_uuid
title: CancelTrainingJobRequest
ChatCompletionRequest:
type: object
properties:
model_id:
type: string
description: >-
The identifier of the model to use. The model must be registered with
Llama Stack and available via the /models endpoint.
messages:
type: array
items:
$ref: '#/components/schemas/Message'
description: List of messages in the conversation
sampling_params:
$ref: '#/components/schemas/SamplingParams'
description: >-
Parameters to control the sampling strategy
tools:
type: array
items:
$ref: '#/components/schemas/ToolDefinition'
description: >-
(Optional) List of tool definitions available to the model
tool_choice:
type: string
enum:
- auto
- required
- none
description: >-
(Optional) Whether tool use is required or automatic. Defaults to ToolChoice.auto.
Deprecated: use tool_config instead.
tool_prompt_format:
type: string
enum:
- json
- function_tag
- python_list
description: >-
(Optional) Instructs the model how to format tool calls. By default, Llama
Stack will attempt to use a format that is best adapted to the model.
- `ToolPromptFormat.json`: The tool calls are formatted as a JSON object.
- `ToolPromptFormat.function_tag`: The tool calls are enclosed in a <function=function_name>
tag. - `ToolPromptFormat.python_list`: The tool calls are output as Python
syntax -- a list of function calls. Deprecated: use tool_config instead.
response_format:
$ref: '#/components/schemas/ResponseFormat'
description: >-
(Optional) Grammar specification for guided (structured) decoding. There
are two options: - `ResponseFormat.json_schema`: The grammar is a JSON
schema. Most providers support this format. - `ResponseFormat.grammar`:
The grammar is a BNF grammar. This format is more flexible, but not all
providers support it.
stream:
type: boolean
description: >-
(Optional) If True, generate an SSE event stream of the response. Defaults
to False.
logprobs:
type: object
properties:
top_k:
type: integer
default: 0
description: >-
How many tokens (for each position) to return log probabilities for.
additionalProperties: false
description: >-
(Optional) If specified, log probabilities for each token position will
be returned.
tool_config:
$ref: '#/components/schemas/ToolConfig'
description: (Optional) Configuration for tool use.
additionalProperties: false
required:
- model_id
- messages
title: ChatCompletionRequest
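# Illustrative example, not part of the generated schema: a minimal
# ChatCompletionRequest body. The model_id is a hypothetical placeholder, and
# the sketch assumes that InterleavedContent (defined outside this section)
# accepts a plain string.
#
#   model_id: my-registered-model      # hypothetical identifier
#   messages:
#     - role: system
#       content: You are a concise assistant.
#     - role: user
#       content: What is the capital of France?
#   sampling_params:
#     strategy:
#       type: top_k
#       top_k: 40
#   stream: false
#   tool_config:
#     tool_choice: none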
ChatCompletionResponseEvent:
type: object
properties:
event_type:
type: string
enum:
- start
- complete
- progress
description: Type of the event
delta:
$ref: '#/components/schemas/ContentDelta'
description: >-
Content generated since last event. This can be one or more tokens, or
a tool call.
logprobs:
type: array
items:
$ref: '#/components/schemas/TokenLogProbs'
description: >-
Optional log probabilities for generated tokens
stop_reason:
type: string
enum:
- end_of_turn
- end_of_message
- out_of_tokens
description: >-
Optional reason why generation stopped, if complete
additionalProperties: false
required:
- event_type
- delta
title: ChatCompletionResponseEvent
description: >-
An event during chat completion generation.
ChatCompletionResponseStreamChunk:
type: object
properties:
metrics:
type: array
items:
$ref: '#/components/schemas/MetricInResponse'
event:
$ref: '#/components/schemas/ChatCompletionResponseEvent'
description: The event containing the new content
additionalProperties: false
required:
- event
title: ChatCompletionResponseStreamChunk
description: >-
A chunk of a streamed chat completion response.
ContentDelta:
oneOf:
- $ref: '#/components/schemas/TextDelta'
- $ref: '#/components/schemas/ImageDelta'
- $ref: '#/components/schemas/ToolCallDelta'
discriminator:
propertyName: type
mapping:
text: '#/components/schemas/TextDelta'
image: '#/components/schemas/ImageDelta'
tool_call: '#/components/schemas/ToolCallDelta'
ImageDelta:
type: object
properties:
type:
type: string
const: image
default: image
image:
type: string
contentEncoding: base64
additionalProperties: false
required:
- type
- image
title: ImageDelta
TextDelta:
type: object
properties:
type:
type: string
const: text
default: text
text:
type: string
additionalProperties: false
required:
- type
- text
title: TextDelta
ToolCallDelta:
type: object
properties:
type:
type: string
const: tool_call
default: tool_call
tool_call:
oneOf:
- type: string
- $ref: '#/components/schemas/ToolCall'
parse_status:
type: string
enum:
- started
- in_progress
- failed
- succeeded
title: ToolCallParseStatus
additionalProperties: false
required:
- type
- tool_call
- parse_status
title: ToolCallDelta
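# Illustrative example, not part of the generated schema: a
# ChatCompletionResponseStreamChunk whose event carries a ToolCallDelta with a
# fully parsed ToolCall. The call_id and the arguments are hypothetical values.
#
#   event:
#     event_type: progress
#     delta:
#       type: tool_call
#       tool_call:
#         call_id: call-1              # hypothetical identifier
#         tool_name: brave_search
#         arguments:
#           query: llama stack
#       parse_status: succeeded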
CompletionRequest:
type: object
properties:
model_id:
type: string
description: >-
The identifier of the model to use. The model must be registered with
Llama Stack and available via the /models endpoint.
content:
$ref: '#/components/schemas/InterleavedContent'
description: The content to generate a completion for
sampling_params:
$ref: '#/components/schemas/SamplingParams'
description: >-
(Optional) Parameters to control the sampling strategy
response_format:
$ref: '#/components/schemas/ResponseFormat'
description: >-
(Optional) Grammar specification for guided (structured) decoding
stream:
type: boolean
description: >-
(Optional) If True, generate an SSE event stream of the response. Defaults
to False.
logprobs:
type: object
properties:
top_k:
type: integer
default: 0
description: >-
How many tokens (for each position) to return log probabilities for.
additionalProperties: false
description: >-
(Optional) If specified, log probabilities for each token position will
be returned.
additionalProperties: false
required:
- model_id
- content
title: CompletionRequest
CompletionResponseStreamChunk:
type: object
properties:
metrics:
type: array
items:
$ref: '#/components/schemas/MetricInResponse'
delta:
type: string
description: >-
New content generated since last chunk. This can be one or more tokens.
stop_reason:
type: string
enum:
- end_of_turn
- end_of_message
- out_of_tokens
description: >-
Optional reason why generation stopped, if complete
logprobs:
type: array
items:
$ref: '#/components/schemas/TokenLogProbs'
description: >-
Optional log probabilities for generated tokens
additionalProperties: false
required:
- delta
title: CompletionResponseStreamChunk
description: >-
A chunk of a streamed completion response.
AgentConfig:
type: object
properties:
sampling_params:
$ref: '#/components/schemas/SamplingParams'
input_shields:
type: array
items:
type: string
output_shields:
type: array
items:
type: string
toolgroups:
type: array
items:
$ref: '#/components/schemas/AgentTool'
client_tools:
type: array
items:
$ref: '#/components/schemas/ToolDef'
tool_choice:
type: string
enum:
- auto
- required
- none
title: ToolChoice
description: >-
Whether tool use is required or automatic. This is a hint to the model
which may not be followed, depending on the instruction-following capabilities
of the model.
deprecated: true
tool_prompt_format:
type: string
enum:
- json
- function_tag
- python_list
title: ToolPromptFormat
description: >-
Prompt format for calling custom / zero shot tools.
deprecated: true
tool_config:
$ref: '#/components/schemas/ToolConfig'
max_infer_iters:
type: integer
default: 10
model:
type: string
description: >-
The model identifier to use for the agent
instructions:
type: string
description: The system instructions for the agent
name:
type: string
description: >-
Optional name for the agent, used in telemetry and identification
enable_session_persistence:
type: boolean
default: false
description: >-
Optional flag indicating whether session data should be persisted
response_format:
$ref: '#/components/schemas/ResponseFormat'
description: Optional response format configuration
additionalProperties: false
required:
- model
- instructions
title: AgentConfig
description: Configuration for an agent.
AgentTool:
oneOf:
- type: string
- type: object
properties:
name:
type: string
args:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- name
- args
title: AgentToolGroupWithArgs
ToolDef:
type: object
properties:
name:
type: string
description:
type: string
parameters:
type: array
items:
$ref: '#/components/schemas/ToolParameter'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- name
title: ToolDef
ToolParameter:
type: object
properties:
name:
type: string
parameter_type:
type: string
description:
type: string
required:
type: boolean
default: true
default:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- name
- parameter_type
- description
- required
title: ToolParameter
CreateAgentRequest:
type: object
properties:
agent_config:
$ref: '#/components/schemas/AgentConfig'
description: The configuration for the agent.
additionalProperties: false
required:
- agent_config
title: CreateAgentRequest
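# Illustrative example, not part of the generated schema: a CreateAgentRequest
# body. Only model and instructions are required by AgentConfig; the model
# name and the toolgroup name are hypothetical placeholders.
#
#   agent_config:
#     model: my-registered-model       # hypothetical identifier
#     instructions: You are a helpful assistant.
#     max_infer_iters: 10
#     enable_session_persistence: false
#     toolgroups:
#       - my_toolgroup                 # hypothetical; AgentTool as a plain string
#     tool_config:
#       tool_choice: auto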
AgentCreateResponse:
type: object
properties:
agent_id:
type: string
additionalProperties: false
required:
- agent_id
title: AgentCreateResponse
CreateAgentSessionRequest:
type: object
properties:
session_name:
type: string
description: The name of the session to create.
additionalProperties: false
required:
- session_name
title: CreateAgentSessionRequest
AgentSessionCreateResponse:
type: object
properties:
session_id:
type: string
additionalProperties: false
required:
- session_id
title: AgentSessionCreateResponse
CreateAgentTurnRequest:
type: object
properties:
messages:
type: array
items:
oneOf:
- $ref: '#/components/schemas/UserMessage'
- $ref: '#/components/schemas/ToolResponseMessage'
description: List of messages to start the turn with.
stream:
type: boolean
description: >-
(Optional) If True, generate an SSE event stream of the response. Defaults
to False.
documents:
type: array
items:
type: object
properties:
content:
oneOf:
- type: string
- $ref: '#/components/schemas/InterleavedContentItem'
- type: array
items:
$ref: '#/components/schemas/InterleavedContentItem'
- $ref: '#/components/schemas/URL'
description: The content of the document.
mime_type:
type: string
description: The MIME type of the document.
additionalProperties: false
required:
- content
- mime_type
title: Document
description: A document to be used by an agent.
description: >-
(Optional) List of documents to create the turn with.
toolgroups:
type: array
items:
$ref: '#/components/schemas/AgentTool'
description: >-
(Optional) List of toolgroups to create the turn with. These are used in
addition to the toolgroups in the agent's config for this request.
tool_config:
$ref: '#/components/schemas/ToolConfig'
description: >-
(Optional) The tool configuration to create the turn with. It overrides
the agent's tool_config.
additionalProperties: false
required:
- messages
title: CreateAgentTurnRequest
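# Illustrative example, not part of the generated schema: a
# CreateAgentTurnRequest body with one user message and one attached document
# given as a URL object. The URI is a hypothetical placeholder, and the sketch
# assumes that InterleavedContent (defined outside this section) accepts a
# plain string.
#
#   messages:
#     - role: user
#       content: Summarize the attached document.
#   documents:
#     - content:
#         uri: https://example.com/report.txt
#       mime_type: text/plain
#   stream: true
#   tool_config:
#     tool_choice: auto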
InferenceStep:
type: object
properties:
turn_id:
type: string
description: The ID of the turn.
step_id:
type: string
description: The ID of the step.
started_at:
type: string
format: date-time
description: The time the step started.
completed_at:
type: string
format: date-time
description: The time the step completed.
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
const: inference
default: inference
model_response:
$ref: '#/components/schemas/CompletionMessage'
description: The response from the LLM.
additionalProperties: false
required:
- turn_id
- step_id
- step_type
- model_response
title: InferenceStep
description: An inference step in an agent turn.
MemoryRetrievalStep:
type: object
properties:
turn_id:
type: string
description: The ID of the turn.
step_id:
type: string
description: The ID of the step.
started_at:
type: string
format: date-time
description: The time the step started.
completed_at:
type: string
format: date-time
description: The time the step completed.
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
const: memory_retrieval
default: memory_retrieval
vector_db_ids:
type: string
description: >-
The IDs of the vector databases to retrieve context from.
inserted_context:
$ref: '#/components/schemas/InterleavedContent'
description: >-
The context retrieved from the vector databases.
additionalProperties: false
required:
- turn_id
- step_id
- step_type
- vector_db_ids
- inserted_context
title: MemoryRetrievalStep
description: >-
A memory retrieval step in an agent turn.
SafetyViolation:
type: object
properties:
violation_level:
$ref: '#/components/schemas/ViolationLevel'
user_message:
type: string
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- violation_level
- metadata
title: SafetyViolation
ShieldCallStep:
type: object
properties:
turn_id:
type: string
description: The ID of the turn.
step_id:
type: string
description: The ID of the step.
started_at:
type: string
format: date-time
description: The time the step started.
completed_at:
type: string
format: date-time
description: The time the step completed.
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
const: shield_call
default: shield_call
violation:
$ref: '#/components/schemas/SafetyViolation'
description: The violation from the shield call.
additionalProperties: false
required:
- turn_id
- step_id
- step_type
title: ShieldCallStep
description: A shield call step in an agent turn.
ToolExecutionStep:
type: object
properties:
turn_id:
type: string
description: The ID of the turn.
step_id:
type: string
description: The ID of the step.
started_at:
type: string
format: date-time
description: The time the step started.
completed_at:
type: string
format: date-time
description: The time the step completed.
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
const: tool_execution
default: tool_execution
tool_calls:
type: array
items:
$ref: '#/components/schemas/ToolCall'
description: The tool calls to execute.
tool_responses:
type: array
items:
$ref: '#/components/schemas/ToolResponse'
description: The tool responses from the tool calls.
additionalProperties: false
required:
- turn_id
- step_id
- step_type
- tool_calls
- tool_responses
title: ToolExecutionStep
description: A tool execution step in an agent turn.
ToolResponse:
type: object
properties:
call_id:
type: string
tool_name:
oneOf:
- type: string
enum:
- brave_search
- wolfram_alpha
- photogen
- code_interpreter
title: BuiltinTool
- type: string
content:
$ref: '#/components/schemas/InterleavedContent'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- call_id
- tool_name
- content
title: ToolResponse
Turn:
type: object
properties:
turn_id:
type: string
session_id:
type: string
input_messages:
type: array
items:
oneOf:
- $ref: '#/components/schemas/UserMessage'
- $ref: '#/components/schemas/ToolResponseMessage'
steps:
type: array
items:
oneOf:
- $ref: '#/components/schemas/InferenceStep'
- $ref: '#/components/schemas/ToolExecutionStep'
- $ref: '#/components/schemas/ShieldCallStep'
- $ref: '#/components/schemas/MemoryRetrievalStep'
discriminator:
propertyName: step_type
mapping:
inference: '#/components/schemas/InferenceStep'
tool_execution: '#/components/schemas/ToolExecutionStep'
shield_call: '#/components/schemas/ShieldCallStep'
memory_retrieval: '#/components/schemas/MemoryRetrievalStep'
output_message:
$ref: '#/components/schemas/CompletionMessage'
output_attachments:
type: array
items:
type: object
properties:
content:
oneOf:
- type: string
- $ref: '#/components/schemas/InterleavedContentItem'
- type: array
items:
$ref: '#/components/schemas/InterleavedContentItem'
- $ref: '#/components/schemas/URL'
description: The content of the attachment.
mime_type:
type: string
description: The MIME type of the attachment.
additionalProperties: false
required:
- content
- mime_type
title: Attachment
description: An attachment to an agent turn.
started_at:
type: string
format: date-time
completed_at:
type: string
format: date-time
additionalProperties: false
required:
- turn_id
- session_id
- input_messages
- steps
- output_message
- started_at
title: Turn
description: >-
A single turn in an interaction with an Agentic System.
ViolationLevel:
type: string
enum:
- info
- warn
- error
title: ViolationLevel
AgentTurnResponseEvent:
type: object
properties:
payload:
$ref: '#/components/schemas/AgentTurnResponseEventPayload'
additionalProperties: false
required:
- payload
title: AgentTurnResponseEvent
AgentTurnResponseEventPayload:
oneOf:
- $ref: '#/components/schemas/AgentTurnResponseStepStartPayload'
- $ref: '#/components/schemas/AgentTurnResponseStepProgressPayload'
- $ref: '#/components/schemas/AgentTurnResponseStepCompletePayload'
- $ref: '#/components/schemas/AgentTurnResponseTurnStartPayload'
- $ref: '#/components/schemas/AgentTurnResponseTurnCompletePayload'
- $ref: '#/components/schemas/AgentTurnResponseTurnAwaitingInputPayload'
discriminator:
propertyName: event_type
mapping:
step_start: '#/components/schemas/AgentTurnResponseStepStartPayload'
step_progress: '#/components/schemas/AgentTurnResponseStepProgressPayload'
step_complete: '#/components/schemas/AgentTurnResponseStepCompletePayload'
turn_start: '#/components/schemas/AgentTurnResponseTurnStartPayload'
turn_complete: '#/components/schemas/AgentTurnResponseTurnCompletePayload'
turn_awaiting_input: '#/components/schemas/AgentTurnResponseTurnAwaitingInputPayload'
AgentTurnResponseStepCompletePayload:
type: object
properties:
event_type:
type: string
enum:
- step_start
- step_complete
- step_progress
- turn_start
- turn_complete
- turn_awaiting_input
title: AgentTurnResponseEventType
const: step_complete
default: step_complete
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
step_id:
type: string
step_details:
oneOf:
- $ref: '#/components/schemas/InferenceStep'
- $ref: '#/components/schemas/ToolExecutionStep'
- $ref: '#/components/schemas/ShieldCallStep'
- $ref: '#/components/schemas/MemoryRetrievalStep'
discriminator:
propertyName: step_type
mapping:
inference: '#/components/schemas/InferenceStep'
tool_execution: '#/components/schemas/ToolExecutionStep'
shield_call: '#/components/schemas/ShieldCallStep'
memory_retrieval: '#/components/schemas/MemoryRetrievalStep'
additionalProperties: false
required:
- event_type
- step_type
- step_id
- step_details
title: AgentTurnResponseStepCompletePayload
AgentTurnResponseStepProgressPayload:
type: object
properties:
event_type:
type: string
enum:
- step_start
- step_complete
- step_progress
- turn_start
- turn_complete
- turn_awaiting_input
title: AgentTurnResponseEventType
const: step_progress
default: step_progress
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
step_id:
type: string
delta:
$ref: '#/components/schemas/ContentDelta'
additionalProperties: false
required:
- event_type
- step_type
- step_id
- delta
title: AgentTurnResponseStepProgressPayload
AgentTurnResponseStepStartPayload:
type: object
properties:
event_type:
type: string
enum:
- step_start
- step_complete
- step_progress
- turn_start
- turn_complete
- turn_awaiting_input
title: AgentTurnResponseEventType
const: step_start
default: step_start
step_type:
type: string
enum:
- inference
- tool_execution
- shield_call
- memory_retrieval
title: StepType
description: Type of the step in an agent turn.
step_id:
type: string
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- event_type
- step_type
- step_id
title: AgentTurnResponseStepStartPayload
AgentTurnResponseStreamChunk:
type: object
properties:
event:
$ref: '#/components/schemas/AgentTurnResponseEvent'
additionalProperties: false
required:
- event
title: AgentTurnResponseStreamChunk
description: Streamed agent turn completion response.
"AgentTurnResponseTurnAwaitingInputPayload":
type: object
properties:
event_type:
type: string
enum:
- step_start
- step_complete
- step_progress
- turn_start
- turn_complete
- turn_awaiting_input
title: AgentTurnResponseEventType
const: turn_awaiting_input
default: turn_awaiting_input
turn:
$ref: '#/components/schemas/Turn'
additionalProperties: false
required:
- event_type
- turn
title: >-
AgentTurnResponseTurnAwaitingInputPayload
AgentTurnResponseTurnCompletePayload:
type: object
properties:
event_type:
type: string
enum:
- step_start
- step_complete
- step_progress
- turn_start
- turn_complete
- turn_awaiting_input
title: AgentTurnResponseEventType
const: turn_complete
default: turn_complete
turn:
$ref: '#/components/schemas/Turn'
additionalProperties: false
required:
- event_type
- turn
title: AgentTurnResponseTurnCompletePayload
AgentTurnResponseTurnStartPayload:
type: object
properties:
event_type:
type: string
enum:
- step_start
- step_complete
- step_progress
- turn_start
- turn_complete
- turn_awaiting_input
title: AgentTurnResponseEventType
const: turn_start
default: turn_start
turn_id:
type: string
additionalProperties: false
required:
- event_type
- turn_id
title: AgentTurnResponseTurnStartPayload
OpenAIResponseInput:
oneOf:
- $ref: '#/components/schemas/OpenAIResponseOutputMessageWebSearchToolCall'
- $ref: '#/components/schemas/OpenAIResponseOutputMessageFunctionToolCall'
- $ref: '#/components/schemas/OpenAIResponseInputFunctionToolCallOutput'
- $ref: '#/components/schemas/OpenAIResponseMessage'
"OpenAIResponseInputFunctionToolCallOutput":
type: object
properties:
call_id:
type: string
output:
type: string
type:
type: string
const: function_call_output
default: function_call_output
id:
type: string
status:
type: string
additionalProperties: false
required:
- call_id
- output
- type
title: >-
OpenAIResponseInputFunctionToolCallOutput
description: >-
This represents the output of a function call that gets passed back to the
model.
OpenAIResponseInputMessageContent:
oneOf:
- $ref: '#/components/schemas/OpenAIResponseInputMessageContentText'
- $ref: '#/components/schemas/OpenAIResponseInputMessageContentImage'
discriminator:
propertyName: type
mapping:
input_text: '#/components/schemas/OpenAIResponseInputMessageContentText'
input_image: '#/components/schemas/OpenAIResponseInputMessageContentImage'
OpenAIResponseInputMessageContentImage:
type: object
properties:
detail:
oneOf:
- type: string
const: low
- type: string
const: high
- type: string
const: auto
default: auto
type:
type: string
const: input_image
default: input_image
image_url:
type: string
additionalProperties: false
required:
- detail
- type
title: OpenAIResponseInputMessageContentImage
OpenAIResponseInputMessageContentText:
type: object
properties:
text:
type: string
type:
type: string
const: input_text
default: input_text
additionalProperties: false
required:
- text
- type
title: OpenAIResponseInputMessageContentText
OpenAIResponseInputTool:
oneOf:
- $ref: '#/components/schemas/OpenAIResponseInputToolWebSearch'
- $ref: '#/components/schemas/OpenAIResponseInputToolFileSearch'
- $ref: '#/components/schemas/OpenAIResponseInputToolFunction'
discriminator:
propertyName: type
mapping:
web_search: '#/components/schemas/OpenAIResponseInputToolWebSearch'
file_search: '#/components/schemas/OpenAIResponseInputToolFileSearch'
function: '#/components/schemas/OpenAIResponseInputToolFunction'
OpenAIResponseInputToolFileSearch:
type: object
properties:
type:
type: string
const: file_search
default: file_search
vector_store_id:
type: array
items:
type: string
ranking_options:
type: object
properties:
ranker:
type: string
score_threshold:
type: number
default: 0.0
additionalProperties: false
title: FileSearchRankingOptions
additionalProperties: false
required:
- type
- vector_store_id
title: OpenAIResponseInputToolFileSearch
OpenAIResponseInputToolFunction:
type: object
properties:
type:
type: string
const: function
default: function
name:
type: string
description:
type: string
parameters:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
strict:
type: boolean
additionalProperties: false
required:
- type
- name
title: OpenAIResponseInputToolFunction
OpenAIResponseInputToolWebSearch:
type: object
properties:
type:
oneOf:
- type: string
const: web_search
- type: string
const: web_search_preview_2025_03_11
default: web_search
search_context_size:
type: string
default: medium
additionalProperties: false
required:
- type
title: OpenAIResponseInputToolWebSearch
OpenAIResponseMessage:
type: object
properties:
content:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIResponseInputMessageContent'
- type: array
items:
$ref: '#/components/schemas/OpenAIResponseOutputMessageContent'
role:
oneOf:
- type: string
const: system
- type: string
const: developer
- type: string
const: user
- type: string
const: assistant
type:
type: string
const: message
default: message
id:
type: string
status:
type: string
additionalProperties: false
required:
- content
- role
- type
title: OpenAIResponseMessage
description: >-
Corresponds to the various Message types in the Responses API. They are all
under one type because the Responses API gives them all the same "type" value,
and there is no way to tell them apart in certain scenarios.
OpenAIResponseOutputMessageContent:
type: object
properties:
text:
type: string
type:
type: string
const: output_text
default: output_text
additionalProperties: false
required:
- text
- type
title: >-
OpenAIResponseOutputMessageContentOutputText
"OpenAIResponseOutputMessageFunctionToolCall":
type: object
properties:
arguments:
type: string
call_id:
type: string
name:
type: string
type:
type: string
const: function_call
default: function_call
id:
type: string
status:
type: string
additionalProperties: false
required:
- arguments
- call_id
- name
- type
- id
- status
title: >-
OpenAIResponseOutputMessageFunctionToolCall
"OpenAIResponseOutputMessageWebSearchToolCall":
type: object
properties:
id:
type: string
status:
type: string
type:
type: string
const: web_search_call
default: web_search_call
additionalProperties: false
required:
- id
- status
- type
title: >-
OpenAIResponseOutputMessageWebSearchToolCall
CreateOpenaiResponseRequest:
type: object
properties:
input:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIResponseInput'
description: Input message(s) to create the response.
model:
type: string
description: The underlying LLM used for completions.
previous_response_id:
type: string
description: >-
(Optional) If specified, the new response continues from the previous
response. This can be used to fork new responses from existing
responses.
store:
type: boolean
stream:
type: boolean
temperature:
type: number
tools:
type: array
items:
$ref: '#/components/schemas/OpenAIResponseInputTool'
additionalProperties: false
required:
- input
- model
title: CreateOpenaiResponseRequest
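# Illustrative sketch only, not part of the generated schema: a minimal request
# body that should satisfy CreateOpenaiResponseRequest. Only `input` and `model`
# are required. The model identifier below is hypothetical.
#
#   input: "What is the capital of France?"
#   model: my-registered-llm        # hypothetical model id
#   stream: false
#   tools:
#     - type: web_search
#       search_context_size: medium
#
# `input` may also be a list of OpenAIResponseInput items, for example:
#
#   input:
#     - type: message
#       role: user
#       content: "What is the capital of France?"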
OpenAIResponseError:
type: object
properties:
code:
type: string
message:
type: string
additionalProperties: false
required:
- code
- message
title: OpenAIResponseError
OpenAIResponseObject:
type: object
properties:
created_at:
type: integer
error:
$ref: '#/components/schemas/OpenAIResponseError'
id:
type: string
model:
type: string
object:
type: string
const: response
default: response
output:
type: array
items:
$ref: '#/components/schemas/OpenAIResponseOutput'
parallel_tool_calls:
type: boolean
default: false
previous_response_id:
type: string
status:
type: string
temperature:
type: number
top_p:
type: number
truncation:
type: string
user:
type: string
additionalProperties: false
required:
- created_at
- id
- model
- object
- output
- parallel_tool_calls
- status
title: OpenAIResponseObject
OpenAIResponseOutput:
oneOf:
- $ref: '#/components/schemas/OpenAIResponseMessage'
- $ref: '#/components/schemas/OpenAIResponseOutputMessageWebSearchToolCall'
- $ref: '#/components/schemas/OpenAIResponseOutputMessageFunctionToolCall'
discriminator:
propertyName: type
mapping:
message: '#/components/schemas/OpenAIResponseMessage'
web_search_call: '#/components/schemas/OpenAIResponseOutputMessageWebSearchToolCall'
function_call: '#/components/schemas/OpenAIResponseOutputMessageFunctionToolCall'
OpenAIResponseObjectStream:
oneOf:
- $ref: '#/components/schemas/OpenAIResponseObjectStreamResponseCreated'
- $ref: '#/components/schemas/OpenAIResponseObjectStreamResponseCompleted'
discriminator:
propertyName: type
mapping:
response.created: '#/components/schemas/OpenAIResponseObjectStreamResponseCreated'
response.completed: '#/components/schemas/OpenAIResponseObjectStreamResponseCompleted'
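# Illustrative sketch only: a stream event selected by the `type` discriminator.
# The id, model, timestamp, and status values are hypothetical; the schema only
# requires `status` to be a string.
#
#   type: response.created
#   response:
#     id: resp-123                  # hypothetical response id
#     object: response
#     created_at: 1700000000
#     model: my-registered-llm      # hypothetical model id
#     status: in_progress           # assumed value
#     output: []
#     parallel_tool_calls: false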
"OpenAIResponseObjectStreamResponseCompleted":
type: object
properties:
response:
$ref: '#/components/schemas/OpenAIResponseObject'
type:
type: string
const: response.completed
default: response.completed
additionalProperties: false
required:
- response
- type
title: >-
OpenAIResponseObjectStreamResponseCompleted
"OpenAIResponseObjectStreamResponseCreated":
type: object
properties:
response:
$ref: '#/components/schemas/OpenAIResponseObject'
type:
type: string
const: response.created
default: response.created
additionalProperties: false
required:
- response
- type
title: >-
OpenAIResponseObjectStreamResponseCreated
CreateUploadSessionRequest:
type: object
properties:
bucket:
type: string
description: >-
Bucket under which the file is stored (valid chars: a-zA-Z0-9_-)
key:
type: string
description: >-
Key under which the file is stored (valid chars: a-zA-Z0-9_-/.)
mime_type:
type: string
description: MIME type of the file
size:
type: integer
description: File size in bytes
additionalProperties: false
required:
- bucket
- key
- mime_type
- size
title: CreateUploadSessionRequest
FileUploadResponse:
type: object
properties:
id:
type: string
description: ID of the upload session
url:
type: string
description: Upload URL for the file or file parts
offset:
type: integer
description: Upload content offset
size:
type: integer
description: Upload content size
additionalProperties: false
required:
- id
- url
- offset
- size
title: FileUploadResponse
description: >-
Response after initiating a file upload session.
EmbeddingsRequest:
type: object
properties:
model_id:
type: string
description: >-
The identifier of the model to use. The model must be an embedding model
registered with Llama Stack and available via the /models endpoint.
contents:
oneOf:
- type: array
items:
type: string
- type: array
items:
$ref: '#/components/schemas/InterleavedContentItem'
description: >-
List of contents to generate embeddings for. Each content can be a string
or an InterleavedContentItem (and hence can be multimodal). The behavior
depends on the model and provider. Some models may only support text.
text_truncation:
type: string
enum:
- none
- start
- end
description: >-
(Optional) Config for how to truncate text for embedding when text is
longer than the model's max sequence length.
output_dimension:
type: integer
description: >-
(Optional) Output dimensionality for the embeddings. Only supported by
Matryoshka models.
task_type:
type: string
enum:
- query
- document
description: >-
(Optional) How is the embedding being used? This is only supported by
asymmetric embedding models.
additionalProperties: false
required:
- model_id
- contents
title: EmbeddingsRequest
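# Illustrative sketch only: a request body that should satisfy EmbeddingsRequest.
# `model_id` is hypothetical and must refer to an embedding model registered
# with the stack. The optional fields apply only where the model supports them.
#
#   model_id: my-embedding-model    # hypothetical model id
#   contents:
#     - "What is retrieval-augmented generation?"
#     - "How are documents chunked?"
#   text_truncation: end
#   task_type: query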
EmbeddingsResponse:
type: object
properties:
embeddings:
type: array
items:
type: array
items:
type: number
description: >-
List of embedding vectors, one per input content. Each embedding is a
list of floats. The dimensionality of the embedding is model-specific;
you can check model metadata using /models/{model_id}.
additionalProperties: false
required:
- embeddings
title: EmbeddingsResponse
description: >-
Response containing generated embeddings.
AgentCandidate:
type: object
properties:
type:
type: string
const: agent
default: agent
config:
$ref: '#/components/schemas/AgentConfig'
description: >-
The configuration for the agent candidate.
additionalProperties: false
required:
- type
- config
title: AgentCandidate
description: An agent candidate for evaluation.
AggregationFunctionType:
type: string
enum:
- average
- weighted_average
- median
- categorical_count
- accuracy
title: AggregationFunctionType
BasicScoringFnParams:
type: object
properties:
type:
$ref: '#/components/schemas/ScoringFnParamsType'
const: basic
default: basic
aggregation_functions:
type: array
items:
$ref: '#/components/schemas/AggregationFunctionType'
additionalProperties: false
required:
- type
- aggregation_functions
title: BasicScoringFnParams
BenchmarkConfig:
type: object
properties:
eval_candidate:
$ref: '#/components/schemas/EvalCandidate'
description: The candidate to evaluate.
scoring_params:
type: object
additionalProperties:
$ref: '#/components/schemas/ScoringFnParams'
description: >-
Map between scoring function id and parameters for each scoring function
you want to run
num_examples:
type: integer
description: >-
(Optional) The number of examples to evaluate. If not provided, all examples
in the dataset will be evaluated
additionalProperties: false
required:
- eval_candidate
- scoring_params
title: BenchmarkConfig
description: >-
A benchmark configuration for evaluation.
EvalCandidate:
oneOf:
- $ref: '#/components/schemas/ModelCandidate'
- $ref: '#/components/schemas/AgentCandidate'
discriminator:
propertyName: type
mapping:
model: '#/components/schemas/ModelCandidate'
agent: '#/components/schemas/AgentCandidate'
LLMAsJudgeScoringFnParams:
type: object
properties:
type:
$ref: '#/components/schemas/ScoringFnParamsType'
const: llm_as_judge
default: llm_as_judge
judge_model:
type: string
prompt_template:
type: string
judge_score_regexes:
type: array
items:
type: string
aggregation_functions:
type: array
items:
$ref: '#/components/schemas/AggregationFunctionType'
additionalProperties: false
required:
- type
- judge_model
- judge_score_regexes
- aggregation_functions
title: LLMAsJudgeScoringFnParams
ModelCandidate:
type: object
properties:
type:
type: string
const: model
default: model
model:
type: string
description: The model ID to evaluate.
sampling_params:
$ref: '#/components/schemas/SamplingParams'
description: The sampling parameters for the model.
system_message:
$ref: '#/components/schemas/SystemMessage'
description: >-
(Optional) The system message providing instructions or context to the
model.
additionalProperties: false
required:
- type
- model
- sampling_params
title: ModelCandidate
description: A model candidate for evaluation.
RegexParserScoringFnParams:
type: object
properties:
type:
$ref: '#/components/schemas/ScoringFnParamsType'
const: regex_parser
default: regex_parser
parsing_regexes:
type: array
items:
type: string
aggregation_functions:
type: array
items:
$ref: '#/components/schemas/AggregationFunctionType'
additionalProperties: false
required:
- type
- parsing_regexes
- aggregation_functions
title: RegexParserScoringFnParams
ScoringFnParams:
oneOf:
- $ref: '#/components/schemas/LLMAsJudgeScoringFnParams'
- $ref: '#/components/schemas/RegexParserScoringFnParams'
- $ref: '#/components/schemas/BasicScoringFnParams'
discriminator:
propertyName: type
mapping:
llm_as_judge: '#/components/schemas/LLMAsJudgeScoringFnParams'
regex_parser: '#/components/schemas/RegexParserScoringFnParams'
basic: '#/components/schemas/BasicScoringFnParams'
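# Illustrative sketch only: two ScoringFnParams variants selected by `type`.
# The regexes and judge model name are hypothetical.
#
#   type: regex_parser
#   parsing_regexes:
#     - 'Answer: (\d+)'
#   aggregation_functions:
#     - accuracy
#
#   type: llm_as_judge
#   judge_model: my-judge-model     # hypothetical model id
#   judge_score_regexes:
#     - 'Score: (\d+)'
#   aggregation_functions:
#     - average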
ScoringFnParamsType:
type: string
enum:
- llm_as_judge
- regex_parser
- basic
title: ScoringFnParamsType
EvaluateRowsRequest:
type: object
properties:
input_rows:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: The rows to evaluate.
scoring_functions:
type: array
items:
type: string
description: >-
The scoring functions to use for the evaluation.
benchmark_config:
$ref: '#/components/schemas/BenchmarkConfig'
description: The configuration for the benchmark.
additionalProperties: false
required:
- input_rows
- scoring_functions
- benchmark_config
title: EvaluateRowsRequest
EvaluateResponse:
type: object
properties:
generations:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: The generations from the evaluation.
scores:
type: object
additionalProperties:
$ref: '#/components/schemas/ScoringResult'
description: The scores from the evaluation.
additionalProperties: false
required:
- generations
- scores
title: EvaluateResponse
description: The response from an evaluation.
ScoringResult:
type: object
properties:
score_rows:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: >-
The scoring result for each row. Each row is a map of column name to value.
aggregated_results:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: Map of metric name to aggregated value
additionalProperties: false
required:
- score_rows
- aggregated_results
title: ScoringResult
description: Scoring results, with per-row scores and aggregated metrics.
Agent:
type: object
properties:
agent_id:
type: string
agent_config:
$ref: '#/components/schemas/AgentConfig'
created_at:
type: string
format: date-time
additionalProperties: false
required:
- agent_id
- agent_config
- created_at
title: Agent
Session:
type: object
properties:
session_id:
type: string
session_name:
type: string
turns:
type: array
items:
$ref: '#/components/schemas/Turn'
started_at:
type: string
format: date-time
additionalProperties: false
required:
- session_id
- session_name
- turns
- started_at
title: Session
description: >-
A single session of an interaction with an Agentic System.
AgentStepResponse:
type: object
properties:
step:
oneOf:
- $ref: '#/components/schemas/InferenceStep'
- $ref: '#/components/schemas/ToolExecutionStep'
- $ref: '#/components/schemas/ShieldCallStep'
- $ref: '#/components/schemas/MemoryRetrievalStep'
discriminator:
propertyName: step_type
mapping:
inference: '#/components/schemas/InferenceStep'
tool_execution: '#/components/schemas/ToolExecutionStep'
shield_call: '#/components/schemas/ShieldCallStep'
memory_retrieval: '#/components/schemas/MemoryRetrievalStep'
additionalProperties: false
required:
- step
title: AgentStepResponse
Benchmark:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: benchmark
default: benchmark
dataset_id:
type: string
scoring_functions:
type: array
items:
type: string
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- identifier
- provider_id
- type
- dataset_id
- scoring_functions
- metadata
title: Benchmark
DataSource:
oneOf:
- $ref: '#/components/schemas/URIDataSource'
- $ref: '#/components/schemas/RowsDataSource'
discriminator:
propertyName: type
mapping:
uri: '#/components/schemas/URIDataSource'
rows: '#/components/schemas/RowsDataSource'
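# Illustrative sketch only: the two DataSource variants selected by `type`,
# reusing the sample values from the field descriptions in this file.
#
#   type: uri
#   uri: "https://mywebsite.com/mydata.jsonl"
#
#   type: rows
#   rows:
#     - messages:
#         - role: user
#           content: "Hello, world!"
#         - role: assistant
#           content: "Hello, world!"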
Dataset:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: dataset
default: dataset
purpose:
type: string
enum:
- post-training/messages
- eval/question-answer
- eval/messages-answer
title: DatasetPurpose
description: >-
Purpose of the dataset. Each purpose has a required input data schema.
source:
$ref: '#/components/schemas/DataSource'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- identifier
- provider_id
- type
- purpose
- source
- metadata
title: Dataset
RowsDataSource:
type: object
properties:
type:
type: string
const: rows
default: rows
rows:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: >-
The dataset is stored in rows. E.g. - [ {"messages": [{"role": "user",
"content": "Hello, world!"}, {"role": "assistant", "content": "Hello,
world!"}]} ]
additionalProperties: false
required:
- type
- rows
title: RowsDataSource
description: A dataset stored in rows.
URIDataSource:
type: object
properties:
type:
type: string
const: uri
default: uri
uri:
type: string
description: >-
The dataset can be obtained from a URI. E.g. - "https://mywebsite.com/mydata.jsonl"
- "lsfs://mydata.jsonl" - "data:csv;base64,{base64_content}"
additionalProperties: false
required:
- type
- uri
title: URIDataSource
description: >-
A dataset that can be obtained from a URI.
FileResponse:
type: object
properties:
bucket:
type: string
description: >-
Bucket under which the file is stored (valid chars: a-zA-Z0-9_-)
key:
type: string
description: >-
Key under which the file is stored (valid chars: a-zA-Z0-9_-/.)
mime_type:
type: string
description: MIME type of the file
url:
type: string
description: Upload URL for the file contents
bytes:
type: integer
description: Size of the file in bytes
created_at:
type: integer
description: Timestamp of when the file was created
additionalProperties: false
required:
- bucket
- key
- mime_type
- url
- bytes
- created_at
title: FileResponse
description: Response representing a file entry.
Model:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: model
default: model
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
model_type:
$ref: '#/components/schemas/ModelType'
default: llm
additionalProperties: false
required:
- identifier
- provider_id
- type
- metadata
- model_type
title: Model
ModelType:
type: string
enum:
- llm
- embedding
title: ModelType
AgentTurnInputType:
type: object
properties:
type:
type: string
const: agent_turn_input
default: agent_turn_input
additionalProperties: false
required:
- type
title: AgentTurnInputType
ArrayType:
type: object
properties:
type:
type: string
const: array
default: array
additionalProperties: false
required:
- type
title: ArrayType
BooleanType:
type: object
properties:
type:
type: string
const: boolean
default: boolean
additionalProperties: false
required:
- type
title: BooleanType
ChatCompletionInputType:
type: object
properties:
type:
type: string
const: chat_completion_input
default: chat_completion_input
additionalProperties: false
required:
- type
title: ChatCompletionInputType
CompletionInputType:
type: object
properties:
type:
type: string
const: completion_input
default: completion_input
additionalProperties: false
required:
- type
title: CompletionInputType
JsonType:
type: object
properties:
type:
type: string
const: json
default: json
additionalProperties: false
required:
- type
title: JsonType
NumberType:
type: object
properties:
type:
type: string
const: number
default: number
additionalProperties: false
required:
- type
title: NumberType
ObjectType:
type: object
properties:
type:
type: string
const: object
default: object
additionalProperties: false
required:
- type
title: ObjectType
ParamType:
oneOf:
- $ref: '#/components/schemas/StringType'
- $ref: '#/components/schemas/NumberType'
- $ref: '#/components/schemas/BooleanType'
- $ref: '#/components/schemas/ArrayType'
- $ref: '#/components/schemas/ObjectType'
- $ref: '#/components/schemas/JsonType'
- $ref: '#/components/schemas/UnionType'
- $ref: '#/components/schemas/ChatCompletionInputType'
- $ref: '#/components/schemas/CompletionInputType'
- $ref: '#/components/schemas/AgentTurnInputType'
discriminator:
propertyName: type
mapping:
string: '#/components/schemas/StringType'
number: '#/components/schemas/NumberType'
boolean: '#/components/schemas/BooleanType'
array: '#/components/schemas/ArrayType'
object: '#/components/schemas/ObjectType'
json: '#/components/schemas/JsonType'
union: '#/components/schemas/UnionType'
chat_completion_input: '#/components/schemas/ChatCompletionInputType'
completion_input: '#/components/schemas/CompletionInputType'
agent_turn_input: '#/components/schemas/AgentTurnInputType'
ScoringFn:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: scoring_function
default: scoring_function
description:
type: string
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
return_type:
$ref: '#/components/schemas/ParamType'
params:
$ref: '#/components/schemas/ScoringFnParams'
additionalProperties: false
required:
- identifier
- provider_id
- type
- metadata
- return_type
title: ScoringFn
StringType:
type: object
properties:
type:
type: string
const: string
default: string
additionalProperties: false
required:
- type
title: StringType
UnionType:
type: object
properties:
type:
type: string
const: union
default: union
additionalProperties: false
required:
- type
title: UnionType
Shield:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: shield
default: shield
params:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- identifier
- provider_id
- type
title: Shield
description: >-
A safety shield resource that can be used to check content.
Span:
type: object
properties:
span_id:
type: string
trace_id:
type: string
parent_span_id:
type: string
name:
type: string
start_time:
type: string
format: date-time
end_time:
type: string
format: date-time
attributes:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- span_id
- trace_id
- name
- start_time
title: Span
GetSpanTreeRequest:
type: object
properties:
attributes_to_return:
type: array
items:
type: string
max_depth:
type: integer
additionalProperties: false
title: GetSpanTreeRequest
SpanStatus:
type: string
enum:
- ok
- error
title: SpanStatus
SpanWithStatus:
type: object
properties:
span_id:
type: string
trace_id:
type: string
parent_span_id:
type: string
name:
type: string
start_time:
type: string
format: date-time
end_time:
type: string
format: date-time
attributes:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
status:
$ref: '#/components/schemas/SpanStatus'
additionalProperties: false
required:
- span_id
- trace_id
- name
- start_time
title: SpanWithStatus
QuerySpanTreeResponse:
type: object
properties:
data:
type: object
additionalProperties:
$ref: '#/components/schemas/SpanWithStatus'
additionalProperties: false
required:
- data
title: QuerySpanTreeResponse
Tool:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: tool
default: tool
toolgroup_id:
type: string
tool_host:
$ref: '#/components/schemas/ToolHost'
description:
type: string
parameters:
type: array
items:
$ref: '#/components/schemas/ToolParameter'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- identifier
- provider_id
- type
- toolgroup_id
- tool_host
- description
- parameters
title: Tool
ToolHost:
type: string
enum:
- distribution
- client
- model_context_protocol
title: ToolHost
ToolGroup:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: tool_group
default: tool_group
mcp_endpoint:
$ref: '#/components/schemas/URL'
args:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- identifier
- provider_id
- type
title: ToolGroup
Trace:
type: object
properties:
trace_id:
type: string
root_span_id:
type: string
start_time:
type: string
format: date-time
end_time:
type: string
format: date-time
additionalProperties: false
required:
- trace_id
- root_span_id
- start_time
title: Trace
Checkpoint:
description: Checkpoint created during training runs
title: Checkpoint
PostTrainingJobArtifactsResponse:
type: object
properties:
job_uuid:
type: string
checkpoints:
type: array
items:
$ref: '#/components/schemas/Checkpoint'
additionalProperties: false
required:
- job_uuid
- checkpoints
title: PostTrainingJobArtifactsResponse
description: Artifacts of a finetuning job.
PostTrainingJobStatusResponse:
type: object
properties:
job_uuid:
type: string
status:
type: string
enum:
- completed
- in_progress
- failed
- scheduled
- cancelled
title: JobStatus
scheduled_at:
type: string
format: date-time
started_at:
type: string
format: date-time
completed_at:
type: string
format: date-time
resources_allocated:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
checkpoints:
type: array
items:
$ref: '#/components/schemas/Checkpoint'
additionalProperties: false
required:
- job_uuid
- status
- checkpoints
title: PostTrainingJobStatusResponse
description: Status of a finetuning job.
ListPostTrainingJobsResponse:
type: object
properties:
data:
type: array
items:
type: object
properties:
job_uuid:
type: string
additionalProperties: false
required:
- job_uuid
title: PostTrainingJob
additionalProperties: false
required:
- data
title: ListPostTrainingJobsResponse
VectorDB:
type: object
properties:
identifier:
type: string
provider_resource_id:
type: string
provider_id:
type: string
type:
type: string
enum:
- model
- shield
- vector_db
- dataset
- scoring_function
- benchmark
- tool
- tool_group
title: ResourceType
const: vector_db
default: vector_db
embedding_model:
type: string
embedding_dimension:
type: integer
additionalProperties: false
required:
- identifier
- provider_id
- type
- embedding_model
- embedding_dimension
title: VectorDB
HealthInfo:
type: object
properties:
status:
type: string
enum:
- OK
- Error
- Not Implemented
title: HealthStatus
additionalProperties: false
required:
- status
title: HealthInfo
RAGDocument:
type: object
properties:
document_id:
type: string
description: The unique identifier for the document.
content:
oneOf:
- type: string
- $ref: '#/components/schemas/InterleavedContentItem'
- type: array
items:
$ref: '#/components/schemas/InterleavedContentItem'
- $ref: '#/components/schemas/URL'
description: The content of the document.
mime_type:
type: string
description: The MIME type of the document.
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: Additional metadata for the document.
additionalProperties: false
required:
- document_id
- content
- metadata
title: RAGDocument
description: >-
A document to be used for document ingestion in the RAG Tool.
InsertRequest:
type: object
properties:
documents:
type: array
items:
$ref: '#/components/schemas/RAGDocument'
vector_db_id:
type: string
chunk_size_in_tokens:
type: integer
additionalProperties: false
required:
- documents
- vector_db_id
- chunk_size_in_tokens
title: InsertRequest
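# Illustrative sketch only: a request body that should satisfy InsertRequest.
# All identifiers and values are hypothetical. Each RAGDocument requires
# `document_id`, `content`, and `metadata`.
#
#   vector_db_id: my-vector-db      # hypothetical vector DB id
#   chunk_size_in_tokens: 512
#   documents:
#     - document_id: doc-1
#       content: "Plain-text body of the document."
#       mime_type: text/plain
#       metadata:
#         author: "Jane Doe"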
InsertChunksRequest:
type: object
properties:
vector_db_id:
type: string
chunks:
type: array
items:
type: object
properties:
content:
$ref: '#/components/schemas/InterleavedContent'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- content
- metadata
title: Chunk
ttl_seconds:
type: integer
additionalProperties: false
required:
- vector_db_id
- chunks
title: InsertChunksRequest
ProviderInfo:
type: object
properties:
api:
type: string
provider_id:
type: string
provider_type:
type: string
config:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
health:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- api
- provider_id
- provider_type
- config
- health
title: ProviderInfo
InvokeToolRequest:
type: object
properties:
tool_name:
type: string
kwargs:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- tool_name
- kwargs
title: InvokeToolRequest
ToolInvocationResult:
type: object
properties:
content:
$ref: '#/components/schemas/InterleavedContent'
error_message:
type: string
error_code:
type: integer
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
title: ToolInvocationResult
PaginatedResponse:
type: object
properties:
data:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: The list of items for the current page
has_more:
type: boolean
description: >-
Whether there are more items available after this set
additionalProperties: false
required:
- data
- has_more
title: PaginatedResponse
description: >-
A generic paginated response that follows a simple format.
Job:
type: object
properties:
job_id:
type: string
status:
type: string
enum:
- completed
- in_progress
- failed
- scheduled
- cancelled
title: JobStatus
additionalProperties: false
required:
- job_id
- status
title: Job
BucketResponse:
type: object
properties:
name:
type: string
additionalProperties: false
required:
- name
title: BucketResponse
ListBucketResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/BucketResponse'
description: List of BucketResponse entries
additionalProperties: false
required:
- data
title: ListBucketResponse
description: >-
Response representing a list of bucket entries.
ListBenchmarksResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Benchmark'
additionalProperties: false
required:
- data
title: ListBenchmarksResponse
ListDatasetsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Dataset'
additionalProperties: false
required:
- data
title: ListDatasetsResponse
ListFileResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/FileResponse'
description: List of FileResponse entries
additionalProperties: false
required:
- data
title: ListFileResponse
description: >-
Response representing a list of file entries.
ListModelsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Model'
additionalProperties: false
required:
- data
title: ListModelsResponse
ListProvidersResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/ProviderInfo'
additionalProperties: false
required:
- data
title: ListProvidersResponse
RouteInfo:
type: object
properties:
route:
type: string
method:
type: string
provider_types:
type: array
items:
type: string
additionalProperties: false
required:
- route
- method
- provider_types
title: RouteInfo
ListRoutesResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/RouteInfo'
additionalProperties: false
required:
- data
title: ListRoutesResponse
ListToolDefsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/ToolDef'
additionalProperties: false
required:
- data
title: ListToolDefsResponse
ListScoringFunctionsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/ScoringFn'
additionalProperties: false
required:
- data
title: ListScoringFunctionsResponse
ListShieldsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Shield'
additionalProperties: false
required:
- data
title: ListShieldsResponse
ListToolGroupsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/ToolGroup'
additionalProperties: false
required:
- data
title: ListToolGroupsResponse
ListToolsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Tool'
additionalProperties: false
required:
- data
title: ListToolsResponse
ListVectorDBsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/VectorDB'
additionalProperties: false
required:
- data
title: ListVectorDBsResponse
Event:
oneOf:
- $ref: '#/components/schemas/UnstructuredLogEvent'
- $ref: '#/components/schemas/MetricEvent'
- $ref: '#/components/schemas/StructuredLogEvent'
discriminator:
propertyName: type
mapping:
unstructured_log: '#/components/schemas/UnstructuredLogEvent'
metric: '#/components/schemas/MetricEvent'
structured_log: '#/components/schemas/StructuredLogEvent'
EventType:
type: string
enum:
- unstructured_log
- structured_log
- metric
title: EventType
LogSeverity:
type: string
enum:
- verbose
- debug
- info
- warn
- error
- critical
title: LogSeverity
MetricEvent:
type: object
properties:
trace_id:
type: string
span_id:
type: string
timestamp:
type: string
format: date-time
attributes:
type: object
additionalProperties:
oneOf:
- type: string
- type: integer
- type: number
- type: boolean
- type: 'null'
type:
$ref: '#/components/schemas/EventType'
const: metric
default: metric
metric:
type: string
value:
oneOf:
- type: integer
- type: number
unit:
type: string
additionalProperties: false
required:
- trace_id
- span_id
- timestamp
- type
- metric
- value
- unit
title: MetricEvent
SpanEndPayload:
type: object
properties:
type:
$ref: '#/components/schemas/StructuredLogType'
const: span_end
default: span_end
status:
$ref: '#/components/schemas/SpanStatus'
additionalProperties: false
required:
- type
- status
title: SpanEndPayload
SpanStartPayload:
type: object
properties:
type:
$ref: '#/components/schemas/StructuredLogType'
const: span_start
default: span_start
name:
type: string
parent_span_id:
type: string
additionalProperties: false
required:
- type
- name
title: SpanStartPayload
StructuredLogEvent:
type: object
properties:
trace_id:
type: string
span_id:
type: string
timestamp:
type: string
format: date-time
attributes:
type: object
additionalProperties:
oneOf:
- type: string
- type: integer
- type: number
- type: boolean
- type: 'null'
type:
$ref: '#/components/schemas/EventType'
const: structured_log
default: structured_log
payload:
$ref: '#/components/schemas/StructuredLogPayload'
additionalProperties: false
required:
- trace_id
- span_id
- timestamp
- type
- payload
title: StructuredLogEvent
StructuredLogPayload:
oneOf:
- $ref: '#/components/schemas/SpanStartPayload'
- $ref: '#/components/schemas/SpanEndPayload'
discriminator:
propertyName: type
mapping:
span_start: '#/components/schemas/SpanStartPayload'
span_end: '#/components/schemas/SpanEndPayload'
StructuredLogType:
type: string
enum:
- span_start
- span_end
title: StructuredLogType
UnstructuredLogEvent:
type: object
properties:
trace_id:
type: string
span_id:
type: string
timestamp:
type: string
format: date-time
attributes:
type: object
additionalProperties:
oneOf:
- type: string
- type: integer
- type: number
- type: boolean
- type: 'null'
type:
$ref: '#/components/schemas/EventType'
const: unstructured_log
default: unstructured_log
message:
type: string
severity:
$ref: '#/components/schemas/LogSeverity'
additionalProperties: false
required:
- trace_id
- span_id
- timestamp
- type
- message
- severity
title: UnstructuredLogEvent
LogEventRequest:
type: object
properties:
event:
$ref: '#/components/schemas/Event'
ttl_seconds:
type: integer
additionalProperties: false
required:
- event
- ttl_seconds
title: LogEventRequest
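# Illustrative sketch, not part of the generated schema: a LogEventRequest
# body carrying an UnstructuredLogEvent. The `type` value selects the event
# variant through the Event discriminator. All IDs, the message, and the
# attribute key are hypothetical placeholders.
#
#   event:
#     type: unstructured_log
#     trace_id: trace-123
#     span_id: span-456
#     timestamp: "2025-01-01T00:00:00Z"   # date-time string
#     message: Vector DB lookup finished
#     severity: info                      # one of the LogSeverity values
#     attributes:
#       vector_db_id: my-docs
#   ttl_seconds: 3600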
OpenAIAssistantMessageParam:
type: object
properties:
role:
type: string
const: assistant
default: assistant
description: >-
Must be "assistant" to identify this as the model's response
content:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionContentPartParam'
description: The content of the model's response
name:
type: string
description: >-
(Optional) The name of the assistant message participant.
tool_calls:
type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
description: >-
List of tool calls. Each tool call is an OpenAIChatCompletionToolCall
object.
additionalProperties: false
required:
- role
title: OpenAIAssistantMessageParam
description: >-
A message containing the model's (assistant) response in an OpenAI-compatible
chat completion request.
"OpenAIChatCompletionContentPartImageParam":
type: object
properties:
type:
type: string
const: image_url
default: image_url
image_url:
$ref: '#/components/schemas/OpenAIImageURL'
additionalProperties: false
required:
- type
- image_url
title: >-
OpenAIChatCompletionContentPartImageParam
OpenAIChatCompletionContentPartParam:
oneOf:
- $ref: '#/components/schemas/OpenAIChatCompletionContentPartTextParam'
- $ref: '#/components/schemas/OpenAIChatCompletionContentPartImageParam'
discriminator:
propertyName: type
mapping:
text: '#/components/schemas/OpenAIChatCompletionContentPartTextParam'
image_url: '#/components/schemas/OpenAIChatCompletionContentPartImageParam'
OpenAIChatCompletionContentPartTextParam:
type: object
properties:
type:
type: string
const: text
default: text
text:
type: string
additionalProperties: false
required:
- type
- text
title: OpenAIChatCompletionContentPartTextParam
OpenAIChatCompletionToolCall:
type: object
properties:
index:
type: integer
id:
type: string
type:
type: string
const: function
default: function
function:
$ref: '#/components/schemas/OpenAIChatCompletionToolCallFunction'
additionalProperties: false
required:
- type
title: OpenAIChatCompletionToolCall
OpenAIChatCompletionToolCallFunction:
type: object
properties:
name:
type: string
arguments:
type: string
additionalProperties: false
title: OpenAIChatCompletionToolCallFunction
OpenAIDeveloperMessageParam:
type: object
properties:
role:
type: string
const: developer
default: developer
description: >-
Must be "developer" to identify this as a developer message
content:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionContentPartParam'
description: The content of the developer message
name:
type: string
description: >-
(Optional) The name of the developer message participant.
additionalProperties: false
required:
- role
- content
title: OpenAIDeveloperMessageParam
description: >-
A message from the developer in an OpenAI-compatible chat completion request.
OpenAIImageURL:
type: object
properties:
url:
type: string
detail:
type: string
additionalProperties: false
required:
- url
title: OpenAIImageURL
OpenAIJSONSchema:
type: object
properties:
name:
type: string
description:
type: string
strict:
type: boolean
schema:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- name
title: OpenAIJSONSchema
OpenAIMessageParam:
oneOf:
- $ref: '#/components/schemas/OpenAIUserMessageParam'
- $ref: '#/components/schemas/OpenAISystemMessageParam'
- $ref: '#/components/schemas/OpenAIAssistantMessageParam'
- $ref: '#/components/schemas/OpenAIToolMessageParam'
- $ref: '#/components/schemas/OpenAIDeveloperMessageParam'
discriminator:
propertyName: role
mapping:
user: '#/components/schemas/OpenAIUserMessageParam'
system: '#/components/schemas/OpenAISystemMessageParam'
assistant: '#/components/schemas/OpenAIAssistantMessageParam'
tool: '#/components/schemas/OpenAIToolMessageParam'
developer: '#/components/schemas/OpenAIDeveloperMessageParam'
OpenAIResponseFormatJSONObject:
type: object
properties:
type:
type: string
const: json_object
default: json_object
additionalProperties: false
required:
- type
title: OpenAIResponseFormatJSONObject
OpenAIResponseFormatJSONSchema:
type: object
properties:
type:
type: string
const: json_schema
default: json_schema
json_schema:
$ref: '#/components/schemas/OpenAIJSONSchema'
additionalProperties: false
required:
- type
- json_schema
title: OpenAIResponseFormatJSONSchema
OpenAIResponseFormatParam:
oneOf:
- $ref: '#/components/schemas/OpenAIResponseFormatText'
- $ref: '#/components/schemas/OpenAIResponseFormatJSONSchema'
- $ref: '#/components/schemas/OpenAIResponseFormatJSONObject'
discriminator:
propertyName: type
mapping:
text: '#/components/schemas/OpenAIResponseFormatText'
json_schema: '#/components/schemas/OpenAIResponseFormatJSONSchema'
json_object: '#/components/schemas/OpenAIResponseFormatJSONObject'
OpenAIResponseFormatText:
type: object
properties:
type:
type: string
const: text
default: text
additionalProperties: false
required:
- type
title: OpenAIResponseFormatText
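# Illustrative sketch, not part of the generated schema: a `response_format`
# value that selects the json_schema variant of OpenAIResponseFormatParam.
# The schema name and its properties are hypothetical; only `type`,
# `json_schema`, and `json_schema.name` are required by the definitions above.
#
#   response_format:
#     type: json_schema
#     json_schema:
#       name: weather_report
#       strict: true
#       schema:
#         type: object
#         properties:
#           city:
#             type: string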
OpenAISystemMessageParam:
type: object
properties:
role:
type: string
const: system
default: system
description: >-
Must be "system" to identify this as a system message
content:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionContentPartParam'
description: >-
The content of the "system prompt". If multiple system messages are provided,
they are concatenated. The underlying Llama Stack code may also add other
system messages (for example, for formatting tool definitions).
name:
type: string
description: >-
(Optional) The name of the system message participant.
additionalProperties: false
required:
- role
- content
title: OpenAISystemMessageParam
description: >-
A system message providing instructions or context to the model.
OpenAIToolMessageParam:
type: object
properties:
role:
type: string
const: tool
default: tool
description: >-
Must be "tool" to identify this as a tool response
tool_call_id:
type: string
description: >-
Unique identifier for the tool call this response is for
content:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionContentPartParam'
description: The response content from the tool
additionalProperties: false
required:
- role
- tool_call_id
- content
title: OpenAIToolMessageParam
description: >-
A message representing the result of a tool invocation in an OpenAI-compatible
chat completion request.
OpenAIUserMessageParam:
type: object
properties:
role:
type: string
const: user
default: user
description: >-
Must be "user" to identify this as a user message
content:
oneOf:
- type: string
- type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionContentPartParam'
description: >-
The content of the message, which can include text and other media
name:
type: string
description: >-
(Optional) The name of the user message participant.
additionalProperties: false
required:
- role
- content
title: OpenAIUserMessageParam
description: >-
A message from the user in an OpenAI-compatible chat completion request.
OpenaiChatCompletionRequest:
type: object
properties:
model:
type: string
description: >-
The identifier of the model to use. The model must be registered with
Llama Stack and available via the /models endpoint.
messages:
type: array
items:
$ref: '#/components/schemas/OpenAIMessageParam'
description: List of messages in the conversation
frequency_penalty:
type: number
description: >-
(Optional) Penalty applied to new tokens in proportion to how often they
have already appeared in the text so far
function_call:
oneOf:
- type: string
- type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: (Optional) The function call to use
functions:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: (Optional) List of functions to use
logit_bias:
type: object
additionalProperties:
type: number
description: (Optional) The logit bias to use
logprobs:
type: boolean
description: (Optional) Whether to return log probabilities of the output tokens
max_completion_tokens:
type: integer
description: >-
(Optional) The maximum number of tokens to generate
max_tokens:
type: integer
description: >-
(Optional) The maximum number of tokens to generate
n:
type: integer
description: >-
(Optional) The number of completions to generate
parallel_tool_calls:
type: boolean
description: >-
(Optional) Whether to parallelize tool calls
presence_penalty:
type: number
description: >-
(Optional) Penalty applied to new tokens that have already appeared at
least once in the text so far
response_format:
$ref: '#/components/schemas/OpenAIResponseFormatParam'
description: (Optional) The response format to use
seed:
type: integer
description: (Optional) The seed to use
stop:
oneOf:
- type: string
- type: array
items:
type: string
description: (Optional) The stop tokens to use
stream:
type: boolean
description: >-
(Optional) Whether to stream the response
stream_options:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: (Optional) The stream options to use
temperature:
type: number
description: (Optional) The temperature to use
tool_choice:
oneOf:
- type: string
- type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: (Optional) The tool choice to use
tools:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: (Optional) The tools to use
top_logprobs:
type: integer
description: >-
(Optional) The number of most likely tokens to return log probabilities
for at each token position
top_p:
type: number
description: (Optional) The top p to use
user:
type: string
description: (Optional) An identifier for the end user making the request
additionalProperties: false
required:
- model
- messages
title: OpenaiChatCompletionRequest
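# Illustrative sketch, not part of the generated schema: a minimal
# OpenaiChatCompletionRequest body. Only `model` and `messages` are required.
# Each message is matched to its schema by the `role` discriminator, and
# `content` may be a plain string or a list of typed content parts.
# The model identifier and image URL are hypothetical.
#
#   model: my-registered-model
#   messages:
#     - role: system
#       content: You are a helpful assistant.
#     - role: user
#       content:
#         - type: text
#           text: Describe this image.
#         - type: image_url
#           image_url:
#             url: https://example.com/cat.png
#   temperature: 0.2
#   max_tokens: 256
#   stream: false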
OpenAIChatCompletion:
type: object
properties:
id:
type: string
description: The ID of the chat completion
choices:
type: array
items:
$ref: '#/components/schemas/OpenAIChoice'
description: List of choices
object:
type: string
const: chat.completion
default: chat.completion
description: >-
The object type, which will be "chat.completion"
created:
type: integer
description: >-
The Unix timestamp in seconds when the chat completion was created
model:
type: string
description: >-
The model that was used to generate the chat completion
additionalProperties: false
required:
- id
- choices
- object
- created
- model
title: OpenAIChatCompletion
description: >-
Response from an OpenAI-compatible chat completion request.
OpenAIChatCompletionChunk:
type: object
properties:
id:
type: string
description: The ID of the chat completion
choices:
type: array
items:
$ref: '#/components/schemas/OpenAIChunkChoice'
description: List of choices
object:
type: string
const: chat.completion.chunk
default: chat.completion.chunk
description: >-
The object type, which will be "chat.completion.chunk"
created:
type: integer
description: >-
The Unix timestamp in seconds when the chat completion was created
model:
type: string
description: >-
The model that was used to generate the chat completion
additionalProperties: false
required:
- id
- choices
- object
- created
- model
title: OpenAIChatCompletionChunk
description: >-
Chunk from a streaming response to an OpenAI-compatible chat completion request.
OpenAIChoice:
type: object
properties:
message:
$ref: '#/components/schemas/OpenAIMessageParam'
description: The message from the model
finish_reason:
type: string
description: The reason the model stopped generating
index:
type: integer
description: The index of the choice
logprobs:
$ref: '#/components/schemas/OpenAIChoiceLogprobs'
description: >-
(Optional) The log probabilities for the tokens in the message
additionalProperties: false
required:
- message
- finish_reason
- index
title: OpenAIChoice
description: >-
A choice from an OpenAI-compatible chat completion response.
OpenAIChoiceDelta:
type: object
properties:
content:
type: string
description: (Optional) The content of the delta
refusal:
type: string
description: (Optional) The refusal of the delta
role:
type: string
description: (Optional) The role of the delta
tool_calls:
type: array
items:
$ref: '#/components/schemas/OpenAIChatCompletionToolCall'
description: (Optional) The tool calls of the delta
additionalProperties: false
title: OpenAIChoiceDelta
description: >-
A delta from an OpenAI-compatible chat completion streaming response.
OpenAIChoiceLogprobs:
type: object
properties:
content:
type: array
items:
$ref: '#/components/schemas/OpenAITokenLogProb'
description: >-
(Optional) The log probabilities for the tokens in the message
refusal:
type: array
items:
$ref: '#/components/schemas/OpenAITokenLogProb'
description: >-
(Optional) The log probabilities for the tokens in the refusal
additionalProperties: false
title: OpenAIChoiceLogprobs
description: >-
The log probabilities for the tokens in the message from an OpenAI-compatible
chat completion response.
OpenAIChunkChoice:
type: object
properties:
delta:
$ref: '#/components/schemas/OpenAIChoiceDelta'
description: The delta from the chunk
finish_reason:
type: string
description: The reason the model stopped generating
index:
type: integer
description: The index of the choice
logprobs:
$ref: '#/components/schemas/OpenAIChoiceLogprobs'
description: >-
(Optional) The log probabilities for the tokens in the message
additionalProperties: false
required:
- delta
- finish_reason
- index
title: OpenAIChunkChoice
description: >-
A chunk choice from an OpenAI-compatible chat completion streaming response.
OpenAITokenLogProb:
type: object
properties:
token:
type: string
bytes:
type: array
items:
type: integer
logprob:
type: number
top_logprobs:
type: array
items:
$ref: '#/components/schemas/OpenAITopLogProb'
additionalProperties: false
required:
- token
- logprob
- top_logprobs
title: OpenAITokenLogProb
description: >-
The log probability for a token from an OpenAI-compatible chat completion
response.
OpenAITopLogProb:
type: object
properties:
token:
type: string
bytes:
type: array
items:
type: integer
logprob:
type: number
additionalProperties: false
required:
- token
- logprob
title: OpenAITopLogProb
description: >-
The top log probability for a token from an OpenAI-compatible chat completion
response.
OpenaiCompletionRequest:
type: object
properties:
model:
type: string
description: >-
The identifier of the model to use. The model must be registered with
Llama Stack and available via the /models endpoint.
prompt:
oneOf:
- type: string
- type: array
items:
type: string
- type: array
items:
type: integer
- type: array
items:
type: array
items:
type: integer
description: The prompt to generate a completion for
best_of:
type: integer
description: >-
(Optional) The number of candidate completions to generate server-side,
from which the best are returned
echo:
type: boolean
description: (Optional) Whether to echo the prompt
frequency_penalty:
type: number
description: >-
(Optional) Penalty applied to new tokens in proportion to how often they
have already appeared in the text so far
logit_bias:
type: object
additionalProperties:
type: number
description: (Optional) The logit bias to use
logprobs:
type: boolean
description: (Optional) Whether to return log probabilities of the output tokens
max_tokens:
type: integer
description: >-
(Optional) The maximum number of tokens to generate
n:
type: integer
description: >-
(Optional) The number of completions to generate
presence_penalty:
type: number
description: >-
(Optional) Penalty applied to new tokens that have already appeared at
least once in the text so far
seed:
type: integer
description: (Optional) The seed to use
stop:
oneOf:
- type: string
- type: array
items:
type: string
description: (Optional) The stop tokens to use
stream:
type: boolean
description: >-
(Optional) Whether to stream the response
stream_options:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: (Optional) The stream options to use
temperature:
type: number
description: (Optional) The temperature to use
top_p:
type: number
description: (Optional) The top p to use
user:
type: string
description: (Optional) The user to use
guided_choice:
type: array
items:
type: string
prompt_logprobs:
type: integer
additionalProperties: false
required:
- model
- prompt
title: OpenaiCompletionRequest
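# Illustrative sketch, not part of the generated schema: an
# OpenaiCompletionRequest body using the list-of-strings form of `prompt`
# (the schema also accepts a single string, a list of token IDs, or a list
# of token-ID lists). The model identifier and prompts are hypothetical.
#
#   model: my-registered-model
#   prompt:
#     - "Once upon a time"
#     - "In a galaxy far away"
#   max_tokens: 64
#   n: 1
#   stop:
#     - "\n\n"
#   stream: false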
OpenAICompletion:
type: object
properties:
id:
type: string
choices:
type: array
items:
$ref: '#/components/schemas/OpenAICompletionChoice'
created:
type: integer
model:
type: string
object:
type: string
const: text_completion
default: text_completion
additionalProperties: false
required:
- id
- choices
- created
- model
- object
title: OpenAICompletion
description: >-
Response from an OpenAI-compatible completion request.
OpenAICompletionChoice:
type: object
properties:
finish_reason:
type: string
text:
type: string
index:
type: integer
logprobs:
$ref: '#/components/schemas/OpenAIChoiceLogprobs'
additionalProperties: false
required:
- finish_reason
- text
- index
title: OpenAICompletionChoice
description: >-
A choice from an OpenAI-compatible completion response.
OpenAIModel:
type: object
properties:
id:
type: string
object:
type: string
const: model
default: model
created:
type: integer
owned_by:
type: string
additionalProperties: false
required:
- id
- object
- created
- owned_by
title: OpenAIModel
description: A model entry in the OpenAI-compatible model listing format.
OpenAIListModelsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/OpenAIModel'
additionalProperties: false
required:
- data
title: OpenAIListModelsResponse
DPOAlignmentConfig:
type: object
properties:
reward_scale:
type: number
reward_clip:
type: number
epsilon:
type: number
gamma:
type: number
additionalProperties: false
required:
- reward_scale
- reward_clip
- epsilon
- gamma
title: DPOAlignmentConfig
DataConfig:
type: object
properties:
dataset_id:
type: string
batch_size:
type: integer
shuffle:
type: boolean
data_format:
$ref: '#/components/schemas/DatasetFormat'
validation_dataset_id:
type: string
packed:
type: boolean
default: false
train_on_input:
type: boolean
default: false
additionalProperties: false
required:
- dataset_id
- batch_size
- shuffle
- data_format
title: DataConfig
DatasetFormat:
type: string
enum:
- instruct
- dialog
title: DatasetFormat
EfficiencyConfig:
type: object
properties:
enable_activation_checkpointing:
type: boolean
default: false
enable_activation_offloading:
type: boolean
default: false
memory_efficient_fsdp_wrap:
type: boolean
default: false
fsdp_cpu_offload:
type: boolean
default: false
additionalProperties: false
title: EfficiencyConfig
OptimizerConfig:
type: object
properties:
optimizer_type:
$ref: '#/components/schemas/OptimizerType'
lr:
type: number
weight_decay:
type: number
num_warmup_steps:
type: integer
additionalProperties: false
required:
- optimizer_type
- lr
- weight_decay
- num_warmup_steps
title: OptimizerConfig
OptimizerType:
type: string
enum:
- adam
- adamw
- sgd
title: OptimizerType
TrainingConfig:
type: object
properties:
n_epochs:
type: integer
max_steps_per_epoch:
type: integer
default: 1
gradient_accumulation_steps:
type: integer
default: 1
max_validation_steps:
type: integer
default: 1
data_config:
$ref: '#/components/schemas/DataConfig'
optimizer_config:
$ref: '#/components/schemas/OptimizerConfig'
efficiency_config:
$ref: '#/components/schemas/EfficiencyConfig'
dtype:
type: string
default: bf16
additionalProperties: false
required:
- n_epochs
- max_steps_per_epoch
- gradient_accumulation_steps
title: TrainingConfig
PreferenceOptimizeRequest:
type: object
properties:
job_uuid:
type: string
finetuned_model:
type: string
algorithm_config:
$ref: '#/components/schemas/DPOAlignmentConfig'
training_config:
$ref: '#/components/schemas/TrainingConfig'
hyperparam_search_config:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
logger_config:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- job_uuid
- finetuned_model
- algorithm_config
- training_config
- hyperparam_search_config
- logger_config
title: PreferenceOptimizeRequest
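# Illustrative sketch, not part of the generated schema: a
# PreferenceOptimizeRequest body that fills every required field, including
# the nested DPOAlignmentConfig and TrainingConfig. All identifiers and
# numeric values are hypothetical placeholders, not recommended settings.
#
#   job_uuid: dpo-job-001
#   finetuned_model: my-model-dpo
#   algorithm_config:
#     reward_scale: 1.0
#     reward_clip: 5.0
#     epsilon: 0.1
#     gamma: 0.99
#   training_config:
#     n_epochs: 1
#     max_steps_per_epoch: 100
#     gradient_accumulation_steps: 4
#     data_config:
#       dataset_id: my-preference-dataset
#       batch_size: 8
#       shuffle: true
#       data_format: dialog
#     optimizer_config:
#       optimizer_type: adamw
#       lr: 0.00001
#       weight_decay: 0.01
#       num_warmup_steps: 10
#   hyperparam_search_config: {}
#   logger_config: {}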
PostTrainingJob:
type: object
properties:
job_uuid:
type: string
additionalProperties: false
required:
- job_uuid
title: PostTrainingJob
DefaultRAGQueryGeneratorConfig:
type: object
properties:
type:
type: string
const: default
default: default
separator:
type: string
default: ' '
additionalProperties: false
required:
- type
- separator
title: DefaultRAGQueryGeneratorConfig
LLMRAGQueryGeneratorConfig:
type: object
properties:
type:
type: string
const: llm
default: llm
model:
type: string
template:
type: string
additionalProperties: false
required:
- type
- model
- template
title: LLMRAGQueryGeneratorConfig
RAGQueryConfig:
type: object
properties:
query_generator_config:
$ref: '#/components/schemas/RAGQueryGeneratorConfig'
description: Configuration for the query generator.
max_tokens_in_context:
type: integer
default: 4096
description: Maximum number of tokens in the context.
max_chunks:
type: integer
default: 5
description: Maximum number of chunks to retrieve.
chunk_template:
type: string
default: >
Result {index}
Content: {chunk.content}
Metadata: {metadata}
description: >-
Template for formatting each retrieved chunk in the context. Available
placeholders: {index} (1-based chunk ordinal), {chunk.content} (chunk
content string), {metadata} (chunk metadata dict). Default: "Result {index}\nContent:
{chunk.content}\nMetadata: {metadata}\n"
additionalProperties: false
required:
- query_generator_config
- max_tokens_in_context
- max_chunks
- chunk_template
title: RAGQueryConfig
description: >-
Configuration for the RAG query generation.
RAGQueryGeneratorConfig:
oneOf:
- $ref: '#/components/schemas/DefaultRAGQueryGeneratorConfig'
- $ref: '#/components/schemas/LLMRAGQueryGeneratorConfig'
discriminator:
propertyName: type
mapping:
default: '#/components/schemas/DefaultRAGQueryGeneratorConfig'
llm: '#/components/schemas/LLMRAGQueryGeneratorConfig'
QueryRequest:
type: object
properties:
content:
$ref: '#/components/schemas/InterleavedContent'
vector_db_ids:
type: array
items:
type: string
query_config:
$ref: '#/components/schemas/RAGQueryConfig'
additionalProperties: false
required:
- content
- vector_db_ids
title: QueryRequest
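# Illustrative sketch, not part of the generated schema: a QueryRequest body
# that overrides the default `chunk_template`. It assumes a plain string is
# an accepted form of InterleavedContent (that schema is defined elsewhere
# in this file). The vector DB ID and the template text are hypothetical;
# the template uses only the placeholders named in the RAGQueryConfig
# description: {index}, {chunk.content}, and {metadata}.
#
#   content: What is the refund policy?
#   vector_db_ids:
#     - my-docs
#   query_config:
#     query_generator_config:
#       type: default
#       separator: " "
#     max_tokens_in_context: 4096
#     max_chunks: 5
#     chunk_template: "[{index}] {chunk.content}\nSource: {metadata}\n"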
RAGQueryResult:
type: object
properties:
content:
$ref: '#/components/schemas/InterleavedContent'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- metadata
title: RAGQueryResult
QueryChunksRequest:
type: object
properties:
vector_db_id:
type: string
query:
$ref: '#/components/schemas/InterleavedContent'
params:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- vector_db_id
- query
title: QueryChunksRequest
QueryChunksResponse:
type: object
properties:
chunks:
type: array
items:
type: object
properties:
content:
$ref: '#/components/schemas/InterleavedContent'
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- content
- metadata
title: Chunk
scores:
type: array
items:
type: number
additionalProperties: false
required:
- chunks
- scores
title: QueryChunksResponse
QueryMetricsRequest:
type: object
properties:
start_time:
type: integer
end_time:
type: integer
granularity:
type: string
query_type:
type: string
enum:
- range
- instant
title: MetricQueryType
label_matchers:
type: array
items:
type: object
properties:
name:
type: string
value:
type: string
operator:
type: string
enum:
- '='
- '!='
- =~
- '!~'
title: MetricLabelOperator
default: '='
additionalProperties: false
required:
- name
- value
- operator
title: MetricLabelMatcher
additionalProperties: false
required:
- start_time
- query_type
title: QueryMetricsRequest
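# Illustrative sketch, not part of the generated schema: a
# QueryMetricsRequest body for a range query with one label matcher.
# The schema only says the times are integers; treating them as Unix
# timestamps in seconds is an assumption. The label name and value are
# hypothetical.
#
#   start_time: 1735689600
#   end_time: 1735776000
#   query_type: range
#   label_matchers:
#     - name: model_id
#       value: my-registered-model
#       operator: "="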
MetricDataPoint:
type: object
properties:
timestamp:
type: integer
value:
type: number
additionalProperties: false
required:
- timestamp
- value
title: MetricDataPoint
MetricLabel:
type: object
properties:
name:
type: string
value:
type: string
additionalProperties: false
required:
- name
- value
title: MetricLabel
MetricSeries:
type: object
properties:
metric:
type: string
labels:
type: array
items:
$ref: '#/components/schemas/MetricLabel'
values:
type: array
items:
$ref: '#/components/schemas/MetricDataPoint'
additionalProperties: false
required:
- metric
- labels
- values
title: MetricSeries
QueryMetricsResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/MetricSeries'
additionalProperties: false
required:
- data
title: QueryMetricsResponse
QueryCondition:
type: object
properties:
key:
type: string
op:
$ref: '#/components/schemas/QueryConditionOp'
value:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- key
- op
- value
title: QueryCondition
QueryConditionOp:
type: string
enum:
- eq
- ne
- gt
- lt
title: QueryConditionOp
QuerySpansRequest:
type: object
properties:
attribute_filters:
type: array
items:
$ref: '#/components/schemas/QueryCondition'
attributes_to_return:
type: array
items:
type: string
max_depth:
type: integer
additionalProperties: false
required:
- attribute_filters
- attributes_to_return
title: QuerySpansRequest
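# Illustrative sketch, not part of the generated schema: a QuerySpansRequest
# body that filters spans with a single QueryCondition. The attribute keys
# and the session value are hypothetical; `op` must be one of the
# QueryConditionOp values (eq, ne, gt, lt).
#
#   attribute_filters:
#     - key: session_id
#       op: eq
#       value: session-123
#   attributes_to_return:
#     - input
#     - output
#   max_depth: 2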
QuerySpansResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Span'
additionalProperties: false
required:
- data
title: QuerySpansResponse
QueryTracesRequest:
type: object
properties:
attribute_filters:
type: array
items:
$ref: '#/components/schemas/QueryCondition'
limit:
type: integer
offset:
type: integer
order_by:
type: array
items:
type: string
additionalProperties: false
title: QueryTracesRequest
QueryTracesResponse:
type: object
properties:
data:
type: array
items:
$ref: '#/components/schemas/Trace'
additionalProperties: false
required:
- data
title: QueryTracesResponse
RegisterBenchmarkRequest:
type: object
properties:
benchmark_id:
type: string
dataset_id:
type: string
scoring_functions:
type: array
items:
type: string
provider_benchmark_id:
type: string
provider_id:
type: string
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- benchmark_id
- dataset_id
- scoring_functions
title: RegisterBenchmarkRequest
RegisterDatasetRequest:
type: object
properties:
purpose:
type: string
enum:
- post-training/messages
- eval/question-answer
- eval/messages-answer
description: >-
The purpose of the dataset. One of: - "post-training/messages": The dataset
contains a messages column with a list of messages for post-training. {
"messages": [ {"role": "user", "content": "Hello, world!"}, {"role": "assistant",
"content": "Hello, world!"} ] } - "eval/question-answer": The dataset
contains a question column and an answer column for evaluation. { "question":
"What is the capital of France?", "answer": "Paris" } - "eval/messages-answer":
The dataset contains a messages column with a list of messages and an answer
column for evaluation. { "messages": [ {"role": "user", "content": "Hello,
my name is John Doe."}, {"role": "assistant", "content": "Hello, John
Doe. How can I help you today?"}, {"role": "user", "content": "What's
my name?"} ], "answer": "John Doe" }
source:
$ref: '#/components/schemas/DataSource'
description: >-
The data source of the dataset. Ensure that the data source schema is
compatible with the purpose of the dataset. Examples: - { "type": "uri",
"uri": "https://mywebsite.com/mydata.jsonl" } - { "type": "uri", "uri":
"lsfs://mydata.jsonl" } - { "type": "uri", "uri": "data:csv;base64,{base64_content}"
} - { "type": "uri", "uri": "huggingface://llamastack/simpleqa?split=train"
} - { "type": "rows", "rows": [ { "messages": [ {"role": "user", "content":
"Hello, world!"}, {"role": "assistant", "content": "Hello, world!"} ]
} ] }
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: >-
The metadata for the dataset, e.g. {"description": "My dataset"}.
dataset_id:
type: string
description: >-
The ID of the dataset. If not provided, an ID will be generated.
additionalProperties: false
required:
- purpose
- source
title: RegisterDatasetRequest
RegisterModelRequest:
type: object
properties:
model_id:
type: string
provider_model_id:
type: string
provider_id:
type: string
metadata:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
model_type:
$ref: '#/components/schemas/ModelType'
additionalProperties: false
required:
- model_id
title: RegisterModelRequest
RegisterScoringFunctionRequest:
type: object
properties:
scoring_fn_id:
type: string
description:
type: string
return_type:
$ref: '#/components/schemas/ParamType'
provider_scoring_fn_id:
type: string
provider_id:
type: string
params:
$ref: '#/components/schemas/ScoringFnParams'
additionalProperties: false
required:
- scoring_fn_id
- description
- return_type
title: RegisterScoringFunctionRequest
RegisterShieldRequest:
type: object
properties:
shield_id:
type: string
provider_shield_id:
type: string
provider_id:
type: string
params:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- shield_id
title: RegisterShieldRequest
RegisterToolGroupRequest:
type: object
properties:
toolgroup_id:
type: string
provider_id:
type: string
mcp_endpoint:
$ref: '#/components/schemas/URL'
args:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- toolgroup_id
- provider_id
title: RegisterToolGroupRequest
RegisterVectorDbRequest:
type: object
properties:
vector_db_id:
type: string
embedding_model:
type: string
embedding_dimension:
type: integer
provider_id:
type: string
provider_vector_db_id:
type: string
additionalProperties: false
required:
- vector_db_id
- embedding_model
title: RegisterVectorDbRequest
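# Illustrative sketch (not part of the generated schema): a minimal
# RegisterVectorDbRequest body. Only vector_db_id and embedding_model are
# required; the identifiers and the dimension below are hypothetical values
# chosen for the example.
#   vector_db_id: my-documents
#   embedding_model: my-embedding-model
#   embedding_dimension: 384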
ResumeAgentTurnRequest:
type: object
properties:
tool_responses:
type: array
items:
$ref: '#/components/schemas/ToolResponse'
description: >-
The tool call responses to resume the turn with.
stream:
type: boolean
description: Whether to stream the response.
additionalProperties: false
required:
- tool_responses
title: ResumeAgentTurnRequest
RunEvalRequest:
type: object
properties:
benchmark_config:
$ref: '#/components/schemas/BenchmarkConfig'
description: The configuration for the benchmark.
additionalProperties: false
required:
- benchmark_config
title: RunEvalRequest
RunShieldRequest:
type: object
properties:
shield_id:
type: string
messages:
type: array
items:
$ref: '#/components/schemas/Message'
params:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- shield_id
- messages
- params
title: RunShieldRequest
RunShieldResponse:
type: object
properties:
violation:
$ref: '#/components/schemas/SafetyViolation'
additionalProperties: false
title: RunShieldResponse
SaveSpansToDatasetRequest:
type: object
properties:
attribute_filters:
type: array
items:
$ref: '#/components/schemas/QueryCondition'
attributes_to_save:
type: array
items:
type: string
dataset_id:
type: string
max_depth:
type: integer
additionalProperties: false
required:
- attribute_filters
- attributes_to_save
- dataset_id
title: SaveSpansToDatasetRequest
ScoreRequest:
type: object
properties:
input_rows:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
description: The rows to score.
scoring_functions:
type: object
additionalProperties:
oneOf:
- $ref: '#/components/schemas/ScoringFnParams'
- type: 'null'
description: >-
The scoring functions to use for the scoring.
additionalProperties: false
required:
- input_rows
- scoring_functions
title: ScoreRequest
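# Illustrative sketch (not part of the generated schema): a ScoreRequest body.
# scoring_functions maps a scoring function ID to its ScoringFnParams, or to
# null when no parameters are supplied. The column names and the scoring
# function ID below are hypothetical.
#   input_rows:
#     - input_query: What is the capital of France?
#       generated_answer: Paris
#       expected_answer: Paris
#   scoring_functions:
#     my-scoring-fn: null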
ScoreResponse:
type: object
properties:
results:
type: object
additionalProperties:
$ref: '#/components/schemas/ScoringResult'
description: >-
A map of scoring function name to ScoringResult.
additionalProperties: false
required:
- results
title: ScoreResponse
description: The response from scoring.
ScoreBatchRequest:
type: object
properties:
dataset_id:
type: string
scoring_functions:
type: object
additionalProperties:
oneOf:
- $ref: '#/components/schemas/ScoringFnParams'
- type: 'null'
save_results_dataset:
type: boolean
additionalProperties: false
required:
- dataset_id
- scoring_functions
- save_results_dataset
title: ScoreBatchRequest
ScoreBatchResponse:
type: object
properties:
dataset_id:
type: string
results:
type: object
additionalProperties:
$ref: '#/components/schemas/ScoringResult'
additionalProperties: false
required:
- results
title: ScoreBatchResponse
AlgorithmConfig:
oneOf:
- $ref: '#/components/schemas/LoraFinetuningConfig'
- $ref: '#/components/schemas/QATFinetuningConfig'
discriminator:
propertyName: type
mapping:
LoRA: '#/components/schemas/LoraFinetuningConfig'
QAT: '#/components/schemas/QATFinetuningConfig'
LoraFinetuningConfig:
type: object
properties:
type:
type: string
const: LoRA
default: LoRA
lora_attn_modules:
type: array
items:
type: string
apply_lora_to_mlp:
type: boolean
apply_lora_to_output:
type: boolean
rank:
type: integer
alpha:
type: integer
use_dora:
type: boolean
default: false
quantize_base:
type: boolean
default: false
additionalProperties: false
required:
- type
- lora_attn_modules
- apply_lora_to_mlp
- apply_lora_to_output
- rank
- alpha
title: LoraFinetuningConfig
QATFinetuningConfig:
type: object
properties:
type:
type: string
const: QAT
default: QAT
quantizer_name:
type: string
group_size:
type: integer
additionalProperties: false
required:
- type
- quantizer_name
- group_size
title: QATFinetuningConfig
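# Illustrative sketch (not part of the generated schema): how the "type"
# discriminator on AlgorithmConfig selects between the two configs above.
# All values are hypothetical (including the module and quantizer names);
# only the field names and the "type" constants come from the schemas.
#
# LoRA variant (resolves to LoraFinetuningConfig):
#   type: LoRA
#   lora_attn_modules: [q_proj, v_proj]
#   apply_lora_to_mlp: false
#   apply_lora_to_output: false
#   rank: 8
#   alpha: 16
#
# QAT variant (resolves to QATFinetuningConfig):
#   type: QAT
#   quantizer_name: my-quantizer
#   group_size: 256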
SupervisedFineTuneRequest:
type: object
properties:
job_uuid:
type: string
training_config:
$ref: '#/components/schemas/TrainingConfig'
hyperparam_search_config:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
logger_config:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
model:
type: string
checkpoint_dir:
type: string
algorithm_config:
$ref: '#/components/schemas/AlgorithmConfig'
additionalProperties: false
required:
- job_uuid
- training_config
- hyperparam_search_config
- logger_config
title: SupervisedFineTuneRequest
SyntheticDataGenerateRequest:
type: object
properties:
dialogs:
type: array
items:
$ref: '#/components/schemas/Message'
filtering_function:
type: string
enum:
- none
- random
- top_k
- top_p
- top_k_top_p
- sigmoid
title: FilteringFunction
description: The type of filtering function.
model:
type: string
additionalProperties: false
required:
- dialogs
- filtering_function
title: SyntheticDataGenerateRequest
SyntheticDataGenerationResponse:
type: object
properties:
synthetic_data:
type: array
items:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
statistics:
type: object
additionalProperties:
oneOf:
- type: 'null'
- type: boolean
- type: number
- type: string
- type: array
- type: object
additionalProperties: false
required:
- synthetic_data
title: SyntheticDataGenerationResponse
description: >-
Response from the synthetic data generation. Batch of (prompt, response, score)
tuples that pass the threshold.
VersionInfo:
type: object
properties:
version:
type: string
additionalProperties: false
required:
- version
title: VersionInfo
responses:
BadRequest400:
description: The request was invalid or malformed
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
example:
status: 400
title: Bad Request
detail: The request was invalid or malformed
TooManyRequests429:
description: >-
The client has sent too many requests in a given amount of time
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
example:
status: 429
title: Too Many Requests
detail: >-
You have exceeded the rate limit. Please try again later.
InternalServerError500:
description: >-
The server encountered an unexpected error
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
example:
status: 500
title: Internal Server Error
detail: >-
An unexpected error occurred. Our team has been notified.
DefaultError:
description: An unexpected error occurred
content:
application/json:
schema:
$ref: '#/components/schemas/Error'
example:
status: 0
title: Error
detail: An unexpected error occurred
security:
- Default: []
tags:
- name: Agents
description: >-
Main functionalities provided by this API:
- Create agents with specific instructions and ability to use tools.
- Interactions with agents are grouped into sessions ("threads"), and each interaction
is called a "turn".
- Agents can be provided with various tools (see the ToolGroups and ToolRuntime
APIs for more details).
- Agents can be provided with various shields (see the Safety API for more details).
- Agents can also use Memory to retrieve information from knowledge bases. See
the RAG Tool and Vector IO APIs for more details.
x-displayName: >-
Agents API for creating and interacting with agentic systems.
- name: BatchInference (Coming Soon)
description: >-
This is an asynchronous API. If the request is successful, the response will
be a job which can be polled for completion.
NOTE: This API is not yet implemented and is subject to change in concert with
other asynchronous APIs (including post-training, evals, etc.).
x-displayName: >-
Batch inference API for generating completions and chat completions.
- name: Benchmarks
- name: DatasetIO
- name: Datasets
- name: Eval
x-displayName: >-
Llama Stack Evaluation API for running evaluations on model and agent candidates.
- name: Files
- name: Inference
description: >-
This API provides the raw interface to the underlying models. Two kinds of models
are supported:
- LLM models: these models generate "raw" and "chat" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic
search.
x-displayName: >-
Llama Stack Inference API for generating completions, chat completions, and
embeddings.
- name: Inspect
- name: Models
- name: PostTraining (Coming Soon)
- name: Providers
x-displayName: >-
Providers API for inspecting, listing, and modifying providers and their configurations.
- name: Safety
- name: Scoring
- name: ScoringFunctions
- name: Shields
- name: SyntheticDataGeneration (Coming Soon)
- name: Telemetry
- name: ToolGroups
- name: ToolRuntime
- name: VectorDBs
- name: VectorIO
x-tagGroups:
- name: Operations
tags:
- Agents
- BatchInference (Coming Soon)
- Benchmarks
- DatasetIO
- Datasets
- Eval
- Files
- Inference
- Inspect
- Models
- PostTraining (Coming Soon)
- Providers
- Safety
- Scoring
- ScoringFunctions
- Shields
- SyntheticDataGeneration (Coming Soon)
- Telemetry
- ToolGroups
- ToolRuntime
- VectorDBs
- VectorIO