openapi: 3.1.0 info: title: Llama Stack Specification version: v1 description: >- This is the specification of the Llama Stack that provides a set of endpoints and their corresponding interfaces that are tailored to best leverage Llama Models. servers: - url: http://any-hosted-llama-stack.com paths: /v1/datasetio/rows: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/PaginatedRowsResult' tags: - DatasetIO parameters: - name: dataset_id in: query required: true schema: type: string - name: rows_in_page in: query required: true schema: type: integer - name: page_token in: query required: false schema: type: string - name: filter_condition in: query required: false schema: type: string post: responses: '200': description: OK tags: - DatasetIO parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/AppendRowsRequest' required: true /v1/batch-inference/chat-completion: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/BatchChatCompletionResponse' tags: - BatchInference (Coming Soon) parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/BatchChatCompletionRequest' required: true /v1/batch-inference/completion: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/BatchCompletionResponse' tags: - BatchInference (Coming Soon) parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/BatchCompletionRequest' required: true /v1/post-training/job/cancel: post: responses: '200': description: OK tags: - PostTraining (Coming Soon) parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/CancelTrainingJobRequest' required: true /v1/inference/chat-completion: post: responses: '200': description: >- If stream=False, returns a ChatCompletionResponse with the full completion. 
If stream=True, returns an SSE event stream of ChatCompletionResponseStreamChunk content: application/json: schema: $ref: '#/components/schemas/ChatCompletionResponse' text/event-stream: schema: $ref: '#/components/schemas/ChatCompletionResponseStreamChunk' tags: - Inference summary: >- Generate a chat completion for the given messages using the specified model. parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/ChatCompletionRequest' required: true /v1/inference/completion: post: responses: '200': description: >- If stream=False, returns a CompletionResponse with the full completion. If stream=True, returns an SSE event stream of CompletionResponseStreamChunk content: application/json: schema: $ref: '#/components/schemas/CompletionResponse' text/event-stream: schema: $ref: '#/components/schemas/CompletionResponseStreamChunk' tags: - Inference summary: >- Generate a completion for the given content using the specified model. parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/CompletionRequest' required: true /v1/agents: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/AgentCreateResponse' tags: - Agents parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/CreateAgentRequest' required: true /v1/agents/{agent_id}/session: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/AgentSessionCreateResponse' tags: - Agents parameters: - name: agent_id in: path required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/CreateAgentSessionRequest' required: true /v1/agents/{agent_id}/session/{session_id}/turn: post: responses: '200': description: >- A single turn in an interaction with an Agentic System. **OR** streamed agent turn completion response. 
content: application/json: schema: $ref: '#/components/schemas/Turn' text/event-stream: schema: $ref: '#/components/schemas/AgentTurnResponseStreamChunk' tags: - Agents parameters: - name: agent_id in: path required: true schema: type: string - name: session_id in: path required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/CreateAgentTurnRequest' required: true /v1/agents/{agent_id}: delete: responses: '200': description: OK tags: - Agents parameters: - name: agent_id in: path required: true schema: type: string /v1/agents/{agent_id}/session/{session_id}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Session' tags: - Agents parameters: - name: session_id in: path required: true schema: type: string - name: agent_id in: path required: true schema: type: string - name: turn_ids in: query required: false schema: type: array items: type: string delete: responses: '200': description: OK tags: - Agents parameters: - name: session_id in: path required: true schema: type: string - name: agent_id in: path required: true schema: type: string /v1/inference/embeddings: post: responses: '200': description: >- An array of embeddings, one for each content. Each embedding is a list of floats. The dimensionality of the embedding is model-specific; you can check model metadata using /models/{model_id} content: application/json: schema: $ref: '#/components/schemas/EmbeddingsResponse' tags: - Inference summary: >- Generate embeddings for content pieces using the specified model. 
parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/EmbeddingsRequest' required: true /v1/eval/tasks/{task_id}/evaluations: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/EvaluateResponse' tags: - Eval parameters: - name: task_id in: path required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/EvaluateRowsRequest' required: true /v1/agents/{agent_id}/session/{session_id}/turn/{turn_id}/step/{step_id}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/AgentStepResponse' tags: - Agents parameters: - name: agent_id in: path required: true schema: type: string - name: session_id in: path required: true schema: type: string - name: turn_id in: path required: true schema: type: string - name: step_id in: path required: true schema: type: string /v1/agents/{agent_id}/session/{session_id}/turn/{turn_id}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Turn' tags: - Agents parameters: - name: agent_id in: path required: true schema: type: string - name: session_id in: path required: true schema: type: string - name: turn_id in: path required: true schema: type: string /v1/datasets/{dataset_id}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/Dataset' - type: 'null' tags: - Datasets parameters: - name: dataset_id in: path required: true schema: type: string delete: responses: '200': description: OK tags: - Datasets parameters: - name: dataset_id in: path required: true schema: type: string /v1/eval-tasks/{eval_task_id}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/EvalTask' - type: 'null' tags: - EvalTasks parameters: - name: eval_task_id in: path required: true schema: type: string 
/v1/models/{model_id}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/Model' - type: 'null' tags: - Models parameters: - name: model_id in: path required: true schema: type: string delete: responses: '200': description: OK tags: - Models parameters: - name: model_id in: path required: true schema: type: string /v1/scoring-functions/{scoring_fn_id}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/ScoringFn' - type: 'null' tags: - ScoringFunctions parameters: - name: scoring_fn_id in: path required: true schema: type: string /v1/shields/{identifier}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/Shield' - type: 'null' tags: - Shields parameters: - name: identifier in: path required: true schema: type: string /v1/telemetry/traces/{trace_id}/spans/{span_id}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Span' tags: - Telemetry parameters: - name: trace_id in: path required: true schema: type: string - name: span_id in: path required: true schema: type: string /v1/telemetry/spans/{span_id}/tree: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/QuerySpanTreeResponse' tags: - Telemetry parameters: - name: span_id in: path required: true schema: type: string - name: attributes_to_return in: query required: false schema: type: array items: type: string - name: max_depth in: query required: false schema: type: integer /v1/tools/{tool_name}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Tool' tags: - ToolGroups parameters: - name: tool_name in: path required: true schema: type: string /v1/toolgroups/{toolgroup_id}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ToolGroup' 
tags: - ToolGroups parameters: - name: toolgroup_id in: path required: true schema: type: string delete: responses: '200': description: OK tags: - ToolGroups summary: Unregister a tool group parameters: - name: toolgroup_id in: path required: true schema: type: string /v1/telemetry/traces/{trace_id}: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Trace' tags: - Telemetry parameters: - name: trace_id in: path required: true schema: type: string /v1/post-training/job/artifacts: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/PostTrainingJobArtifactsResponse' - type: 'null' tags: - PostTraining (Coming Soon) parameters: - name: job_uuid in: query required: true schema: type: string /v1/post-training/job/status: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/PostTrainingJobStatusResponse' - type: 'null' tags: - PostTraining (Coming Soon) parameters: - name: job_uuid in: query required: true schema: type: string /v1/post-training/jobs: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListPostTrainingJobsResponse' tags: - PostTraining (Coming Soon) parameters: [] /v1/vector-dbs/{vector_db_id}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/VectorDB' - type: 'null' tags: - VectorDBs parameters: - name: vector_db_id in: path required: true schema: type: string delete: responses: '200': description: OK tags: - VectorDBs parameters: - name: vector_db_id in: path required: true schema: type: string /v1/health: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/HealthInfo' tags: - Inspect parameters: [] /v1/tool-runtime/rag-tool/insert: post: responses: '200': description: OK tags: - ToolRuntime summary: >- Index documents so they 
can be used by the RAG system parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/InsertRequest' required: true /v1/vector-io/insert: post: responses: '200': description: OK tags: - VectorIO parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/InsertChunksRequest' required: true /v1/tool-runtime/invoke: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ToolInvocationResult' tags: - ToolRuntime summary: Run a tool with the given arguments parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/InvokeToolRequest' required: true /v1/eval/tasks/{task_id}/jobs/{job_id}: get: responses: '200': description: OK content: application/json: schema: oneOf: - $ref: '#/components/schemas/JobStatus' - type: 'null' tags: - Eval parameters: - name: task_id in: path required: true schema: type: string - name: job_id in: path required: true schema: type: string delete: responses: '200': description: OK tags: - Eval parameters: - name: task_id in: path required: true schema: type: string - name: job_id in: path required: true schema: type: string /v1/eval/tasks/{task_id}/jobs/{job_id}/result: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/EvaluateResponse' tags: - Eval parameters: - name: job_id in: path required: true schema: type: string - name: task_id in: path required: true schema: type: string /v1/datasets: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListDatasetsResponse' tags: - Datasets parameters: [] post: responses: '200': description: OK tags: - Datasets parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterDatasetRequest' required: true /v1/eval-tasks: get: responses: '200': description: OK content: application/json: schema: $ref: 
'#/components/schemas/ListEvalTasksResponse' tags: - EvalTasks parameters: [] post: responses: '200': description: OK tags: - EvalTasks parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterEvalTaskRequest' required: true /v1/models: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListModelsResponse' tags: - Models parameters: [] post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Model' tags: - Models parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterModelRequest' required: true /v1/inspect/providers: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListProvidersResponse' tags: - Inspect parameters: [] /v1/inspect/routes: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListRoutesResponse' tags: - Inspect parameters: [] /v1/tool-runtime/list-tools: get: responses: '200': description: OK content: application/jsonl: schema: $ref: '#/components/schemas/ToolDef' tags: - ToolRuntime parameters: - name: tool_group_id in: query required: false schema: type: string - name: mcp_endpoint in: query required: false schema: $ref: '#/components/schemas/URL' /v1/scoring-functions: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListScoringFunctionsResponse' tags: - ScoringFunctions parameters: [] post: responses: '200': description: OK tags: - ScoringFunctions parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterScoringFunctionRequest' required: true /v1/shields: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListShieldsResponse' tags: - Shields parameters: [] post: responses: '200': description: OK content: application/json: 
schema: $ref: '#/components/schemas/Shield' tags: - Shields parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterShieldRequest' required: true /v1/toolgroups: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListToolGroupsResponse' tags: - ToolGroups summary: List tool groups with optional provider parameters: [] post: responses: '200': description: OK tags: - ToolGroups summary: Register a tool group parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterToolGroupRequest' required: true /v1/tools: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListToolsResponse' tags: - ToolGroups summary: List tools with optional tool group parameters: - name: toolgroup_id in: query required: false schema: type: string /v1/vector-dbs: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ListVectorDBsResponse' tags: - VectorDBs parameters: [] post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/VectorDB' tags: - VectorDBs parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RegisterVectorDbRequest' required: true /v1/telemetry/events: post: responses: '200': description: OK tags: - Telemetry parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/LogEventRequest' required: true /v1/post-training/preference-optimize: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/PostTrainingJob' tags: - PostTraining (Coming Soon) parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/PreferenceOptimizeRequest' required: true /v1/tool-runtime/rag-tool/query: post: responses: '200': description: OK content: application/json: schema: $ref: 
'#/components/schemas/RAGQueryResult' tags: - ToolRuntime summary: >- Query the RAG system for context; typically invoked by the agent parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/QueryRequest' required: true /v1/vector-io/query: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/QueryChunksResponse' tags: - VectorIO parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/QueryChunksRequest' required: true /v1/telemetry/spans: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/QuerySpansResponse' tags: - Telemetry parameters: - name: attribute_filters in: query required: true schema: type: array items: $ref: '#/components/schemas/QueryCondition' - name: attributes_to_return in: query required: true schema: type: array items: type: string - name: max_depth in: query required: false schema: type: integer /v1/telemetry/traces: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/QueryTracesResponse' tags: - Telemetry parameters: - name: attribute_filters in: query required: false schema: type: array items: $ref: '#/components/schemas/QueryCondition' - name: limit in: query required: false schema: type: integer - name: offset in: query required: false schema: type: integer - name: order_by in: query required: false schema: type: array items: type: string /v1/eval/tasks/{task_id}/jobs: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/Job' tags: - Eval parameters: - name: task_id in: path required: true schema: type: string requestBody: content: application/json: schema: $ref: '#/components/schemas/RunEvalRequest' required: true /v1/safety/run-shield: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/RunShieldResponse' tags: - Safety 
parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/RunShieldRequest' required: true /v1/telemetry/spans/export: post: responses: '200': description: OK tags: - Telemetry parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/SaveSpansToDatasetRequest' required: true /v1/scoring/score: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ScoreResponse' tags: - Scoring parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/ScoreRequest' required: true /v1/scoring/score-batch: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/ScoreBatchResponse' tags: - Scoring parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/ScoreBatchRequest' required: true /v1/post-training/supervised-fine-tune: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/PostTrainingJob' tags: - PostTraining (Coming Soon) parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/SupervisedFineTuneRequest' required: true /v1/synthetic-data-generation/generate: post: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/SyntheticDataGenerationResponse' tags: - SyntheticDataGeneration (Coming Soon) parameters: [] requestBody: content: application/json: schema: $ref: '#/components/schemas/SyntheticDataGenerateRequest' required: true /v1/version: get: responses: '200': description: OK content: application/json: schema: $ref: '#/components/schemas/VersionInfo' tags: - Inspect parameters: [] jsonSchemaDialect: >- https://json-schema.org/draft/2020-12/schema components: schemas: AppendRowsRequest: type: object properties: dataset_id: type: string rows: type: array items: type: object additionalProperties: oneOf: - type: 'null' - type: 
boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - dataset_id - rows CompletionMessage: type: object properties: role: type: string const: assistant default: assistant description: >- Must be "assistant" to identify this as the model's response content: $ref: '#/components/schemas/InterleavedContent' description: The content of the model's response stop_reason: type: string enum: - end_of_turn - end_of_message - out_of_tokens description: >- Reason why the model stopped generating. Options are: - `StopReason.end_of_turn`: The model finished generating the entire response. - `StopReason.end_of_message`: The model finished generating but generated a partial response -- usually, a tool call. The user may call the tool and continue the conversation with the tool's response. - `StopReason.out_of_tokens`: The model ran out of token budget. tool_calls: type: array items: $ref: '#/components/schemas/ToolCall' description: >- List of tool calls. Each tool call is a ToolCall object. additionalProperties: false required: - role - content - stop_reason - tool_calls title: >- A message containing the model's (assistant) response in a chat conversation. GrammarResponseFormat: type: object properties: type: type: string const: grammar default: grammar description: >- Must be "grammar" to identify this format type bnf: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object description: >- The BNF grammar specification the response should conform to additionalProperties: false required: - type - bnf title: >- Configuration for grammar-guided response generation. GreedySamplingStrategy: type: object properties: type: type: string const: greedy default: greedy additionalProperties: false required: - type ImageContentItem: type: object properties: type: type: string const: image default: image description: >- Discriminator type of the content item. 
Always "image" image: type: object properties: url: $ref: '#/components/schemas/URL' description: >- A URL of the image or data URL in the format of data:image/{type};base64,{data}. Note that URLs may have length limits. data: type: string contentEncoding: base64 description: base64 encoded image data as string additionalProperties: false description: >- Image as a base64 encoded string or a URL additionalProperties: false required: - type - image title: An image content item InterleavedContent: oneOf: - type: string - $ref: '#/components/schemas/InterleavedContentItem' - type: array items: $ref: '#/components/schemas/InterleavedContentItem' InterleavedContentItem: oneOf: - $ref: '#/components/schemas/ImageContentItem' - $ref: '#/components/schemas/TextContentItem' discriminator: propertyName: type mapping: image: '#/components/schemas/ImageContentItem' text: '#/components/schemas/TextContentItem' JsonSchemaResponseFormat: type: object properties: type: type: string const: json_schema default: json_schema description: >- Must be "json_schema" to identify this format type json_schema: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object description: >- The JSON schema the response should conform to. In a Python SDK, this is often a `pydantic` model. additionalProperties: false required: - type - json_schema title: >- Configuration for JSON schema-guided response generation. 
Message: oneOf: - $ref: '#/components/schemas/UserMessage' - $ref: '#/components/schemas/SystemMessage' - $ref: '#/components/schemas/ToolResponseMessage' - $ref: '#/components/schemas/CompletionMessage' discriminator: propertyName: role mapping: user: '#/components/schemas/UserMessage' system: '#/components/schemas/SystemMessage' tool: '#/components/schemas/ToolResponseMessage' assistant: '#/components/schemas/CompletionMessage' ResponseFormat: oneOf: - $ref: '#/components/schemas/JsonSchemaResponseFormat' - $ref: '#/components/schemas/GrammarResponseFormat' discriminator: propertyName: type mapping: json_schema: '#/components/schemas/JsonSchemaResponseFormat' grammar: '#/components/schemas/GrammarResponseFormat' SamplingParams: type: object properties: strategy: $ref: '#/components/schemas/SamplingStrategy' max_tokens: type: integer default: 0 repetition_penalty: type: number default: 1.0 additionalProperties: false required: - strategy SamplingStrategy: oneOf: - $ref: '#/components/schemas/GreedySamplingStrategy' - $ref: '#/components/schemas/TopPSamplingStrategy' - $ref: '#/components/schemas/TopKSamplingStrategy' discriminator: propertyName: type mapping: greedy: '#/components/schemas/GreedySamplingStrategy' top_p: '#/components/schemas/TopPSamplingStrategy' top_k: '#/components/schemas/TopKSamplingStrategy' SystemMessage: type: object properties: role: type: string const: system default: system description: >- Must be "system" to identify this as a system message content: $ref: '#/components/schemas/InterleavedContent' description: >- The content of the "system prompt". If multiple system messages are provided, they are concatenated. The underlying Llama Stack code may also add other system messages (for example, for formatting tool definitions). additionalProperties: false required: - role - content title: >- A system message providing instructions or context to the model. 
TextContentItem: type: object properties: type: type: string const: text default: text description: >- Discriminator type of the content item. Always "text" text: type: string description: Text content additionalProperties: false required: - type - text title: A text content item ToolCall: type: object properties: call_id: type: string tool_name: oneOf: - type: string enum: - brave_search - wolfram_alpha - photogen - code_interpreter - type: string arguments: type: object additionalProperties: oneOf: - type: string - type: integer - type: number - type: boolean - type: 'null' - type: array items: oneOf: - type: string - type: integer - type: number - type: boolean - type: 'null' - type: object additionalProperties: oneOf: - type: string - type: integer - type: number - type: boolean - type: 'null' additionalProperties: false required: - call_id - tool_name - arguments ToolDefinition: type: object properties: tool_name: oneOf: - type: string enum: - brave_search - wolfram_alpha - photogen - code_interpreter - type: string description: type: string parameters: type: object additionalProperties: $ref: '#/components/schemas/ToolParamDefinition' additionalProperties: false required: - tool_name ToolParamDefinition: type: object properties: param_type: type: string description: type: string required: type: boolean default: true default: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - param_type ToolResponseMessage: type: object properties: role: type: string const: tool default: tool description: >- Must be "tool" to identify this as a tool response call_id: type: string description: >- Unique identifier for the tool call this response is for tool_name: oneOf: - type: string enum: - brave_search - wolfram_alpha - photogen - code_interpreter - type: string description: Name of the tool that was called content: $ref: '#/components/schemas/InterleavedContent' description: The response 
content from the tool additionalProperties: false required: - role - call_id - tool_name - content title: >- A message representing the result of a tool invocation. TopKSamplingStrategy: type: object properties: type: type: string const: top_k default: top_k top_k: type: integer additionalProperties: false required: - type - top_k TopPSamplingStrategy: type: object properties: type: type: string const: top_p default: top_p temperature: type: number top_p: type: number default: 0.95 additionalProperties: false required: - type URL: type: object properties: uri: type: string additionalProperties: false required: - uri UserMessage: type: object properties: role: type: string const: user default: user description: >- Must be "user" to identify this as a user message content: $ref: '#/components/schemas/InterleavedContent' description: >- The content of the message, which can include text and other media context: $ref: '#/components/schemas/InterleavedContent' description: >- (Optional) This field is used internally by Llama Stack to pass RAG context. This field may be removed in the API in the future. additionalProperties: false required: - role - content title: >- A message from the user in a chat conversation. BatchChatCompletionRequest: type: object properties: model: type: string messages_batch: type: array items: type: array items: $ref: '#/components/schemas/Message' sampling_params: $ref: '#/components/schemas/SamplingParams' tools: type: array items: $ref: '#/components/schemas/ToolDefinition' tool_choice: type: string enum: - auto - required title: >- Whether tool use is required or automatic. This is a hint to the model which may not be followed. It depends on the Instruction Following capabilities of the model. tool_prompt_format: type: string enum: - json - function_tag - python_list title: >- Prompt format for calling custom / zero shot tools. 
response_format: $ref: '#/components/schemas/ResponseFormat' logprobs: type: object properties: top_k: type: integer default: 0 description: >- How many tokens (for each position) to return log probabilities for. additionalProperties: false additionalProperties: false required: - model - messages_batch BatchChatCompletionResponse: type: object properties: batch: type: array items: $ref: '#/components/schemas/ChatCompletionResponse' additionalProperties: false required: - batch ChatCompletionResponse: type: object properties: completion_message: $ref: '#/components/schemas/CompletionMessage' description: The complete response message logprobs: type: array items: $ref: '#/components/schemas/TokenLogProbs' description: >- Optional log probabilities for generated tokens additionalProperties: false required: - completion_message title: Response from a chat completion request. TokenLogProbs: type: object properties: logprobs_by_token: type: object additionalProperties: type: number description: >- Dictionary mapping tokens to their log probabilities additionalProperties: false required: - logprobs_by_token title: Log probabilities for generated tokens. BatchCompletionRequest: type: object properties: model: type: string content_batch: type: array items: $ref: '#/components/schemas/InterleavedContent' sampling_params: $ref: '#/components/schemas/SamplingParams' response_format: $ref: '#/components/schemas/ResponseFormat' logprobs: type: object properties: top_k: type: integer default: 0 description: >- How many tokens (for each position) to return log probabilities for. 
additionalProperties: false additionalProperties: false required: - model - content_batch BatchCompletionResponse: type: object properties: batch: type: array items: $ref: '#/components/schemas/CompletionResponse' additionalProperties: false required: - batch CompletionResponse: type: object properties: content: type: string description: The generated completion text stop_reason: type: string enum: - end_of_turn - end_of_message - out_of_tokens description: Reason why generation stopped logprobs: type: array items: $ref: '#/components/schemas/TokenLogProbs' description: >- Optional log probabilities for generated tokens additionalProperties: false required: - content - stop_reason title: Response from a completion request. CancelTrainingJobRequest: type: object properties: job_uuid: type: string additionalProperties: false required: - job_uuid ChatCompletionRequest: type: object properties: model_id: type: string description: >- The identifier of the model to use. The model must be registered with Llama Stack and available via the /models endpoint. messages: type: array items: $ref: '#/components/schemas/Message' description: List of messages in the conversation sampling_params: $ref: '#/components/schemas/SamplingParams' description: >- Parameters to control the sampling strategy tools: type: array items: $ref: '#/components/schemas/ToolDefinition' description: >- (Optional) List of tool definitions available to the model tool_choice: type: string enum: - auto - required description: >- (Optional) Whether tool use is required or automatic. Defaults to ToolChoice.auto. tool_prompt_format: type: string enum: - json - function_tag - python_list description: >- (Optional) Instructs the model how to format tool calls. By default, Llama Stack will attempt to use a format that is best adapted to the model. - `ToolPromptFormat.json`: The tool calls are formatted as a JSON object. - `ToolPromptFormat.function_tag`: The tool calls are enclosed in a tag. 
- `ToolPromptFormat.python_list`: The tool calls are output as Python syntax -- a list of function calls. response_format: $ref: '#/components/schemas/ResponseFormat' description: >- (Optional) Grammar specification for guided (structured) decoding. There are two options: - `ResponseFormat.json_schema`: The grammar is a JSON schema. Most providers support this format. - `ResponseFormat.grammar`: The grammar is a BNF grammar. This format is more flexible, but not all providers support it. stream: type: boolean description: >- (Optional) If True, generate an SSE event stream of the response. Defaults to False. logprobs: type: object properties: top_k: type: integer default: 0 description: >- How many tokens (for each position) to return log probabilities for. additionalProperties: false description: >- (Optional) If specified, log probabilities for each token position will be returned. additionalProperties: false required: - model_id - messages ChatCompletionResponseEvent: type: object properties: event_type: type: string enum: - start - complete - progress description: Type of the event delta: $ref: '#/components/schemas/ContentDelta' description: >- Content generated since last event. This can be one or more tokens, or a tool call. logprobs: type: array items: $ref: '#/components/schemas/TokenLogProbs' description: >- Optional log probabilities for generated tokens stop_reason: type: string enum: - end_of_turn - end_of_message - out_of_tokens description: >- Optional reason why generation stopped, if complete additionalProperties: false required: - event_type - delta title: >- An event during chat completion generation. ChatCompletionResponseStreamChunk: type: object properties: event: $ref: '#/components/schemas/ChatCompletionResponseEvent' description: The event containing the new content additionalProperties: false required: - event title: >- A chunk of a streamed chat completion response. 
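# Illustrative only, not part of the generated schemas: a minimal sketch of a
# ChatCompletionRequest body and of one streamed ChatCompletionResponseStreamChunk,
# written as YAML for readability. The model identifier is hypothetical, and the
# plain-string message content assumes InterleavedContent accepts a string.
#
# Request body (only model_id and messages are required):
#   model_id: my-registered-model   # hypothetical; must be registered with Llama Stack
#   messages:
#     - role: user
#       content: "What is the capital of France?"
#   stream: true
#   logprobs:
#     top_k: 1
#
# One possible chunk of the SSE stream when stream is true:
#   event:
#     event_type: progress
#     delta:
#       type: text
#       text: "Par"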
ContentDelta: oneOf: - $ref: '#/components/schemas/TextDelta' - $ref: '#/components/schemas/ImageDelta' - $ref: '#/components/schemas/ToolCallDelta' discriminator: propertyName: type mapping: text: '#/components/schemas/TextDelta' image: '#/components/schemas/ImageDelta' tool_call: '#/components/schemas/ToolCallDelta' ImageDelta: type: object properties: type: type: string const: image default: image image: type: string contentEncoding: base64 additionalProperties: false required: - type - image TextDelta: type: object properties: type: type: string const: text default: text text: type: string additionalProperties: false required: - type - text ToolCallDelta: type: object properties: type: type: string const: tool_call default: tool_call tool_call: oneOf: - type: string - $ref: '#/components/schemas/ToolCall' parse_status: type: string enum: - started - in_progress - failed - succeeded additionalProperties: false required: - type - tool_call - parse_status CompletionRequest: type: object properties: model_id: type: string description: >- The identifier of the model to use. The model must be registered with Llama Stack and available via the /models endpoint. content: $ref: '#/components/schemas/InterleavedContent' description: The content to generate a completion for sampling_params: $ref: '#/components/schemas/SamplingParams' description: >- (Optional) Parameters to control the sampling strategy response_format: $ref: '#/components/schemas/ResponseFormat' description: >- (Optional) Grammar specification for guided (structured) decoding stream: type: boolean description: >- (Optional) If True, generate an SSE event stream of the response. Defaults to False. logprobs: type: object properties: top_k: type: integer default: 0 description: >- How many tokens (for each position) to return log probabilities for. additionalProperties: false description: >- (Optional) If specified, log probabilities for each token position will be returned. 
additionalProperties: false required: - model_id - content CompletionResponseStreamChunk: type: object properties: delta: type: string description: >- New content generated since last chunk. This can be one or more tokens. stop_reason: type: string enum: - end_of_turn - end_of_message - out_of_tokens description: >- Optional reason why generation stopped, if complete logprobs: type: array items: $ref: '#/components/schemas/TokenLogProbs' description: >- Optional log probabilities for generated tokens additionalProperties: false required: - delta title: >- A chunk of a streamed completion response. AgentConfig: type: object properties: sampling_params: $ref: '#/components/schemas/SamplingParams' input_shields: type: array items: type: string output_shields: type: array items: type: string toolgroups: type: array items: $ref: '#/components/schemas/AgentTool' client_tools: type: array items: $ref: '#/components/schemas/ToolDef' tool_choice: type: string enum: - auto - required title: >- Whether tool use is required or automatic. This is a hint to the model which may not be followed. It depends on the Instruction Following capabilities of the model. default: auto tool_prompt_format: type: string enum: - json - function_tag - python_list title: >- Prompt format for calling custom / zero shot tools. 
max_infer_iters: type: integer default: 10 model: type: string instructions: type: string enable_session_persistence: type: boolean response_format: $ref: '#/components/schemas/ResponseFormat' additionalProperties: false required: - max_infer_iters - model - instructions - enable_session_persistence AgentTool: oneOf: - type: string - type: object properties: name: type: string args: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - name - args ToolDef: type: object properties: name: type: string description: type: string parameters: type: array items: $ref: '#/components/schemas/ToolParameter' metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - name ToolParameter: type: object properties: name: type: string parameter_type: type: string description: type: string required: type: boolean default: true default: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - name - parameter_type - description - required CreateAgentRequest: type: object properties: agent_config: $ref: '#/components/schemas/AgentConfig' additionalProperties: false required: - agent_config AgentCreateResponse: type: object properties: agent_id: type: string additionalProperties: false required: - agent_id CreateAgentSessionRequest: type: object properties: session_name: type: string additionalProperties: false required: - session_name AgentSessionCreateResponse: type: object properties: session_id: type: string additionalProperties: false required: - session_id CreateAgentTurnRequest: type: object properties: messages: type: array items: oneOf: - $ref: '#/components/schemas/UserMessage' - $ref: '#/components/schemas/ToolResponseMessage' stream: type: boolean documents: 
type: array items: type: object properties: content: oneOf: - type: string - $ref: '#/components/schemas/InterleavedContentItem' - type: array items: $ref: '#/components/schemas/InterleavedContentItem' - $ref: '#/components/schemas/URL' mime_type: type: string additionalProperties: false required: - content - mime_type toolgroups: type: array items: $ref: '#/components/schemas/AgentTool' additionalProperties: false required: - messages InferenceStep: type: object properties: turn_id: type: string step_id: type: string started_at: type: string format: date-time completed_at: type: string format: date-time step_type: type: string const: inference default: inference model_response: $ref: '#/components/schemas/CompletionMessage' additionalProperties: false required: - turn_id - step_id - step_type - model_response MemoryRetrievalStep: type: object properties: turn_id: type: string step_id: type: string started_at: type: string format: date-time completed_at: type: string format: date-time step_type: type: string const: memory_retrieval default: memory_retrieval vector_db_ids: type: string inserted_context: $ref: '#/components/schemas/InterleavedContent' additionalProperties: false required: - turn_id - step_id - step_type - vector_db_ids - inserted_context SafetyViolation: type: object properties: violation_level: $ref: '#/components/schemas/ViolationLevel' user_message: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - violation_level - metadata ShieldCallStep: type: object properties: turn_id: type: string step_id: type: string started_at: type: string format: date-time completed_at: type: string format: date-time step_type: type: string const: shield_call default: shield_call violation: $ref: '#/components/schemas/SafetyViolation' additionalProperties: false required: - turn_id - step_id - step_type ToolExecutionStep: type: 
object properties: turn_id: type: string step_id: type: string started_at: type: string format: date-time completed_at: type: string format: date-time step_type: type: string const: tool_execution default: tool_execution tool_calls: type: array items: $ref: '#/components/schemas/ToolCall' tool_responses: type: array items: $ref: '#/components/schemas/ToolResponse' additionalProperties: false required: - turn_id - step_id - step_type - tool_calls - tool_responses ToolResponse: type: object properties: call_id: type: string tool_name: oneOf: - type: string enum: - brave_search - wolfram_alpha - photogen - code_interpreter - type: string content: $ref: '#/components/schemas/InterleavedContent' additionalProperties: false required: - call_id - tool_name - content Turn: type: object properties: turn_id: type: string session_id: type: string input_messages: type: array items: oneOf: - $ref: '#/components/schemas/UserMessage' - $ref: '#/components/schemas/ToolResponseMessage' steps: type: array items: oneOf: - $ref: '#/components/schemas/InferenceStep' - $ref: '#/components/schemas/ToolExecutionStep' - $ref: '#/components/schemas/ShieldCallStep' - $ref: '#/components/schemas/MemoryRetrievalStep' discriminator: propertyName: step_type mapping: inference: '#/components/schemas/InferenceStep' tool_execution: '#/components/schemas/ToolExecutionStep' shield_call: '#/components/schemas/ShieldCallStep' memory_retrieval: '#/components/schemas/MemoryRetrievalStep' output_message: $ref: '#/components/schemas/CompletionMessage' output_attachments: type: array items: type: object properties: content: oneOf: - type: string - $ref: '#/components/schemas/InterleavedContentItem' - type: array items: $ref: '#/components/schemas/InterleavedContentItem' - $ref: '#/components/schemas/URL' mime_type: type: string additionalProperties: false required: - content - mime_type started_at: type: string format: date-time completed_at: type: string format: date-time additionalProperties: false 
required: - turn_id - session_id - input_messages - steps - output_message - output_attachments - started_at title: >- A single turn in an interaction with an Agentic System. ViolationLevel: type: string enum: - info - warn - error AgentTurnResponseEvent: type: object properties: payload: $ref: '#/components/schemas/AgentTurnResponseEventPayload' additionalProperties: false required: - payload AgentTurnResponseEventPayload: oneOf: - $ref: '#/components/schemas/AgentTurnResponseStepStartPayload' - $ref: '#/components/schemas/AgentTurnResponseStepProgressPayload' - $ref: '#/components/schemas/AgentTurnResponseStepCompletePayload' - $ref: '#/components/schemas/AgentTurnResponseTurnStartPayload' - $ref: '#/components/schemas/AgentTurnResponseTurnCompletePayload' discriminator: propertyName: event_type mapping: step_start: '#/components/schemas/AgentTurnResponseStepStartPayload' step_progress: '#/components/schemas/AgentTurnResponseStepProgressPayload' step_complete: '#/components/schemas/AgentTurnResponseStepCompletePayload' turn_start: '#/components/schemas/AgentTurnResponseTurnStartPayload' turn_complete: '#/components/schemas/AgentTurnResponseTurnCompletePayload' AgentTurnResponseStepCompletePayload: type: object properties: event_type: type: string const: step_complete default: step_complete step_type: type: string enum: - inference - tool_execution - shield_call - memory_retrieval step_id: type: string step_details: oneOf: - $ref: '#/components/schemas/InferenceStep' - $ref: '#/components/schemas/ToolExecutionStep' - $ref: '#/components/schemas/ShieldCallStep' - $ref: '#/components/schemas/MemoryRetrievalStep' discriminator: propertyName: step_type mapping: inference: '#/components/schemas/InferenceStep' tool_execution: '#/components/schemas/ToolExecutionStep' shield_call: '#/components/schemas/ShieldCallStep' memory_retrieval: '#/components/schemas/MemoryRetrievalStep' additionalProperties: false required: - event_type - step_type - step_id - step_details 
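# Illustrative only, not part of the generated schemas: a sketch of an
# AgentTurnResponseEvent whose payload is a step_complete event for a shield
# call that reported no violation. All identifiers are hypothetical. The
# event_type field selects the payload schema, and step_type selects the
# schema used for step_details.
#
#   payload:
#     event_type: step_complete
#     step_type: shield_call
#     step_id: step-1
#     step_details:
#       turn_id: turn-1
#       step_id: step-1
#       step_type: shield_call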
AgentTurnResponseStepProgressPayload: type: object properties: event_type: type: string const: step_progress default: step_progress step_type: type: string enum: - inference - tool_execution - shield_call - memory_retrieval step_id: type: string delta: $ref: '#/components/schemas/ContentDelta' additionalProperties: false required: - event_type - step_type - step_id - delta AgentTurnResponseStepStartPayload: type: object properties: event_type: type: string const: step_start default: step_start step_type: type: string enum: - inference - tool_execution - shield_call - memory_retrieval step_id: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - event_type - step_type - step_id AgentTurnResponseStreamChunk: type: object properties: event: $ref: '#/components/schemas/AgentTurnResponseEvent' additionalProperties: false required: - event title: streamed agent turn completion response. AgentTurnResponseTurnCompletePayload: type: object properties: event_type: type: string const: turn_complete default: turn_complete turn: $ref: '#/components/schemas/Turn' additionalProperties: false required: - event_type - turn AgentTurnResponseTurnStartPayload: type: object properties: event_type: type: string const: turn_start default: turn_start turn_id: type: string additionalProperties: false required: - event_type - turn_id EmbeddingsRequest: type: object properties: model_id: type: string description: >- The identifier of the model to use. The model must be an embedding model registered with Llama Stack and available via the /models endpoint. contents: type: array items: $ref: '#/components/schemas/InterleavedContent' description: >- List of contents to generate embeddings for. Note that content can be multimodal. The behavior depends on the model and provider. Some models may only support text. 
additionalProperties: false required: - model_id - contents EmbeddingsResponse: type: object properties: embeddings: type: array items: type: array items: type: number description: >- List of embedding vectors, one per input content. Each embedding is a list of floats. The dimensionality of the embedding is model-specific; you can check model metadata using /models/{model_id} additionalProperties: false required: - embeddings title: >- Response containing generated embeddings. AgentCandidate: type: object properties: type: type: string const: agent default: agent config: $ref: '#/components/schemas/AgentConfig' additionalProperties: false required: - type - config AggregationFunctionType: type: string enum: - average - median - categorical_count - accuracy AppEvalTaskConfig: type: object properties: type: type: string const: app default: app eval_candidate: $ref: '#/components/schemas/EvalCandidate' scoring_params: type: object additionalProperties: $ref: '#/components/schemas/ScoringFnParams' num_examples: type: integer additionalProperties: false required: - type - eval_candidate - scoring_params BasicScoringFnParams: type: object properties: type: type: string const: basic default: basic aggregation_functions: type: array items: $ref: '#/components/schemas/AggregationFunctionType' additionalProperties: false required: - type BenchmarkEvalTaskConfig: type: object properties: type: type: string const: benchmark default: benchmark eval_candidate: $ref: '#/components/schemas/EvalCandidate' num_examples: type: integer additionalProperties: false required: - type - eval_candidate EvalCandidate: oneOf: - $ref: '#/components/schemas/ModelCandidate' - $ref: '#/components/schemas/AgentCandidate' discriminator: propertyName: type mapping: model: '#/components/schemas/ModelCandidate' agent: '#/components/schemas/AgentCandidate' EvalTaskConfig: oneOf: - $ref: '#/components/schemas/BenchmarkEvalTaskConfig' - $ref: '#/components/schemas/AppEvalTaskConfig' discriminator: 
propertyName: type mapping: benchmark: '#/components/schemas/BenchmarkEvalTaskConfig' app: '#/components/schemas/AppEvalTaskConfig' LLMAsJudgeScoringFnParams: type: object properties: type: type: string const: llm_as_judge default: llm_as_judge judge_model: type: string prompt_template: type: string judge_score_regexes: type: array items: type: string aggregation_functions: type: array items: $ref: '#/components/schemas/AggregationFunctionType' additionalProperties: false required: - type - judge_model ModelCandidate: type: object properties: type: type: string const: model default: model model: type: string sampling_params: $ref: '#/components/schemas/SamplingParams' system_message: $ref: '#/components/schemas/SystemMessage' additionalProperties: false required: - type - model - sampling_params RegexParserScoringFnParams: type: object properties: type: type: string const: regex_parser default: regex_parser parsing_regexes: type: array items: type: string aggregation_functions: type: array items: $ref: '#/components/schemas/AggregationFunctionType' additionalProperties: false required: - type ScoringFnParams: oneOf: - $ref: '#/components/schemas/LLMAsJudgeScoringFnParams' - $ref: '#/components/schemas/RegexParserScoringFnParams' - $ref: '#/components/schemas/BasicScoringFnParams' discriminator: propertyName: type mapping: llm_as_judge: '#/components/schemas/LLMAsJudgeScoringFnParams' regex_parser: '#/components/schemas/RegexParserScoringFnParams' basic: '#/components/schemas/BasicScoringFnParams' EvaluateRowsRequest: type: object properties: input_rows: type: array items: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object scoring_functions: type: array items: type: string task_config: $ref: '#/components/schemas/EvalTaskConfig' additionalProperties: false required: - input_rows - scoring_functions - task_config EvaluateResponse: type: object properties: generations: type: array items: 
type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object scores: type: object additionalProperties: $ref: '#/components/schemas/ScoringResult' additionalProperties: false required: - generations - scores ScoringResult: type: object properties: score_rows: type: array items: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object aggregated_results: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - score_rows - aggregated_results Session: type: object properties: session_id: type: string session_name: type: string turns: type: array items: $ref: '#/components/schemas/Turn' started_at: type: string format: date-time additionalProperties: false required: - session_id - session_name - turns - started_at title: >- A single session of an interaction with an Agentic System. 
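# Illustrative only, not part of the generated schemas: a sketch of an
# EvaluateResponse. The row keys (generated_answer, score, accuracy) and the
# scoring function name are hypothetical; the schema only requires that
# scores maps each name to a ScoringResult with score_rows and
# aggregated_results.
#
#   generations:
#     - generated_answer: "4"
#   scores:
#     example-scoring-fn:
#       score_rows:
#         - score: 1.0
#       aggregated_results:
#         accuracy: 1.0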
AgentStepResponse: type: object properties: step: oneOf: - $ref: '#/components/schemas/InferenceStep' - $ref: '#/components/schemas/ToolExecutionStep' - $ref: '#/components/schemas/ShieldCallStep' - $ref: '#/components/schemas/MemoryRetrievalStep' discriminator: propertyName: step_type mapping: inference: '#/components/schemas/InferenceStep' tool_execution: '#/components/schemas/ToolExecutionStep' shield_call: '#/components/schemas/ShieldCallStep' memory_retrieval: '#/components/schemas/MemoryRetrievalStep' additionalProperties: false required: - step AgentTurnInputType: type: object properties: type: type: string const: agent_turn_input default: agent_turn_input additionalProperties: false required: - type ArrayType: type: object properties: type: type: string const: array default: array additionalProperties: false required: - type BooleanType: type: object properties: type: type: string const: boolean default: boolean additionalProperties: false required: - type ChatCompletionInputType: type: object properties: type: type: string const: chat_completion_input default: chat_completion_input additionalProperties: false required: - type CompletionInputType: type: object properties: type: type: string const: completion_input default: completion_input additionalProperties: false required: - type Dataset: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: dataset default: dataset dataset_schema: type: object additionalProperties: $ref: '#/components/schemas/ParamType' url: $ref: '#/components/schemas/URL' metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - identifier - provider_resource_id - provider_id - type - dataset_schema - url - metadata JsonType: type: object properties: type: type: string const: json default: json additionalProperties: false 
required: - type NumberType: type: object properties: type: type: string const: number default: number additionalProperties: false required: - type ObjectType: type: object properties: type: type: string const: object default: object additionalProperties: false required: - type ParamType: oneOf: - $ref: '#/components/schemas/StringType' - $ref: '#/components/schemas/NumberType' - $ref: '#/components/schemas/BooleanType' - $ref: '#/components/schemas/ArrayType' - $ref: '#/components/schemas/ObjectType' - $ref: '#/components/schemas/JsonType' - $ref: '#/components/schemas/UnionType' - $ref: '#/components/schemas/ChatCompletionInputType' - $ref: '#/components/schemas/CompletionInputType' - $ref: '#/components/schemas/AgentTurnInputType' discriminator: propertyName: type mapping: string: '#/components/schemas/StringType' number: '#/components/schemas/NumberType' boolean: '#/components/schemas/BooleanType' array: '#/components/schemas/ArrayType' object: '#/components/schemas/ObjectType' json: '#/components/schemas/JsonType' union: '#/components/schemas/UnionType' chat_completion_input: '#/components/schemas/ChatCompletionInputType' completion_input: '#/components/schemas/CompletionInputType' agent_turn_input: '#/components/schemas/AgentTurnInputType' StringType: type: object properties: type: type: string const: string default: string additionalProperties: false required: - type UnionType: type: object properties: type: type: string const: union default: union additionalProperties: false required: - type EvalTask: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: eval_task default: eval_task dataset_id: type: string scoring_functions: type: array items: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - identifier - provider_resource_id - 
provider_id - type - dataset_id - scoring_functions - metadata Model: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: model default: model metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object model_type: $ref: '#/components/schemas/ModelType' default: llm additionalProperties: false required: - identifier - provider_resource_id - provider_id - type - metadata - model_type ModelType: type: string enum: - llm - embedding PaginatedRowsResult: type: object properties: rows: type: array items: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object total_count: type: integer next_page_token: type: string additionalProperties: false required: - rows - total_count ScoringFn: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: scoring_function default: scoring_function description: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object return_type: $ref: '#/components/schemas/ParamType' params: $ref: '#/components/schemas/ScoringFnParams' additionalProperties: false required: - identifier - provider_resource_id - provider_id - type - metadata - return_type Shield: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: shield default: shield params: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - identifier - provider_resource_id - provider_id - type title: >- A safety shield resource that can be used to check content Span: type: object 
properties: span_id: type: string trace_id: type: string parent_span_id: type: string name: type: string start_time: type: string format: date-time end_time: type: string format: date-time attributes: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - span_id - trace_id - name - start_time SpanStatus: type: string enum: - ok - error SpanWithStatus: type: object properties: span_id: type: string trace_id: type: string parent_span_id: type: string name: type: string start_time: type: string format: date-time end_time: type: string format: date-time attributes: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object status: $ref: '#/components/schemas/SpanStatus' additionalProperties: false required: - span_id - trace_id - name - start_time QuerySpanTreeResponse: type: object properties: data: type: object additionalProperties: $ref: '#/components/schemas/SpanWithStatus' additionalProperties: false required: - data Tool: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: tool default: tool toolgroup_id: type: string tool_host: $ref: '#/components/schemas/ToolHost' description: type: string parameters: type: array items: $ref: '#/components/schemas/ToolParameter' metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - identifier - provider_resource_id - provider_id - type - toolgroup_id - tool_host - description - parameters ToolHost: type: string enum: - distribution - client - model_context_protocol ToolGroup: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: tool_group default: 
tool_group mcp_endpoint: $ref: '#/components/schemas/URL' args: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - identifier - provider_resource_id - provider_id - type Trace: type: object properties: trace_id: type: string root_span_id: type: string start_time: type: string format: date-time end_time: type: string format: date-time additionalProperties: false required: - trace_id - root_span_id - start_time Checkpoint: description: Checkpoint created during training runs PostTrainingJobArtifactsResponse: type: object properties: job_uuid: type: string checkpoints: type: array items: $ref: '#/components/schemas/Checkpoint' additionalProperties: false required: - job_uuid - checkpoints title: Artifacts of a finetuning job. JobStatus: type: string enum: - completed - in_progress - failed - scheduled PostTrainingJobStatusResponse: type: object properties: job_uuid: type: string status: $ref: '#/components/schemas/JobStatus' scheduled_at: type: string format: date-time started_at: type: string format: date-time completed_at: type: string format: date-time resources_allocated: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object checkpoints: type: array items: $ref: '#/components/schemas/Checkpoint' additionalProperties: false required: - job_uuid - status - checkpoints title: Status of a finetuning job. 
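# Illustrative only, not part of the generated schemas: a sketch of a
# PostTrainingJobStatusResponse for a job that is still running. The job
# identifier and timestamps are hypothetical. Only job_uuid, status, and
# checkpoints are required, so completed_at is simply omitted here.
#
#   job_uuid: example-job-1
#   status: in_progress
#   scheduled_at: "2025-01-01T00:00:00Z"
#   started_at: "2025-01-01T00:05:00Z"
#   checkpoints: []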
ListPostTrainingJobsResponse: type: object properties: data: type: array items: type: object properties: job_uuid: type: string additionalProperties: false required: - job_uuid additionalProperties: false required: - data VectorDB: type: object properties: identifier: type: string provider_resource_id: type: string provider_id: type: string type: type: string const: vector_db default: vector_db embedding_model: type: string embedding_dimension: type: integer additionalProperties: false required: - identifier - provider_resource_id - provider_id - type - embedding_model - embedding_dimension HealthInfo: type: object properties: status: type: string additionalProperties: false required: - status RAGDocument: type: object properties: document_id: type: string content: oneOf: - type: string - $ref: '#/components/schemas/InterleavedContentItem' - type: array items: $ref: '#/components/schemas/InterleavedContentItem' - $ref: '#/components/schemas/URL' mime_type: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - document_id - content - metadata InsertRequest: type: object properties: documents: type: array items: $ref: '#/components/schemas/RAGDocument' vector_db_id: type: string chunk_size_in_tokens: type: integer additionalProperties: false required: - documents - vector_db_id - chunk_size_in_tokens InsertChunksRequest: type: object properties: vector_db_id: type: string chunks: type: array items: type: object properties: content: $ref: '#/components/schemas/InterleavedContent' metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - content - metadata ttl_seconds: type: integer additionalProperties: false required: - vector_db_id - chunks InvokeToolRequest: type: object properties: tool_name: type: 
string kwargs: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - tool_name - kwargs ToolInvocationResult: type: object properties: content: $ref: '#/components/schemas/InterleavedContent' error_message: type: string error_code: type: integer additionalProperties: false required: - content ListDatasetsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/Dataset' additionalProperties: false required: - data ListEvalTasksResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/EvalTask' additionalProperties: false required: - data ListModelsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/Model' additionalProperties: false required: - data ProviderInfo: type: object properties: api: type: string provider_id: type: string provider_type: type: string additionalProperties: false required: - api - provider_id - provider_type ListProvidersResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/ProviderInfo' additionalProperties: false required: - data RouteInfo: type: object properties: route: type: string method: type: string provider_types: type: array items: type: string additionalProperties: false required: - route - method - provider_types ListRoutesResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/RouteInfo' additionalProperties: false required: - data ListScoringFunctionsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/ScoringFn' additionalProperties: false required: - data ListShieldsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/Shield' additionalProperties: false required: - data ListToolGroupsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/ToolGroup' 
additionalProperties: false required: - data ListToolsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/Tool' additionalProperties: false required: - data ListVectorDBsResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/VectorDB' additionalProperties: false required: - data Event: oneOf: - $ref: '#/components/schemas/UnstructuredLogEvent' - $ref: '#/components/schemas/MetricEvent' - $ref: '#/components/schemas/StructuredLogEvent' discriminator: propertyName: type mapping: unstructured_log: '#/components/schemas/UnstructuredLogEvent' metric: '#/components/schemas/MetricEvent' structured_log: '#/components/schemas/StructuredLogEvent' LogSeverity: type: string enum: - verbose - debug - info - warn - error - critical MetricEvent: type: object properties: trace_id: type: string span_id: type: string timestamp: type: string format: date-time attributes: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object type: type: string const: metric default: metric metric: type: string value: oneOf: - type: integer - type: number unit: type: string additionalProperties: false required: - trace_id - span_id - timestamp - type - metric - value - unit SpanEndPayload: type: object properties: type: type: string const: span_end default: span_end status: $ref: '#/components/schemas/SpanStatus' additionalProperties: false required: - type - status SpanStartPayload: type: object properties: type: type: string const: span_start default: span_start name: type: string parent_span_id: type: string additionalProperties: false required: - type - name StructuredLogEvent: type: object properties: trace_id: type: string span_id: type: string timestamp: type: string format: date-time attributes: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object type: type: string const: 
structured_log default: structured_log payload: $ref: '#/components/schemas/StructuredLogPayload' additionalProperties: false required: - trace_id - span_id - timestamp - type - payload StructuredLogPayload: oneOf: - $ref: '#/components/schemas/SpanStartPayload' - $ref: '#/components/schemas/SpanEndPayload' discriminator: propertyName: type mapping: span_start: '#/components/schemas/SpanStartPayload' span_end: '#/components/schemas/SpanEndPayload' UnstructuredLogEvent: type: object properties: trace_id: type: string span_id: type: string timestamp: type: string format: date-time attributes: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object type: type: string const: unstructured_log default: unstructured_log message: type: string severity: $ref: '#/components/schemas/LogSeverity' additionalProperties: false required: - trace_id - span_id - timestamp - type - message - severity LogEventRequest: type: object properties: event: $ref: '#/components/schemas/Event' ttl_seconds: type: integer additionalProperties: false required: - event - ttl_seconds DPOAlignmentConfig: type: object properties: reward_scale: type: number reward_clip: type: number epsilon: type: number gamma: type: number additionalProperties: false required: - reward_scale - reward_clip - epsilon - gamma DataConfig: type: object properties: dataset_id: type: string batch_size: type: integer shuffle: type: boolean data_format: $ref: '#/components/schemas/DatasetFormat' validation_dataset_id: type: string packed: type: boolean default: false train_on_input: type: boolean default: false additionalProperties: false required: - dataset_id - batch_size - shuffle - data_format DatasetFormat: type: string enum: - instruct - dialog EfficiencyConfig: type: object properties: enable_activation_checkpointing: type: boolean default: false enable_activation_offloading: type: boolean default: false memory_efficient_fsdp_wrap: type: boolean 
default: false fsdp_cpu_offload: type: boolean default: false additionalProperties: false OptimizerConfig: type: object properties: optimizer_type: $ref: '#/components/schemas/OptimizerType' lr: type: number weight_decay: type: number num_warmup_steps: type: integer additionalProperties: false required: - optimizer_type - lr - weight_decay - num_warmup_steps OptimizerType: type: string enum: - adam - adamw - sgd TrainingConfig: type: object properties: n_epochs: type: integer max_steps_per_epoch: type: integer gradient_accumulation_steps: type: integer max_validation_steps: type: integer data_config: $ref: '#/components/schemas/DataConfig' optimizer_config: $ref: '#/components/schemas/OptimizerConfig' efficiency_config: $ref: '#/components/schemas/EfficiencyConfig' dtype: type: string default: bf16 additionalProperties: false required: - n_epochs - max_steps_per_epoch - gradient_accumulation_steps - max_validation_steps - data_config - optimizer_config PreferenceOptimizeRequest: type: object properties: job_uuid: type: string finetuned_model: type: string algorithm_config: $ref: '#/components/schemas/DPOAlignmentConfig' training_config: $ref: '#/components/schemas/TrainingConfig' hyperparam_search_config: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object logger_config: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - job_uuid - finetuned_model - algorithm_config - training_config - hyperparam_search_config - logger_config PostTrainingJob: type: object properties: job_uuid: type: string additionalProperties: false required: - job_uuid DefaultRAGQueryGeneratorConfig: type: object properties: type: type: string const: default default: default separator: type: string default: ' ' additionalProperties: false required: - type - separator LLMRAGQueryGeneratorConfig: 
type: object properties: type: type: string const: llm default: llm model: type: string template: type: string additionalProperties: false required: - type - model - template RAGQueryConfig: type: object properties: query_generator_config: $ref: '#/components/schemas/RAGQueryGeneratorConfig' max_tokens_in_context: type: integer default: 4096 max_chunks: type: integer default: 5 additionalProperties: false required: - query_generator_config - max_tokens_in_context - max_chunks RAGQueryGeneratorConfig: oneOf: - $ref: '#/components/schemas/DefaultRAGQueryGeneratorConfig' - $ref: '#/components/schemas/LLMRAGQueryGeneratorConfig' discriminator: propertyName: type mapping: default: '#/components/schemas/DefaultRAGQueryGeneratorConfig' llm: '#/components/schemas/LLMRAGQueryGeneratorConfig' QueryRequest: type: object properties: content: $ref: '#/components/schemas/InterleavedContent' vector_db_ids: type: array items: type: string query_config: $ref: '#/components/schemas/RAGQueryConfig' additionalProperties: false required: - content - vector_db_ids RAGQueryResult: type: object properties: content: $ref: '#/components/schemas/InterleavedContent' additionalProperties: false QueryChunksRequest: type: object properties: vector_db_id: type: string query: $ref: '#/components/schemas/InterleavedContent' params: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - vector_db_id - query QueryChunksResponse: type: object properties: chunks: type: array items: type: object properties: content: $ref: '#/components/schemas/InterleavedContent' metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - content - metadata scores: type: array items: type: number additionalProperties: false required: - chunks - scores QueryCondition: type: object 
properties: key: type: string op: $ref: '#/components/schemas/QueryConditionOp' value: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - key - op - value QueryConditionOp: type: string enum: - eq - ne - gt - lt QuerySpansResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/Span' additionalProperties: false required: - data QueryTracesResponse: type: object properties: data: type: array items: $ref: '#/components/schemas/Trace' additionalProperties: false required: - data RegisterDatasetRequest: type: object properties: dataset_id: type: string dataset_schema: type: object additionalProperties: $ref: '#/components/schemas/ParamType' url: $ref: '#/components/schemas/URL' provider_dataset_id: type: string provider_id: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - dataset_id - dataset_schema - url RegisterEvalTaskRequest: type: object properties: eval_task_id: type: string dataset_id: type: string scoring_functions: type: array items: type: string provider_eval_task_id: type: string provider_id: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - eval_task_id - dataset_id - scoring_functions RegisterModelRequest: type: object properties: model_id: type: string provider_model_id: type: string provider_id: type: string metadata: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object model_type: $ref: '#/components/schemas/ModelType' additionalProperties: false required: - model_id RegisterScoringFunctionRequest: type: object properties: scoring_fn_id: type: string description: type: string 
return_type: $ref: '#/components/schemas/ParamType' provider_scoring_fn_id: type: string provider_id: type: string params: $ref: '#/components/schemas/ScoringFnParams' additionalProperties: false required: - scoring_fn_id - description - return_type RegisterShieldRequest: type: object properties: shield_id: type: string provider_shield_id: type: string provider_id: type: string params: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - shield_id RegisterToolGroupRequest: type: object properties: toolgroup_id: type: string provider_id: type: string mcp_endpoint: $ref: '#/components/schemas/URL' args: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - toolgroup_id - provider_id RegisterVectorDbRequest: type: object properties: vector_db_id: type: string embedding_model: type: string embedding_dimension: type: integer provider_id: type: string provider_vector_db_id: type: string additionalProperties: false required: - vector_db_id - embedding_model RunEvalRequest: type: object properties: task_config: $ref: '#/components/schemas/EvalTaskConfig' additionalProperties: false required: - task_config Job: type: object properties: job_id: type: string additionalProperties: false required: - job_id RunShieldRequest: type: object properties: shield_id: type: string messages: type: array items: $ref: '#/components/schemas/Message' params: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - shield_id - messages - params RunShieldResponse: type: object properties: violation: $ref: '#/components/schemas/SafetyViolation' additionalProperties: false SaveSpansToDatasetRequest: type: object properties: attribute_filters: type: 
array items: $ref: '#/components/schemas/QueryCondition' attributes_to_save: type: array items: type: string dataset_id: type: string max_depth: type: integer additionalProperties: false required: - attribute_filters - attributes_to_save - dataset_id ScoreRequest: type: object properties: input_rows: type: array items: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object scoring_functions: type: object additionalProperties: oneOf: - $ref: '#/components/schemas/ScoringFnParams' - type: 'null' additionalProperties: false required: - input_rows - scoring_functions ScoreResponse: type: object properties: results: type: object additionalProperties: $ref: '#/components/schemas/ScoringResult' additionalProperties: false required: - results ScoreBatchRequest: type: object properties: dataset_id: type: string scoring_functions: type: object additionalProperties: oneOf: - $ref: '#/components/schemas/ScoringFnParams' - type: 'null' save_results_dataset: type: boolean additionalProperties: false required: - dataset_id - scoring_functions - save_results_dataset ScoreBatchResponse: type: object properties: dataset_id: type: string results: type: object additionalProperties: $ref: '#/components/schemas/ScoringResult' additionalProperties: false required: - results AlgorithmConfig: oneOf: - $ref: '#/components/schemas/LoraFinetuningConfig' - $ref: '#/components/schemas/QATFinetuningConfig' discriminator: propertyName: type mapping: LoRA: '#/components/schemas/LoraFinetuningConfig' QAT: '#/components/schemas/QATFinetuningConfig' LoraFinetuningConfig: type: object properties: type: type: string const: LoRA default: LoRA lora_attn_modules: type: array items: type: string apply_lora_to_mlp: type: boolean apply_lora_to_output: type: boolean rank: type: integer alpha: type: integer use_dora: type: boolean default: false quantize_base: type: boolean default: false additionalProperties: false required: - type - 
lora_attn_modules - apply_lora_to_mlp - apply_lora_to_output - rank - alpha QATFinetuningConfig: type: object properties: type: type: string const: QAT default: QAT quantizer_name: type: string group_size: type: integer additionalProperties: false required: - type - quantizer_name - group_size SupervisedFineTuneRequest: type: object properties: job_uuid: type: string training_config: $ref: '#/components/schemas/TrainingConfig' hyperparam_search_config: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object logger_config: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object model: type: string checkpoint_dir: type: string algorithm_config: $ref: '#/components/schemas/AlgorithmConfig' additionalProperties: false required: - job_uuid - training_config - hyperparam_search_config - logger_config - model SyntheticDataGenerateRequest: type: object properties: dialogs: type: array items: $ref: '#/components/schemas/Message' filtering_function: type: string enum: - none - random - top_k - top_p - top_k_top_p - sigmoid title: The type of filtering function. model: type: string additionalProperties: false required: - dialogs - filtering_function SyntheticDataGenerationResponse: type: object properties: synthetic_data: type: array items: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object statistics: type: object additionalProperties: oneOf: - type: 'null' - type: boolean - type: number - type: string - type: array - type: object additionalProperties: false required: - synthetic_data title: >- Response from the synthetic data generation. Batch of (prompt, response, score) tuples that pass the threshold. 
    VersionInfo:
      type: object
      properties:
        version:
          type: string
      additionalProperties: false
      required:
        - version
  responses: {}
security:
  - Default: []
tags:
  - name: Agents
    description: >-
      Main functionalities provided by this API:

      - Create agents with specific instructions and ability to use tools.

      - Interactions with agents are grouped into sessions ("threads"), and each
      interaction is called a "turn".

      - Agents can be provided with various tools (see the ToolGroups and ToolRuntime
      APIs for more details).

      - Agents can be provided with various shields (see the Safety API for more
      details).

      - Agents can also use Memory to retrieve information from knowledge bases.
      See the RAG Tool and Vector IO APIs for more details.
    x-displayName: >-
      Agents API for creating and interacting with agentic systems.
  - name: BatchInference (Coming Soon)
  - name: DatasetIO
  - name: Datasets
  - name: Eval
  - name: EvalTasks
  - name: Inference
    description: >-
      This API provides the raw interface to the underlying models. Two kinds of
      models are supported:

      - LLM models: these models generate "raw" and "chat" (conversational) completions.

      - Embedding models: these models generate embeddings to be used for semantic
      search.
    x-displayName: >-
      Llama Stack Inference API for generating completions, chat completions, and
      embeddings.
  - name: Inspect
  - name: Models
  - name: PostTraining (Coming Soon)
  - name: Safety
  - name: Scoring
  - name: ScoringFunctions
  - name: Shields
  - name: SyntheticDataGeneration (Coming Soon)
  - name: Telemetry
  - name: ToolGroups
  - name: ToolRuntime
  - name: VectorDBs
  - name: VectorIO
x-tagGroups:
  - name: Operations
    tags:
      - Agents
      - BatchInference (Coming Soon)
      - DatasetIO
      - Datasets
      - Eval
      - EvalTasks
      - Inference
      - Inspect
      - Models
      - PostTraining (Coming Soon)
      - Safety
      - Scoring
      - ScoringFunctions
      - Shields
      - SyntheticDataGeneration (Coming Soon)
      - Telemetry
      - ToolGroups
      - ToolRuntime
      - VectorDBs
      - VectorIO