Merge 0c2b82b30a into ee107aadd6

2025-12-03 01:48:05 +00:00 · 2025-12-02 09:58:57 +01:00 · 2025-12-02 09:58:57 +01:00 · e484c29d45
commit e484c29d45
parent ee107aadd6 0c2b82b30a
7 changed files with 31 additions and 30 deletions
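Note on the documentation hunks below: each one removes extra leading spaces from the continuation lines of a description. The commit does not show how these index.mdx files are produced. Assuming they are generated from Python docstrings (an assumption the diff does not confirm), the sketch below shows why `textwrap.dedent` alone would leave such text unchanged while `inspect.cleandoc` normalizes it. The `raw` string and the `indent_block` helper are hypothetical and exist only for illustration.

```python
import inspect
import textwrap

# Hypothetical docstring shaped like the descriptions in the hunks: the first
# line is flush, continuation lines still carry source-code indentation.
raw = "Agents\n    APIs for creating and interacting with agentic systems."

# dedent() strips only whitespace common to ALL non-blank lines. The first
# line has none, so nothing is removed.
dedented = textwrap.dedent(raw)

# cleandoc() treats the first line separately and removes the common
# indentation of the remaining lines.
cleaned = inspect.cleandoc(raw)


def indent_block(text: str, prefix: str = "  ") -> str:
    """Indent every non-empty line uniformly (hypothetical helper), as the
    body of a YAML `description: |` block scalar requires."""
    return "\n".join(prefix + line if line else line for line in text.splitlines())


front_matter_body = indent_block(cleaned)
print(front_matter_body)
```

With uniform indentation, a YAML literal block scalar yields the description without stray leading spaces, which matches the shape of the `+` lines in the hunks.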
--- a/docs/docs/providers/agents/index.mdx
+++ b/docs/docs/providers/agents/index.mdx
@@ -2,7 +2,7 @@
 description: |
  Agents
-      APIs for creating and interacting with agentic systems.
+  APIs for creating and interacting with agentic systems.
 sidebar_label: Agents
 title: Agents
 ---
@@ -13,6 +13,6 @@ title: Agents
 Agents
-    APIs for creating and interacting with agentic systems.
+APIs for creating and interacting with agentic systems.
 This section contains documentation for all available providers for the **agents** API.
--- a/docs/docs/providers/batches/index.mdx
+++ b/docs/docs/providers/batches/index.mdx
@@ -1,15 +1,15 @@
 ---
 description: |
  The Batches API enables efficient processing of multiple requests in a single operation,
-      particularly useful for processing large datasets, batch evaluation workflows, and
+  particularly useful for processing large datasets, batch evaluation workflows, and
-      cost-effective inference at scale.
+  cost-effective inference at scale.
-      The API is designed to allow use of openai client libraries for seamless integration.
+  The API is designed to allow use of openai client libraries for seamless integration.
-      This API provides the following extensions:
+  This API provides the following extensions:
-       - idempotent batch creation
+   - idempotent batch creation
-      Note: This API is currently under active development and may undergo changes.
+  Note: This API is currently under active development and may undergo changes.
 sidebar_label: Batches
 title: Batches
 ---
@@ -19,14 +19,14 @@ title: Batches
 ## Overview
 The Batches API enables efficient processing of multiple requests in a single operation,
-    particularly useful for processing large datasets, batch evaluation workflows, and
+particularly useful for processing large datasets, batch evaluation workflows, and
-    cost-effective inference at scale.
+cost-effective inference at scale.
-    The API is designed to allow use of openai client libraries for seamless integration.
+The API is designed to allow use of openai client libraries for seamless integration.
-    This API provides the following extensions:
+This API provides the following extensions:
-     - idempotent batch creation
+ - idempotent batch creation
-    Note: This API is currently under active development and may undergo changes.
+Note: This API is currently under active development and may undergo changes.
 This section contains documentation for all available providers for the **batches** API.
--- a/docs/docs/providers/eval/index.mdx
+++ b/docs/docs/providers/eval/index.mdx
@@ -2,7 +2,7 @@
 description: |
  Evaluations
-      Llama Stack Evaluation API for running evaluations on model and agent candidates.
+  Llama Stack Evaluation API for running evaluations on model and agent candidates.
 sidebar_label: Eval
 title: Eval
 ---
@@ -13,6 +13,6 @@ title: Eval
 Evaluations
-    Llama Stack Evaluation API for running evaluations on model and agent candidates.
+Llama Stack Evaluation API for running evaluations on model and agent candidates.
 This section contains documentation for all available providers for the **eval** API.
--- a/docs/docs/providers/files/index.mdx
+++ b/docs/docs/providers/files/index.mdx
@@ -2,7 +2,7 @@
 description: |
  Files
-      This API is used to upload documents that can be used with other Llama Stack APIs.
+  This API is used to upload documents that can be used with other Llama Stack APIs.
 sidebar_label: Files
 title: Files
 ---
@@ -13,6 +13,6 @@ title: Files
 Files
-    This API is used to upload documents that can be used with other Llama Stack APIs.
+This API is used to upload documents that can be used with other Llama Stack APIs.
 This section contains documentation for all available providers for the **files** API.
--- a/docs/docs/providers/inference/index.mdx
+++ b/docs/docs/providers/inference/index.mdx
@@ -2,12 +2,12 @@
 description: |
  Inference
-      Llama Stack Inference API for generating completions, chat completions, and embeddings.
+  Llama Stack Inference API for generating completions, chat completions, and embeddings.
-      This API provides the raw interface to the underlying models. Three kinds of models are supported:
+  This API provides the raw interface to the underlying models. Three kinds of models are supported:
-      - LLM models: these models generate "raw" and "chat" (conversational) completions.
+  - LLM models: these models generate "raw" and "chat" (conversational) completions.
-      - Embedding models: these models generate embeddings to be used for semantic search.
+  - Embedding models: these models generate embeddings to be used for semantic search.
-      - Rerank models: these models reorder the documents based on their relevance to a query.
+  - Rerank models: these models reorder the documents based on their relevance to a query.
 sidebar_label: Inference
 title: Inference
 ---
@@ -18,11 +18,11 @@ title: Inference
 Inference
-    Llama Stack Inference API for generating completions, chat completions, and embeddings.
+Llama Stack Inference API for generating completions, chat completions, and embeddings.
-    This API provides the raw interface to the underlying models. Three kinds of models are supported:
+This API provides the raw interface to the underlying models. Three kinds of models are supported:
-    - LLM models: these models generate "raw" and "chat" (conversational) completions.
+- LLM models: these models generate "raw" and "chat" (conversational) completions.
-    - Embedding models: these models generate embeddings to be used for semantic search.
+- Embedding models: these models generate embeddings to be used for semantic search.
-    - Rerank models: these models reorder the documents based on their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance to a query.
 This section contains documentation for all available providers for the **inference** API.
--- a/docs/docs/providers/safety/index.mdx
+++ b/docs/docs/providers/safety/index.mdx
@@ -2,7 +2,7 @@
 description: |
  Safety
-      OpenAI-compatible Moderations API.
+  OpenAI-compatible Moderations API.
 sidebar_label: Safety
 title: Safety
 ---
@@ -13,6 +13,6 @@ title: Safety
 Safety
-    OpenAI-compatible Moderations API.
+OpenAI-compatible Moderations API.
 This section contains documentation for all available providers for the **safety** API.
--- a/src/llama_stack/providers/inline/agents/meta_reference/responses/streaming.py
+++ b/src/llama_stack/providers/inline/agents/meta_reference/responses/streaming.py
@@ -250,6 +250,7 @@ class StreamingResponseOrchestrator:
                    messages=messages,
                    # Pydantic models are dict-compatible but mypy treats them as distinct types
                    tools=self.ctx.chat_tools,  # type: ignore[arg-type]
                    parallel_tool_calls=self.parallel_tool_calls,
                    stream=True,
                    temperature=self.ctx.temperature,
                    response_format=response_format,