Add rerank models to the dynamic model list; Fix integration tests

Author: Jiayi
Date: 2025-09-28 14:45:16 -07:00
parent 3538477070
commit 816b68fdc7
8 changed files with 247 additions and 25 deletions


@@ -18,14 +18,14 @@ title: Batches
## Overview
The Batches API enables efficient processing of multiple requests in a single operation,
particularly useful for processing large datasets, batch evaluation workflows, and
cost-effective inference at scale.

The API is designed to allow use of OpenAI client libraries for seamless integration.

This API provides the following extensions:
- idempotent batch creation

Note: This API is currently under active development and may undergo changes.
This section contains documentation for all available providers for the **batches** API.
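
Since the overview highlights OpenAI client compatibility and idempotent batch creation, a minimal sketch of that usage pattern follows. The base URL, the placeholder API key, and the `idempotency_key` field passed via `extra_body` are assumptions for illustration, not confirmed Llama Stack parameter names.

```python
# Sketch: creating a batch through an OpenAI-compatible Batches endpoint.
from openai import OpenAI

# Assumed local Llama Stack endpoint exposing the OpenAI-compatible API.
client = OpenAI(base_url="http://localhost:8321/v1/openai/v1", api_key="none")

# Upload a JSONL file containing the batched requests.
batch_file = client.files.create(
    file=open("requests.jsonl", "rb"),
    purpose="batch",
)

# Create the batch; extra_body carries the (assumed) idempotency extension,
# so retrying the same request should not create a second batch.
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={"idempotency_key": "my-batch-001"},
)
print(batch.id, batch.status)
```

Because the request goes through the standard `openai` client, existing batch tooling can be pointed at a Llama Stack server by changing only the base URL.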


@@ -5,6 +5,7 @@ description: "Llama Stack Inference API for generating completions, chat complet
- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
- Embedding models: these models generate embeddings to be used for semantic search.
- Rerank models: these models rerank the documents by relevance."
sidebar_label: Inference
title: Inference
---
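
Given that this commit adds rerank models to the dynamic model list, here is a short sketch of how a client might discover them. The `"rerank"` value for `model_type`, the base URL, and the printed attributes are assumptions for illustration rather than confirmed API details.

```python
# Sketch: listing registered models and filtering for rerank models.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# The dynamic model list now includes rerank models alongside LLM and
# embedding models; filter on the (assumed) "rerank" model_type value.
rerank_models = [
    m for m in client.models.list()
    if getattr(m, "model_type", None) == "rerank"
]

for m in rerank_models:
    print(m.identifier, m.provider_id)
```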