Update index.md

2025-10-03 19:57:35 +00:00 · 2025-09-10 11:39:39 -07:00 · 2025-09-10 11:39:39 -07:00 · 78375889ec
commit 78375889ec
parent f66718be80
2 changed files with 3 additions and 2 deletions
--- a/docs/docs/providers/inference/index.mdx
+++ b/docs/docs/providers/inference/index.mdx
@@ -18,6 +18,6 @@ Llama Stack Inference API for generating completions, chat completions, and embe
    This API provides the raw interface to the underlying models. Three kinds of models are supported:
    - LLM models: these models generate "raw" and "chat" (conversational) completions.
    - Embedding models: these models generate embeddings to be used for semantic search.
-    - Rerank models: these models rerank the documents by relevance.
+    - Rerank models: these models reorder the documents by relevance.
 This section contains documentation for all available providers for the **inference** API.
--- a/llama_stack/apis/inference/inference.py
+++ b/llama_stack/apis/inference/inference.py
@@ -1159,9 +1159,10 @@ class InferenceProvider(Protocol):
 class Inference(InferenceProvider):
    """Llama Stack Inference API for generating completions, chat completions, and embeddings.
-    This API provides the raw interface to the underlying models. Two kinds of models are supported:
+    This API provides the raw interface to the underlying models. Three kinds of models are supported:
    - LLM models: these models generate "raw" and "chat" (conversational) completions.
    - Embedding models: these models generate embeddings to be used for semantic search.
+    - Rerank models: these models reorder the documents by relevance.
    """
    @webmethod(route="/openai/v1/chat/completions", method="GET", level=LLAMA_STACK_API_V1, deprecated=True)