Add rerank models and rerank API change

2025-12-12 12:06:04 +00:00 · 2025-10-16 17:27:38 -07:00 · 2025-10-16 17:27:38 -07:00 · 51c923f096
commit 51c923f096
parent f675fdda0f
12 changed files with 215 additions and 28 deletions
--- a/docs/static/llama-stack-spec.yaml
+++ b/docs/static/llama-stack-spec.yaml
@@ -5269,6 +5269,7 @@ components:
      enum:
        - llm
        - embedding
+        - rerank
      title: ModelType
      description: >-
        Enumeration of supported model types in Llama Stack.
@@ -10182,13 +10183,16 @@ tags:
      embeddings.


-      This API provides the raw interface to the underlying models. Two kinds of models
-      are supported:
+      This API provides the raw interface to the underlying models. Three kinds of
+      models are supported:

      - LLM models: these models generate "raw" and "chat" (conversational) completions.

      - Embedding models: these models generate embeddings to be used for semantic
      search.
+
+      - Rerank models (Experimental): these models reorder the documents based on
+      their relevance to a query.
    x-displayName: Inference
  - name: Inspect
    description: >-