Remove experimental from rerank models doc

Jiayi 2025-10-17 14:51:17 -07:00
parent 51c923f096
commit ad52849072
9 changed files with 13 additions and 12 deletions

@@ -6,7 +6,7 @@ description: "Inference
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models (Experimental): these models reorder the documents based on their relevance to a query."
+- Rerank models: these models reorder the documents based on their relevance to a query."
 sidebar_label: Inference
 title: Inference
 ---
@@ -22,6 +22,6 @@ Inference
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance to a query.
 This section contains documentation for all available providers for the **inference** API.

@@ -13459,7 +13459,7 @@
 },
 {
 "name": "Inference",
-"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.",
+"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
 "x-displayName": "Inference"
 },
 {

@@ -10218,8 +10218,8 @@ tags:
 - Embedding models: these models generate embeddings to be used for semantic
 search.
-- Rerank models (Experimental): these models reorder the documents based on
-their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
 x-displayName: Inference
 - name: Models
 description: ''

@@ -13262,7 +13262,7 @@
 },
 {
 "name": "Inference",
-"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.",
+"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
 "x-displayName": "Inference"
 },
 {

@@ -10191,8 +10191,8 @@ tags:
 - Embedding models: these models generate embeddings to be used for semantic
 search.
-- Rerank models (Experimental): these models reorder the documents based on
-their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
 x-displayName: Inference
 - name: Inspect
 description: >-

@@ -17952,7 +17952,7 @@
 },
 {
 "name": "Inference",
-"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.",
+"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
 "x-displayName": "Inference"
 },
 {

@@ -13586,8 +13586,8 @@ tags:
 - Embedding models: these models generate embeddings to be used for semantic
 search.
-- Rerank models (Experimental): these models reorder the documents based on
-their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
 x-displayName: Inference
 - name: Inspect
 description: >-

@@ -1237,7 +1237,7 @@ class Inference(InferenceProvider):
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance to a query.
 """
 @webmethod(route="/openai/v1/chat/completions", method="GET", level=LLAMA_STACK_API_V1, deprecated=True)

@@ -48,6 +48,7 @@ class OpenAIMixin(NeedsRequestProviderData, ABC, BaseModel):
 - overwrite_completion_id: If True, overwrites the 'id' field in OpenAI responses
 - download_images: If True, downloads images and converts to base64 for providers that require it
 - embedding_model_metadata: A dictionary mapping model IDs to their embedding metadata
+- rerank_model_list: A list of model IDs for rerank models
 - provider_data_api_key_field: Optional field name in provider data to look for API key
 - list_provider_model_ids: Method to list available models from the provider
 - get_extra_client_params: Method to provide extra parameters to the AsyncOpenAI client
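
Reviewer note: as a rough illustration of the attribute list in the last hunk, the sketch below shows one way a provider could use `embedding_model_metadata` and `rerank_model_list` to decide which of the three documented model kinds a model ID belongs to. The `classify_model` helper, the `ModelKind` alias, and the sample model IDs are hypothetical and are not taken from the `OpenAIMixin` source. The precedence order (embedding, then rerank, then LLM) is also an assumption.

```python
from typing import Any, Literal

# Hypothetical alias for the three kinds of models named in the Inference docs.
ModelKind = Literal["llm", "embedding", "rerank"]


def classify_model(
    model_id: str,
    embedding_model_metadata: dict[str, dict[str, Any]],
    rerank_model_list: list[str],
) -> ModelKind:
    """Return the assumed kind of ``model_id`` (illustrative helper only)."""
    # A model ID with embedding metadata is treated as an embedding model.
    if model_id in embedding_model_metadata:
        return "embedding"
    # A model ID named in the rerank list is treated as a rerank model.
    if model_id in rerank_model_list:
        return "rerank"
    # Assumption: anything not listed in either place is treated as an LLM.
    return "llm"


if __name__ == "__main__":
    # Made-up model IDs and metadata, used only to exercise the helper.
    embedding_meta = {"example-embedder": {"embedding_dimension": 768}}
    rerankers = ["example-reranker"]
    for mid in ("example-embedder", "example-reranker", "example-chat"):
        print(mid, "->", classify_model(mid, embedding_meta, rerankers))
```

Keeping the rerank IDs in a plain list mirrors the one-line description added in this commit. Whether the real mixin checks the list in this order is not shown in the diff.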