mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-12 04:00:42 +00:00
Remove experimental from rerank models doc
parent
51c923f096
commit
ad52849072
9 changed files with 13 additions and 12 deletions
@@ -6,7 +6,7 @@ description: "Inference
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models (Experimental): these models reorder the documents based on their relevance to a query."
+- Rerank models: these models reorder the documents based on their relevance to a query."
 sidebar_label: Inference
 title: Inference
 ---
@@ -22,6 +22,6 @@ Inference
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance to a query.

 This section contains documentation for all available providers for the **inference** API.
2
docs/static/deprecated-llama-stack-spec.html
vendored
@@ -13459,7 +13459,7 @@
 },
 {
 "name": "Inference",
-"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.",
+"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
 "x-displayName": "Inference"
 },
 {
4
docs/static/deprecated-llama-stack-spec.yaml
vendored
@@ -10218,8 +10218,8 @@ tags:
 - Embedding models: these models generate embeddings to be used for semantic
 search.

-- Rerank models (Experimental): these models reorder the documents based on
-their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
 x-displayName: Inference
 - name: Models
 description: ''
2
docs/static/llama-stack-spec.html
vendored
@@ -13262,7 +13262,7 @@
 },
 {
 "name": "Inference",
-"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.",
+"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
 "x-displayName": "Inference"
 },
 {
4
docs/static/llama-stack-spec.yaml
vendored
@@ -10191,8 +10191,8 @@ tags:
 - Embedding models: these models generate embeddings to be used for semantic
 search.

-- Rerank models (Experimental): these models reorder the documents based on
-their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
 x-displayName: Inference
 - name: Inspect
 description: >-
2
docs/static/stainless-llama-stack-spec.html
vendored
@@ -17952,7 +17952,7 @@
 },
 {
 "name": "Inference",
-"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.",
+"description": "Llama Stack Inference API for generating completions, chat completions, and embeddings.\n\nThis API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
 "x-displayName": "Inference"
 },
 {
4
docs/static/stainless-llama-stack-spec.yaml
vendored
@@ -13586,8 +13586,8 @@ tags:
 - Embedding models: these models generate embeddings to be used for semantic
 search.

-- Rerank models (Experimental): these models reorder the documents based on
-their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
 x-displayName: Inference
 - name: Inspect
 description: >-

@@ -1237,7 +1237,7 @@ class Inference(InferenceProvider):
 This API provides the raw interface to the underlying models. Three kinds of models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
-- Rerank models (Experimental): these models reorder the documents based on their relevance to a query.
+- Rerank models: these models reorder the documents based on their relevance to a query.
 """

 @webmethod(route="/openai/v1/chat/completions", method="GET", level=LLAMA_STACK_API_V1, deprecated=True)

@@ -48,6 +48,7 @@ class OpenAIMixin(NeedsRequestProviderData, ABC, BaseModel):
 - overwrite_completion_id: If True, overwrites the 'id' field in OpenAI responses
 - download_images: If True, downloads images and converts to base64 for providers that require it
 - embedding_model_metadata: A dictionary mapping model IDs to their embedding metadata
+- rerank_model_list: A list of model IDs for rerank models
 - provider_data_api_key_field: Optional field name in provider data to look for API key
 - list_provider_model_ids: Method to list available models from the provider
 - get_extra_client_params: Method to provide extra parameters to the AsyncOpenAI client
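Note on the wording changed in this commit: the docs describe rerank models as reordering documents based on their relevance to a query. The sketch below illustrates only that idea in plain Python. The names `RankedDocument` and `toy_rerank` and the term-overlap score are hypothetical stand-ins invented for this illustration; they are not taken from the Llama Stack API, and a real rerank model would supply the relevance scores.

```python
from dataclasses import dataclass


@dataclass
class RankedDocument:
    # Hypothetical result type, not a Llama Stack class.
    index: int              # position of the document in the input list
    relevance_score: float  # higher means more relevant to the query


def toy_rerank(query: str, documents: list[str]) -> list[RankedDocument]:
    """Reorder documents by a toy relevance score (term overlap with the query).

    Assumption: the overlap ratio stands in for the score a real rerank
    model would produce, so the example runs without any dependencies.
    """
    query_terms = set(query.lower().split())
    ranked = []
    for index, document in enumerate(documents):
        document_terms = set(document.lower().split())
        overlap = len(query_terms & document_terms)
        score = overlap / len(query_terms) if query_terms else 0.0
        ranked.append(RankedDocument(index=index, relevance_score=score))
    # Sort by descending score; sorted() is stable, so ties keep input order.
    return sorted(ranked, key=lambda item: item.relevance_score, reverse=True)


docs = [
    "embedding models generate vectors",
    "rerank models reorder documents by relevance",
    "chat completions are conversational",
]
result = toy_rerank("rerank documents by relevance", docs)
for item in result:
    print(item.index, item.relevance_score, docs[item.index])
```

The second document shares every query term, so it moves to the front; the other two score zero and keep their original relative order.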