mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-07 12:47:37 +00:00
Merge 6b4940806f
into 188a56af5c
This commit is contained in:
commit
1e04f105f2
20 changed files with 840 additions and 34 deletions
|
@ -1,9 +1,10 @@
|
|||
---
|
||||
description: "Llama Stack Inference API for generating completions, chat completions, and embeddings.
|
||||
|
||||
This API provides the raw interface to the underlying models. Two kinds of models are supported:
|
||||
This API provides the raw interface to the underlying models. Three kinds of models are supported:
|
||||
- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
|
||||
- Embedding models: these models generate embeddings to be used for semantic search."
|
||||
- Embedding models: these models generate embeddings to be used for semantic search.
|
||||
- Rerank models: these models reorder the documents based on their relevance to a query."
|
||||
sidebar_label: Inference
|
||||
title: Inference
|
||||
---
|
||||
|
@ -14,8 +15,9 @@ title: Inference
|
|||
|
||||
Llama Stack Inference API for generating completions, chat completions, and embeddings.
|
||||
|
||||
This API provides the raw interface to the underlying models. Two kinds of models are supported:
|
||||
This API provides the raw interface to the underlying models. Three kinds of models are supported:
|
||||
- LLM models: these models generate "raw" and "chat" (conversational) completions.
|
||||
- Embedding models: these models generate embeddings to be used for semantic search.
|
||||
- Rerank models: these models reorder the documents based on their relevance to a query.
|
||||
|
||||
This section contains documentation for all available providers for the **inference** API.
|
||||
|
|
2
docs/static/deprecated-llama-stack-spec.html
vendored
2
docs/static/deprecated-llama-stack-spec.html
vendored
|
@ -13335,7 +13335,7 @@
|
|||
},
|
||||
{
|
||||
"name": "Inference",
|
||||
"description": "This API provides the raw interface to the underlying models. Two kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.",
|
||||
"description": "This API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
|
||||
"x-displayName": "Llama Stack Inference API for generating completions, chat completions, and embeddings."
|
||||
},
|
||||
{
|
||||
|
|
7
docs/static/deprecated-llama-stack-spec.yaml
vendored
7
docs/static/deprecated-llama-stack-spec.yaml
vendored
|
@ -9990,13 +9990,16 @@ tags:
|
|||
description: ''
|
||||
- name: Inference
|
||||
description: >-
|
||||
This API provides the raw interface to the underlying models. Two kinds of models
|
||||
are supported:
|
||||
This API provides the raw interface to the underlying models. Three kinds of
|
||||
models are supported:
|
||||
|
||||
- LLM models: these models generate "raw" and "chat" (conversational) completions.
|
||||
|
||||
- Embedding models: these models generate embeddings to be used for semantic
|
||||
search.
|
||||
|
||||
- Rerank models: these models reorder the documents based on their relevance
|
||||
to a query.
|
||||
x-displayName: >-
|
||||
Llama Stack Inference API for generating completions, chat completions, and
|
||||
embeddings.
|
||||
|
|
|
@ -4992,7 +4992,7 @@
|
|||
"properties": {
|
||||
"model": {
|
||||
"type": "string",
|
||||
"description": "The identifier of the reranking model to use."
|
||||
"description": "The identifier of the reranking model to use. The model must be a reranking model registered with Llama Stack and available via the /models endpoint."
|
||||
},
|
||||
"query": {
|
||||
"oneOf": [
|
||||
|
|
|
@ -3657,7 +3657,8 @@ components:
|
|||
model:
|
||||
type: string
|
||||
description: >-
|
||||
The identifier of the reranking model to use.
|
||||
The identifier of the reranking model to use. The model must be a reranking
|
||||
model registered with Llama Stack and available via the /models endpoint.
|
||||
query:
|
||||
oneOf:
|
||||
- type: string
|
||||
|
|
5
docs/static/llama-stack-spec.html
vendored
5
docs/static/llama-stack-spec.html
vendored
|
@ -6829,7 +6829,8 @@
|
|||
"type": "string",
|
||||
"enum": [
|
||||
"llm",
|
||||
"embedding"
|
||||
"embedding",
|
||||
"rerank"
|
||||
],
|
||||
"title": "ModelType",
|
||||
"description": "Enumeration of supported model types in Llama Stack."
|
||||
|
@ -12883,7 +12884,7 @@
|
|||
},
|
||||
{
|
||||
"name": "Inference",
|
||||
"description": "This API provides the raw interface to the underlying models. Two kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.",
|
||||
"description": "This API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
|
||||
"x-displayName": "Llama Stack Inference API for generating completions, chat completions, and embeddings."
|
||||
},
|
||||
{
|
||||
|
|
8
docs/static/llama-stack-spec.yaml
vendored
8
docs/static/llama-stack-spec.yaml
vendored
|
@ -5158,6 +5158,7 @@ components:
|
|||
enum:
|
||||
- llm
|
||||
- embedding
|
||||
- rerank
|
||||
title: ModelType
|
||||
description: >-
|
||||
Enumeration of supported model types in Llama Stack.
|
||||
|
@ -9728,13 +9729,16 @@ tags:
|
|||
description: ''
|
||||
- name: Inference
|
||||
description: >-
|
||||
This API provides the raw interface to the underlying models. Two kinds of models
|
||||
are supported:
|
||||
This API provides the raw interface to the underlying models. Three kinds of
|
||||
models are supported:
|
||||
|
||||
- LLM models: these models generate "raw" and "chat" (conversational) completions.
|
||||
|
||||
- Embedding models: these models generate embeddings to be used for semantic
|
||||
search.
|
||||
|
||||
- Rerank models: these models reorder the documents based on their relevance
|
||||
to a query.
|
||||
x-displayName: >-
|
||||
Llama Stack Inference API for generating completions, chat completions, and
|
||||
embeddings.
|
||||
|
|
7
docs/static/stainless-llama-stack-spec.html
vendored
7
docs/static/stainless-llama-stack-spec.html
vendored
|
@ -8838,7 +8838,8 @@
|
|||
"type": "string",
|
||||
"enum": [
|
||||
"llm",
|
||||
"embedding"
|
||||
"embedding",
|
||||
"rerank"
|
||||
],
|
||||
"title": "ModelType",
|
||||
"description": "Enumeration of supported model types in Llama Stack."
|
||||
|
@ -17033,7 +17034,7 @@
|
|||
"properties": {
|
||||
"model": {
|
||||
"type": "string",
|
||||
"description": "The identifier of the reranking model to use."
|
||||
"description": "The identifier of the reranking model to use. The model must be a reranking model registered with Llama Stack and available via the /models endpoint."
|
||||
},
|
||||
"query": {
|
||||
"oneOf": [
|
||||
|
@ -18456,7 +18457,7 @@
|
|||
},
|
||||
{
|
||||
"name": "Inference",
|
||||
"description": "This API provides the raw interface to the underlying models. Two kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.",
|
||||
"description": "This API provides the raw interface to the underlying models. Three kinds of models are supported:\n- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.\n- Embedding models: these models generate embeddings to be used for semantic search.\n- Rerank models: these models reorder the documents based on their relevance to a query.",
|
||||
"x-displayName": "Llama Stack Inference API for generating completions, chat completions, and embeddings."
|
||||
},
|
||||
{
|
||||
|
|
11
docs/static/stainless-llama-stack-spec.yaml
vendored
11
docs/static/stainless-llama-stack-spec.yaml
vendored
|
@ -6603,6 +6603,7 @@ components:
|
|||
enum:
|
||||
- llm
|
||||
- embedding
|
||||
- rerank
|
||||
title: ModelType
|
||||
description: >-
|
||||
Enumeration of supported model types in Llama Stack.
|
||||
|
@ -12693,7 +12694,8 @@ components:
|
|||
model:
|
||||
type: string
|
||||
description: >-
|
||||
The identifier of the reranking model to use.
|
||||
The identifier of the reranking model to use. The model must be a reranking
|
||||
model registered with Llama Stack and available via the /models endpoint.
|
||||
query:
|
||||
oneOf:
|
||||
- type: string
|
||||
|
@ -13774,13 +13776,16 @@ tags:
|
|||
description: ''
|
||||
- name: Inference
|
||||
description: >-
|
||||
This API provides the raw interface to the underlying models. Two kinds of models
|
||||
are supported:
|
||||
This API provides the raw interface to the underlying models. Three kinds of
|
||||
models are supported:
|
||||
|
||||
- LLM models: these models generate "raw" and "chat" (conversational) completions.
|
||||
|
||||
- Embedding models: these models generate embeddings to be used for semantic
|
||||
search.
|
||||
|
||||
- Rerank models: these models reorder the documents based on their relevance
|
||||
to a query.
|
||||
x-displayName: >-
|
||||
Llama Stack Inference API for generating completions, chat completions, and
|
||||
embeddings.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue