Fix rerank integration test based on client side changes

Jiayi 2025-10-01 10:37:58 -07:00
parent bb2eb33fc3
commit 6b4940806f
8 changed files with 27 additions and 276 deletions

@@ -6603,6 +6603,7 @@ components:
 enum:
 - llm
 - embedding
+- rerank
 title: ModelType
 description: >-
 Enumeration of supported model types in Llama Stack.
@@ -12693,7 +12694,8 @@ components:
 model:
 type: string
 description: >-
-The identifier of the reranking model to use.
+The identifier of the reranking model to use. The model must be a reranking
+model registered with Llama Stack and available via the /models endpoint.
 query:
 oneOf:
 - type: string
@@ -13774,13 +13776,16 @@ tags:
 description: ''
 - name: Inference
 description: >-
-This API provides the raw interface to the underlying models. Two kinds of models
-are supported:
+This API provides the raw interface to the underlying models. Three kinds of
+models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic
 search.
+- Rerank models: these models reorder the documents based on their relevance
+to a query.
x-displayName: >-
Llama Stack Inference API for generating completions, chat completions, and
embeddings.