Fix docker failing to start container

Swapna Lekkala 2025-08-27 15:36:57 -07:00
parent 52106d95d3
commit 83ede71e76
4 changed files with 22 additions and 11 deletions


@@ -4,12 +4,12 @@
 Agents API for creating and interacting with agentic systems.
 Main functionalities provided by this API:
 - Create agents with specific instructions and ability to use tools.
 - Interactions with agents are grouped into sessions ("threads"), and each interaction is called a "turn".
 - Agents can be provided with various tools (see the ToolGroups and ToolRuntime APIs for more details).
 - Agents can be provided with various shields (see the Safety API for more details).
 - Agents can also use Memory to retrieve information from knowledge bases. See the RAG Tool and Vector IO APIs for more details.
 This section contains documentation for all available providers for the **agents** API.


@@ -4,9 +4,9 @@
 Llama Stack Inference API for generating completions, chat completions, and embeddings.
 This API provides the raw interface to the underlying models. Two kinds of models are supported:
 - LLM models: these models generate "raw" and "chat" (conversational) completions.
 - Embedding models: these models generate embeddings to be used for semantic search.
 This section contains documentation for all available providers for the **inference** API.


@@ -116,7 +116,7 @@ def available_providers() -> list[ProviderSpec]:
         adapter=AdapterSpec(
             adapter_type="fireworks",
             pip_packages=[
-                "fireworks-ai",
+                "fireworks-ai==0.17.16",
             ],
             module="llama_stack.providers.remote.inference.fireworks",
             config_class="llama_stack.providers.remote.inference.fireworks.FireworksImplConfig",


@@ -6,7 +6,9 @@
 from typing import Any

+from openai.types.chat import ChatCompletionContentPartImageParam, ChatCompletionContentPartTextParam
+
-from llama_stack.apis.inference import ChatCompletionRequest
+from llama_stack.apis.inference import ChatCompletionRequest, RerankResponse
 from llama_stack.providers.utils.inference.litellm_openai_mixin import (
     LiteLLMOpenAIMixin,
 )
@@ -50,3 +52,12 @@ class VertexAIInferenceAdapter(LiteLLMOpenAIMixin):
         params.pop("api_key", None)
         return params
+
+    async def rerank(
+        self,
+        model: str,
+        query: str | ChatCompletionContentPartTextParam | ChatCompletionContentPartImageParam,
+        items: list[str | ChatCompletionContentPartTextParam | ChatCompletionContentPartImageParam],
+        max_num_results: int | None = None,
+    ) -> RerankResponse:
+        raise NotImplementedError("Reranking is not supported for Vertex AI")
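The new `rerank` method is a fail-fast stub: callers get a clear `NotImplementedError` instead of a silent no-op. The sketch below shows the same shape in isolation; the real method lives on `VertexAIInferenceAdapter` and uses the `openai` and `llama_stack` types from the diff, which are replaced here with plain aliases so the stub can run standalone.

```python
from __future__ import annotations

from typing import Any

# Stand-ins for ChatCompletionContentPartTextParam /
# ChatCompletionContentPartImageParam / RerankResponse from the real imports.
ContentPart = Any
RerankResponse = Any


class VertexAIInferenceAdapterSketch:
    """Minimal stand-in for the adapter; only the new stub is shown."""

    async def rerank(
        self,
        model: str,
        query: str | ContentPart,
        items: list[str | ContentPart],
        max_num_results: int | None = None,
    ) -> RerankResponse:
        # Vertex AI has no rerank support in this adapter, so the method
        # raises immediately rather than returning an empty result.
        raise NotImplementedError("Reranking is not supported for Vertex AI")
```

Because the method is `async`, the exception surfaces only when the coroutine is awaited, e.g. via `asyncio.run(...)`.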