mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-25 13:32:00 +00:00
- keep old provider table in README.md - get full list of provider table into "docs" index.md - move docker images for distro we do not maintain into a separate table Signed-off-by: Wen Zhou <wenzhou@redhat.com>
# Providers Overview

The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations of the same API. Examples include:
- LLM inference providers (e.g., Meta Reference, Ollama, Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, vLLM, OpenAI, Anthropic, Gemini, WatsonX),
- Vector databases (e.g., FAISS, SQLite-Vec, ChromaDB, Weaviate, Qdrant, Milvus, PGVector),
- Safety providers (e.g., Meta's Llama Guard, Prompt Guard, Code Scanner, AWS Bedrock Guardrails),
- Tool Runtime providers (e.g., RAG Runtime, Brave Search).

Providers come in two flavors:
- **Remote**: the provider runs as a separate service external to the Llama Stack codebase. Llama Stack contains a small amount of adapter code.
- **Inline**: the provider is fully specified and implemented within the Llama Stack codebase. It may be a simple wrapper around an existing library, or a full-fledged implementation within Llama Stack.

Importantly, Llama Stack always strives to provide at least one fully inline provider for each API so you can iterate on a fully featured environment locally.

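
The remote/inline distinction can be illustrated with a minimal sketch. Every name below (`InferenceProvider`, `InlineEchoProvider`, `RemoteAdapterProvider`, `fake_service`) is hypothetical and chosen for illustration only; none of them is Llama Stack's actual interface. The sketch only shows the idea that application code depends on one API while the implementation behind it is swapped.

```python
from typing import Callable, Protocol


class InferenceProvider(Protocol):
    """Hypothetical API contract; not Llama Stack's actual interface."""

    def complete(self, prompt: str) -> str: ...


class InlineEchoProvider:
    """Inline flavor: the implementation lives in-process."""

    def complete(self, prompt: str) -> str:
        return f"[inline] {prompt}"


class RemoteAdapterProvider:
    """Remote flavor: thin adapter code that forwards to an external service."""

    def __init__(self, send: Callable[[str], str]) -> None:
        # `send` stands in for a client call to the external service.
        self._send = send

    def complete(self, prompt: str) -> str:
        return self._send(prompt)


def run(provider: InferenceProvider, prompt: str) -> str:
    # Application code depends only on the API, so providers are swappable.
    return provider.complete(prompt)


def fake_service(prompt: str) -> str:
    # Placeholder for a real network round trip (assumption for this sketch).
    return f"[remote] {prompt}"


inline_result = run(InlineEchoProvider(), "hello")
remote_result = run(RemoteAdapterProvider(fake_service), "hello")
```

Here `run` never changes; only the object passed to it does, which is the property the two provider flavors are meant to preserve.
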
## Available Providers

Here is a comprehensive list of all available API providers in Llama Stack:

| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Telemetry | Post Training | Eval | DatasetIO | Tool Runtime | Scoring |
|:----------------------:|:------------------:|:------:|:---------:|:--------:|:------:|:---------:|:-------------:|:----:|:---------:|:----------:|:-------:|
| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| SambaNova | Hosted | | ✅ | | ✅ | | | | | | |
| Cerebras | Hosted | | ✅ | | | | | | | | |
| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | | | | |
| AWS Bedrock | Hosted | | ✅ | | ✅ | | | | | | |
| Together | Hosted | ✅ | ✅ | | ✅ | | | | | | |
| Groq | Hosted | | ✅ | | | | | | | | |
| Ollama | Single Node | | ✅ | | | | | | | | |
| TGI | Hosted/Single Node | | ✅ | | | | | | | | |
| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | | | | |
| ChromaDB | Hosted/Single Node | | | ✅ | | | | | | | |
| PG Vector | Single Node | | | ✅ | | | | | | | |
| vLLM | Single Node | | ✅ | | | | | | | | |
| OpenAI | Hosted | | ✅ | | | | | | | | |
| Anthropic | Hosted | | ✅ | | | | | | | | |
| Gemini | Hosted | | ✅ | | | | | | | | |
| WatsonX | Hosted | | ✅ | | | | | | | | |
| HuggingFace | Single Node | | | | | | ✅ | | ✅ | | |
| TorchTune | Single Node | | | | | | ✅ | | | | |
| NVIDIA NEMO | Hosted | | ✅ | ✅ | | | ✅ | ✅ | ✅ | | |
| NVIDIA | Hosted | | | | | | ✅ | ✅ | ✅ | | |
| FAISS | Single Node | | | ✅ | | | | | | | |
| SQLite-Vec | Single Node | | | ✅ | | | | | | | |
| Qdrant | Hosted/Single Node | | | ✅ | | | | | | | |
| Weaviate | Hosted | | | ✅ | | | | | | | |
| Milvus | Hosted/Single Node | | | ✅ | | | | | | | |
| Prompt Guard | Single Node | | | | ✅ | | | | | | |
| Llama Guard | Single Node | | | | ✅ | | | | | | |
| Code Scanner | Single Node | | | | ✅ | | | | | | |
| Brave Search | Hosted | | | | | | | | | ✅ | |
| Bing Search | Hosted | | | | | | | | | ✅ | |
| RAG Runtime | Single Node | | | | | | | | | ✅ | |
| Model Context Protocol | Hosted | | | | | | | | | ✅ | |
| Sentence Transformers | Single Node | | ✅ | | | | | | | | |
| Braintrust | Single Node | | | | | | | | | | ✅ |
| Basic | Single Node | | | | | | | | | | ✅ |
| LLM-as-Judge | Single Node | | | | | | | | | | ✅ |
| Databricks | Hosted | | ✅ | | | | | | | | |
| RunPod | Hosted | | ✅ | | | | | | | | |
| Passthrough | Hosted | | ✅ | | | | | | | | |
| PyTorch ExecuTorch | On-device iOS, Android | ✅ | ✅ | | | | | | | | |

## External Providers

Llama Stack supports external providers that live outside of the main codebase. This allows you to create and maintain your own providers independently.

```{toctree}
:maxdepth: 1

external
```

## Agents
Runs multi-step agentic workflows with LLMs, with tool usage, memory (RAG), etc.

```{toctree}
:maxdepth: 1

agents/index
```

## DatasetIO
Interfaces with datasets and data loaders.

```{toctree}
:maxdepth: 1

datasetio/index
```

## Eval
Generates outputs (via Inference or Agents) and performs scoring.

```{toctree}
:maxdepth: 1

eval/index
```

## Inference
Runs inference with an LLM.

```{toctree}
:maxdepth: 1

inference/index
```

## Post Training
Fine-tunes a model.

```{toctree}
:maxdepth: 1

post_training/index
```

## Safety
Applies safety policies to the output at the system (not only model) level.

```{toctree}
:maxdepth: 1

safety/index
```

## Scoring
Evaluates the outputs of the system.

```{toctree}
:maxdepth: 1

scoring/index
```

## Telemetry
Collects telemetry data from the system.

```{toctree}
:maxdepth: 1

telemetry/index
```

## Tool Runtime
Is associated with the ToolGroup resources.

```{toctree}
:maxdepth: 1

tool_runtime/index
```

## Vector IO

Vector IO refers to operations on vector databases, such as adding documents, searching, and deleting documents.
Vector IO plays a crucial role in [Retrieval Augmented Generation (RAG)](../../building_applications/rag), where Vector IO
and the vector database are used to store and retrieve documents.

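
The three operations named above (adding, searching, deleting) can be sketched with a toy in-memory store. This is an illustrative assumption, not Llama Stack's Vector IO API: the class `InMemoryVectorStore`, its method names, and the hand-written two-dimensional embeddings are all hypothetical, and real providers such as FAISS or Milvus use indexed search rather than a full scan.

```python
import math


class InMemoryVectorStore:
    """Toy store illustrating add / search / delete; not a Llama Stack API."""

    def __init__(self) -> None:
        # Maps a document id to its (embedding, text) pair.
        self._docs: dict[str, tuple[list[float], str]] = {}

    def add(self, doc_id: str, embedding: list[float], text: str) -> None:
        self._docs[doc_id] = (embedding, text)

    def delete(self, doc_id: str) -> None:
        # Deleting an unknown id is a no-op.
        self._docs.pop(doc_id, None)

    def search(self, query: list[float], k: int = 1) -> list[str]:
        # Rank stored documents by cosine similarity to the query embedding.
        def cosine(a: list[float], b: list[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(
            self._docs.items(),
            key=lambda item: cosine(query, item[1][0]),
            reverse=True,
        )
        return [doc_id for doc_id, _ in ranked[:k]]


store = InMemoryVectorStore()
store.add("a", [1.0, 0.0], "about cats")
store.add("b", [0.0, 1.0], "about dogs")
top_before = store.search([0.9, 0.1])
store.delete("a")
top_after = store.search([0.9, 0.1])
```

In a RAG setting, the ids returned by `search` would be used to fetch the stored text and place it in the model's prompt.
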
```{toctree}
:maxdepth: 1

vector_io/index
```