docs: update full list of providers with matched APIs and dockerhub images (#2452)

# What does this PR do?
- add `model_type` to the example model entry
- rename the "Memory" column to "VectorIO"
- update index.md and README.md


## Test Plan
Run `pre-commit` to catch any issues with the changes.

---------

Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Co-authored-by: Sébastien Han <seb@redhat.com>


@@ -35,6 +35,8 @@ pip install llama-stack-client
### CLI

```bash
# Run a chat completion
MODEL="Llama-4-Scout-17B-16E-Instruct"
llama-stack-client --endpoint http://localhost:8321 \
  inference chat-completion \
  --model-id meta-llama/$MODEL \
@@ -106,46 +108,59 @@ By reducing friction and complexity, Llama Stack empowers developers to focus on
### API Providers

Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack. Please check out the [full list](https://llama-stack.readthedocs.io/en/latest/providers/index.html).

| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Telemetry | Post Training | Eval | DatasetIO |
|:-------------------:|:------------:|:------:|:---------:|:--------:|:------:|:---------:|:-------------:|:----:|:---------:|
| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| SambaNova | Hosted | | ✅ | | ✅ | | | | |
| Cerebras | Hosted | | ✅ | | | | | | |
| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | | |
| AWS Bedrock | Hosted | | ✅ | | ✅ | | | | |
| Together | Hosted | ✅ | ✅ | | ✅ | | | | |
| Groq | Hosted | | ✅ | | | | | | |
| Ollama | Single Node | | ✅ | | | | | | |
| TGI | Hosted/Single Node | | ✅ | | | | | | |
| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | | |
| ChromaDB | Hosted/Single Node | | | ✅ | | | | | |
| PG Vector | Single Node | | | ✅ | | | | | |
| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | | |
| vLLM | Single Node | | ✅ | | | | | | |
| OpenAI | Hosted | | ✅ | | | | | | |
| Anthropic | Hosted | | ✅ | | | | | | |
| Gemini | Hosted | | ✅ | | | | | | |
| WatsonX | Hosted | | ✅ | | | | | | |
| HuggingFace | Single Node | | | | | | ✅ | | ✅ |
| TorchTune | Single Node | | | | | | ✅ | | |
| NVIDIA NEMO | Hosted | | ✅ | ✅ | | | ✅ | ✅ | ✅ |
| NVIDIA | Hosted | | | | | | ✅ | ✅ | ✅ |

> **Note**: Additional providers are available through external packages. See [External Providers](https://llama-stack.readthedocs.io/en/latest/providers/external.html) documentation.
### Distributions

A Llama Stack Distribution (or "distro") is a pre-configured bundle of provider implementations for each API component. Distributions make it easy to get started with a specific deployment scenario - you can begin with a local development setup (e.g., Ollama) and seamlessly transition to production (e.g., Fireworks) without changing your application code.
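In practice, that swap is a run-configuration change rather than a code change. Here is a minimal sketch of the idea, using the run-config format covered later in this commit; the exact Fireworks config fields are an assumption for illustration:

```yaml
# dev: point the inference API at a local Ollama server
providers:
  inference:
  - provider_id: ollama
    provider_type: remote::ollama
    config:
      url: ${env.OLLAMA_URL:=http://localhost:11434}
  # prod: swap in a hosted provider instead; application code
  # and client calls stay exactly the same
  # - provider_id: fireworks
  #   provider_type: remote::fireworks
  #   config:
  #     api_key: ${env.FIREWORKS_API_KEY}  # assumed field name
```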
Here are some of the distributions we support:
| **Distribution** | **Llama Stack Docker** | Start This Distribution |
|:----------------:|:----------------------:|:-----------------------:|
| Meta Reference | [llamastack/distribution-meta-reference-gpu](https://hub.docker.com/repository/docker/llamastack/distribution-meta-reference-gpu/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/meta-reference-gpu.html) |
| TGI | [llamastack/distribution-tgi](https://hub.docker.com/repository/docker/llamastack/distribution-tgi/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/tgi.html) |
| vLLM | [llamastack/distribution-remote-vllm](https://hub.docker.com/repository/docker/llamastack/distribution-remote-vllm/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/remote-vllm.html) |
| Starter | [llamastack/distribution-starter](https://hub.docker.com/repository/docker/llamastack/distribution-starter/general) | |
| PostgreSQL | [llamastack/distribution-postgres-demo](https://hub.docker.com/repository/docker/llamastack/distribution-postgres-demo/general) | |

Here are the ones out of support scope but still available from Dockerhub:

| **Distribution** | **Llama Stack Docker** | Start This Distribution |
|:----------------:|:----------------------:|:-----------------------:|
| Ollama | [llamastack/distribution-ollama](https://hub.docker.com/repository/docker/llamastack/distribution-ollama/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/ollama.html) |
| Together | [llamastack/distribution-together](https://hub.docker.com/repository/docker/llamastack/distribution-together/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/together.html) |
| Fireworks | [llamastack/distribution-fireworks](https://hub.docker.com/repository/docker/llamastack/distribution-fireworks/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/fireworks.html) |
| AWS Bedrock | [llamastack/distribution-bedrock](https://hub.docker.com/repository/docker/llamastack/distribution-bedrock/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/bedrock.html) |
| SambaNova | [llamastack/distribution-sambanova](https://hub.docker.com/repository/docker/llamastack/distribution-sambanova/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/sambanova.html) |
| Cerebras | [llamastack/distribution-cerebras](https://hub.docker.com/repository/docker/llamastack/distribution-cerebras/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/cerebras.html) |
### Documentation


@@ -80,7 +80,7 @@ providers:
  # provider_id is a string you can choose freely
  - provider_id: ollama
    # provider_type is a string that specifies the type of provider.
    # in this case, the provider for inference is ollama and it runs remotely (outside of the distribution)
    provider_type: remote::ollama
    # config is a dictionary that contains the configuration for the provider.
    # in this case, the configuration is the url of the ollama server
@@ -88,7 +88,7 @@ providers:
      url: ${env.OLLAMA_URL:=http://localhost:11434}
```
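Note the `${env.OLLAMA_URL:=http://localhost:11434}` value above: run configs support environment-variable substitution, and the `:=` form supplies a default used when the variable is unset. A small sketch of the two forms that appear in this document (the keys are illustrative):

```yaml
config:
  # plain substitution: resolved from the environment
  model: ${env.INFERENCE_MODEL}
  # substitution with a default after := for when OLLAMA_URL is unset
  url: ${env.OLLAMA_URL:=http://localhost:11434}
```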
A few things to note:
- A _provider instance_ is identified with an (id, type, config) triplet.
- The id is a string you can choose freely.
- You can instantiate any number of provider instances of the same type (see the sketch below).
- The configuration dictionary is provider-specific.
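For example, here is a minimal sketch of two Ollama provider instances registered side by side; the `ollama-gpu` id and the `OLLAMA_GPU_URL` variable are made-up names for illustration:

```yaml
providers:
  inference:
  # first instance: local development server
  - provider_id: ollama
    provider_type: remote::ollama
    config:
      url: ${env.OLLAMA_URL:=http://localhost:11434}
  # second instance of the same type, with its own id and config
  - provider_id: ollama-gpu
    provider_type: remote::ollama
    config:
      url: ${env.OLLAMA_GPU_URL:=http://gpu-host:11434}
```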
@@ -187,7 +187,7 @@ The environment variable substitution system is type-safe:
## Resources

Let's look at the `models` section:

```yaml
models:
@@ -195,8 +195,9 @@ models:
  model_id: ${env.INFERENCE_MODEL}
  provider_id: ollama
  provider_model_id: null
  model_type: llm
```
A Model is an instance of a "Resource" (see [Concepts](../concepts/index)) and is associated with a specific inference provider (in this case, the provider with identifier `ollama`). This is an instance of a "pre-registered" model. While we always encourage clients to register models before using them, some Stack servers may come up with a list of "already known and available" models.
What's with the `provider_model_id` field? This is an identifier for the model inside the provider's model catalog. Contrast it with `model_id` which is the identifier for the same model for Llama Stack's purposes. For example, you may want to name "llama3.2:vision-11b" as "image_captioning_model" when you use it in your Stack interactions. When omitted, the server will set `provider_model_id` to be the same as `model_id`.
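As a concrete illustration of that aliasing, here is a minimal sketch reusing the names from the paragraph above:

```yaml
models:
- model_id: image_captioning_model        # the name your application uses
  provider_id: ollama
  provider_model_id: llama3.2:vision-11b  # the name in the provider's catalog
  model_type: llm
```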


@@ -73,17 +73,26 @@ A number of "adapters" are available for some popular Inference and Vector Store
| OpenAI | Hosted |
| Anthropic | Hosted |
| Gemini | Hosted |
| WatsonX | Hosted |
**Agents API**
| **Provider** | **Environments** |
| :----: | :----: |
| Meta Reference | Single Node |
| Fireworks | Hosted |
| Together | Hosted |
| PyTorch ExecuTorch | On-device iOS |
**Vector IO API**

| **Provider** | **Environments** |
| :----: | :----: |
| FAISS | Single Node |
| SQLite-Vec | Single Node |
| Chroma | Hosted and Single Node |
| Milvus | Hosted and Single Node |
| Postgres (PGVector) | Hosted and Single Node |
| Weaviate | Hosted |
| Qdrant | Hosted and Single Node |
**Safety API**

| **Provider** | **Environments** |
@@ -93,6 +102,30 @@ A number of "adapters" are available for some popular Inference and Vector Store
| Code Scanner | Single Node |
| AWS Bedrock | Hosted |
**Post Training API**
| **Provider** | **Environments** |
| :----: | :----: |
| Meta Reference | Single Node |
| HuggingFace | Single Node |
| TorchTune | Single Node |
| NVIDIA NEMO | Hosted |
**Eval API**
| **Provider** | **Environments** |
| :----: | :----: |
| Meta Reference | Single Node |
| NVIDIA NEMO | Hosted |
**Telemetry API**
| **Provider** | **Environments** |
| :----: | :----: |
| Meta Reference | Single Node |
**Tool Runtime API**
| **Provider** | **Environments** |
| :----: | :----: |
| Brave Search | Hosted |
| RAG Runtime | Single Node |
```{toctree}
:hidden:


@@ -1,9 +1,10 @@
# Providers Overview

The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations for the same API. Examples for these include:
- LLM inference providers (e.g., Meta Reference, Ollama, Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, vLLM, OpenAI, Anthropic, Gemini, WatsonX, etc.),
- Vector databases (e.g., FAISS, SQLite-Vec, ChromaDB, Weaviate, Qdrant, Milvus, PGVector, etc.),
- Safety providers (e.g., Meta's Llama Guard, Prompt Guard, Code Scanner, AWS Bedrock Guardrails, etc.),
- Tool Runtime providers (e.g., RAG Runtime, Brave Search, etc.)
Providers come in two flavors:
- **Remote**: the provider runs as a separate service external to the Llama Stack codebase. Llama Stack contains a small amount of adapter code.