# Llama Stack Distribution
A Distribution is where APIs and Providers are assembled together to provide a consistent whole to the end application developer. You can mix-and-match providers -- some could be backed by local code and some could be remote. As a hobbyist, you can serve a small model locally while choosing a cloud provider for a large model. Either way, the higher-level APIs your app works with don't need to change at all. You can even move across the server / mobile-device boundary while always using the same uniform set of APIs for developing Generative AI applications.
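Concretely, here is what that uniformity looks like from application code. This is a minimal sketch assuming the `llama-stack-client` Python package (`pip install llama-stack-client`) and a distribution already running locally; the port, model name, and exact parameter spelling (`model` vs. `model_id` varies across client versions) are illustrative assumptions:

```python
# Minimal sketch: the same client code works against any of the
# distributions in the table below -- only base_url changes when you
# swap a local Ollama-backed server for a cloud-backed one.
# Assumptions: llama-stack-client is installed, a distribution is
# serving on localhost:5000, and the model name is a placeholder.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

response = client.inference.chat_completion(
    model_id="Llama3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.completion_message.content)
```

Because Inference, Agents, Memory, Safety, and Telemetry are all served behind the same API surface, switching providers is a deployment decision rather than a code change.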
## Quick Start Llama Stack Distributions Guide
| Distribution | Llama Stack Docker | Start This Distribution | Inference | Agents | Memory | Safety | Telemetry |
|---|---|---|---|---|---|---|---|
| Meta Reference | `llamastack/distribution-meta-reference-gpu` | Guide | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Ollama | `llamastack/distribution-ollama` | Guide | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| TGI | `llamastack/distribution-tgi` | Guide | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Together | `llamastack/distribution-together` | Guide | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| Fireworks | `llamastack/distribution-fireworks` | Guide | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |