# Available Distributions

Llama Stack provides several pre-configured distributions to help you get started quickly. Choose the distribution that best fits your hardware and use case.
## Quick Reference

| Distribution | Use Case | Hardware Requirements | Provider |
|---|---|---|---|
| `distribution-starter` | General purpose, prototyping | Any (CPU/GPU) | Ollama, Remote APIs |
| `distribution-meta-reference-gpu` | High-performance inference | GPU required | Local GPU inference |
| Remote-hosted | Production, managed service | None | Partner providers |
| iOS/Android SDK | Mobile applications | Mobile device | On-device inference |
## Choose Your Distribution

### 🚀 Getting Started (Recommended for Beginners)

Use `distribution-starter` if you want to:
- Prototype quickly without GPU requirements
- Use remote inference providers (Fireworks, Together, vLLM, etc.)
- Run locally with Ollama for development
```bash
docker pull llama-stack/distribution-starter
```
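Once pulled, the image is started like any other Llama Stack distribution container. The snippet below is a minimal sketch, assuming the server listens on port 8321 and that a local Ollama instance is exposed via the `OLLAMA_URL` environment variable; treat the port and variable name as assumptions and confirm them in the Starter Distribution Guide.

```bash
# Sketch: run the starter distribution against a local Ollama server.
# Port 8321 and OLLAMA_URL are assumptions; check the guide for the
# exact values your release expects.
docker run -it \
  -p 8321:8321 \
  -v ~/.llama:/root/.llama \
  llama-stack/distribution-starter \
  --port 8321 \
  --env OLLAMA_URL=http://host.docker.internal:11434
```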
Guides: Starter Distribution Guide
### 🖥️ Self-Hosted with GPU

Use `distribution-meta-reference-gpu` if you:
- Have access to GPU hardware
- Want maximum performance and control
- Need to run inference locally
```bash
docker pull llama-stack/distribution-meta-reference-gpu
```
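Running the image looks much like the starter case, with the host's GPUs passed through and locally downloaded model checkpoints mounted into the container. The sketch below is illustrative only: the `INFERENCE_MODEL` variable, the port, and the model identifier are assumptions, so follow the Meta Reference GPU Guide for the options supported by your release.

```bash
# Sketch: expose host GPUs and mount local checkpoints.
# INFERENCE_MODEL, port 8321, and the model id are assumptions.
docker run -it \
  --gpus all \
  -p 8321:8321 \
  -v ~/.llama:/root/.llama \
  llama-stack/distribution-meta-reference-gpu \
  --port 8321 \
  --env INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
```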
Guides: Meta Reference GPU Guide
### ☁️ Managed Hosting

Use remote-hosted endpoints if you:
- Don't want to manage infrastructure
- Need production-ready reliability
- Prefer managed services
Partners: Fireworks.ai and Together.xyz
Guides: Remote-Hosted Endpoints
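With a hosted endpoint nothing runs locally; you only point a client at the partner's URL. A minimal sketch using the `llama-stack-client` CLI is shown below; the endpoint URL is a placeholder, and the exact flags may differ between client versions, so treat this as illustrative.

```bash
# Sketch: point the CLI at a hosted endpoint (URL is a placeholder).
llama-stack-client configure --endpoint https://your-hosted-endpoint.example.com

# List the models the endpoint serves.
llama-stack-client models list
```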
### 📱 Mobile Development

Use mobile SDKs if you:
- Are building iOS or Android applications
- Need on-device inference capabilities
- Want offline functionality
### 🔧 Custom Solutions

Build your own distribution if:
- None of the above fit your specific needs
- You need custom configurations
- You want to optimize for your specific use case
Guides: Building Custom Distributions
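Custom distributions are assembled with the `llama stack build` command from the llama-stack CLI. The sketch below assumes you start from an existing template and produce a container image; template and flag names can vary between releases, so treat it as illustrative and follow the Building Custom Distributions guide for specifics.

```bash
# Sketch: build a custom distribution starting from the starter template.
# --template and --image-type values are assumptions; see the build guide.
llama stack build --template starter --image-type container
```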
## Detailed Documentation

### Self-Hosted Distributions

```{toctree}
:maxdepth: 1

self_hosted_distro/starter
self_hosted_distro/meta-reference-gpu
```
### Remote-Hosted Solutions

```{toctree}
:maxdepth: 1

remote_hosted_distro/index
```
### Mobile SDKs

```{toctree}
:maxdepth: 1

ondevice_distro/ios_sdk
ondevice_distro/android_sdk
```
## Decision Flow

```{mermaid}
graph TD
    A[What's your use case?] --> B{Need mobile app?}
    B -->|Yes| C[Use Mobile SDKs]
    B -->|No| D{Have GPU hardware?}
    D -->|Yes| E[Use Meta Reference GPU]
    D -->|No| F{Want managed hosting?}
    F -->|Yes| G[Use Remote-Hosted]
    F -->|No| H[Use Starter Distribution]
```
## Next Steps

1. Choose your distribution from the options above
2. Follow the setup guide for your selected distribution
3. Configure your providers with API keys or local models (see the sketch below)
4. Start building with Llama Stack!
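For step 3, remote providers typically read their credentials from environment variables set before the server starts. The variable names below follow common provider naming but are assumptions here; check each provider's documentation page for the exact names.

```bash
# Sketch: export provider API keys before launching a distribution.
# Variable names are assumptions; confirm them in the provider docs.
export FIREWORKS_API_KEY=your-fireworks-key
export TOGETHER_API_KEY=your-together-key
```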
For help choosing or troubleshooting, check our Getting Started Guide or Community Support.