# Why Llama Stack?

Building production AI applications today requires solving multiple challenges:

**Infrastructure Complexity**

- Running large language models efficiently requires specialized infrastructure.
- Different deployment scenarios (local development, cloud, edge) need different solutions.
- Moving from development to production often requires significant rework.

**Essential Capabilities**

- Safety guardrails and content filtering are necessary in an enterprise setting.
- Model inference alone is not enough: knowledge retrieval and RAG capabilities are required.
- Nearly any application needs composable multi-step workflows.
- Finally, without monitoring, observability, and evaluation, you end up operating in the dark.

**Lack of Flexibility and Choice**

- Directly integrating with multiple providers creates tight coupling.
- Different providers have different APIs and abstractions.
- Changing providers requires significant code changes.

### Our Solution: A Universal Stack

```{image} ../../_static/llama-stack.png
:alt: Llama Stack
:width: 400px
```

Llama Stack addresses these challenges through a service-oriented, API-first approach:

**Develop Anywhere, Deploy Everywhere**

- Start locally with CPU-only setups
- Move to GPU acceleration when needed
- Deploy to cloud or edge without code changes
- Same APIs and developer experience everywhere (see the sketch below)

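To make that concrete, here is a minimal sketch using the `llama-stack-client` Python package. The endpoint and model ID are placeholders, and method names may differ across client versions; the point is that the code is identical whether the server runs on a laptop or in the cloud.

```python
from llama_stack_client import LlamaStackClient

# Point the client at any running Llama Stack server: a local CPU-only
# setup, a GPU box, or a cloud deployment. Only the URL changes.
client = LlamaStackClient(base_url="http://localhost:8321")

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model ID
    messages=[{"role": "user", "content": "Hello, Llama Stack!"}],
)
print(response.completion_message.content)
```
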
**Production-Ready Building Blocks**

- Pre-built safety guardrails and content filtering (sketched below)
- Built-in RAG and agent capabilities
- Comprehensive evaluation toolkit
- Full observability and monitoring

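As a sketch of the pre-built guardrails, a registered shield can screen messages before they ever reach a model. The shield ID below is a placeholder, and the exact response fields may vary by version:

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Run a registered safety shield over user input before inference.
# "llama_guard" is a placeholder shield ID; available shields can be
# listed with client.shields.list().
result = client.safety.run_shield(
    shield_id="llama_guard",
    messages=[{"role": "user", "content": "Tell me something unsafe."}],
    params={},
)
if result.violation:
    print("Blocked:", result.violation.user_message)
```
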
**True Provider Independence**

- Swap providers without application changes (see the sketch below)
- Mix and match best-in-class implementations
- Federation and fallback support
- No vendor lock-in

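Because every provider sits behind the same REST APIs, swapping one for another (say, a local Ollama-backed stack for a hosted one) is a server configuration change rather than an application change. A minimal sketch, assuming a placeholder `LLAMA_STACK_URL` environment variable:

```python
import os

from llama_stack_client import LlamaStackClient

# Which inference provider backs the stack (local Ollama, a cloud
# service, etc.) is decided in the server's run configuration.
# The application only chooses which stack to talk to.
base_url = os.environ.get("LLAMA_STACK_URL", "http://localhost:8321")
client = LlamaStackClient(base_url=base_url)

# Identical application code regardless of the provider behind it.
for model in client.models.list():
    print(model.identifier)
```
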
**Robust Ecosystem**

- Llama Stack is already integrated with distribution partners (cloud providers, hardware vendors, and AI-focused companies).
- The ecosystem offers tailored infrastructure, software, and services for deploying a variety of models.

### Our Philosophy

- **Service-Oriented**: REST APIs enforce clean interfaces and enable seamless transitions across different environments.
- **Composability**: Every component is independent but works together seamlessly.
- **Production Ready**: Built for real-world applications, not just demos.
- **Turnkey Solutions**: Easy-to-deploy, built-in solutions for popular deployment scenarios.

With Llama Stack, you can focus on building your application while we handle the infrastructure complexity, essential capabilities, and provider integrations.