## Why Llama Stack?
Building production AI applications today requires solving multiple challenges:
### Infrastructure Complexity
- Running large language models efficiently requires specialized infrastructure.
- Different deployment scenarios (local development, cloud, edge) need different solutions.
- Moving from development to production often requires significant rework.
### Essential Capabilities
- Safety guardrails and content filtering are necessary in an enterprise setting.
- Model inference alone is not enough; knowledge retrieval and RAG capabilities are also required.
- Nearly any application needs composable multi-step workflows.
- Without monitoring, observability, and evaluation, you are operating in the dark.
### Lack of Flexibility and Choice
- Directly integrating with multiple providers creates tight coupling.
- Different providers have different APIs and abstractions.
- Changing providers requires significant code changes.
## Our Solution: A Universal Stack
Llama Stack addresses these challenges through a service-oriented, API-first approach:
### Develop Anywhere, Deploy Everywhere
- Start locally with CPU-only setups
- Move to GPU acceleration when needed
- Deploy to cloud or edge without code changes
- Same APIs and developer experience everywhere, as illustrated in the sketch below
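To make "same APIs everywhere" concrete, here is a minimal sketch using the `llama-stack-client` Python package against a locally running distribution. The port (8321 is the common default) and the model ID are illustrative and depend on your setup; pointing `base_url` at a remote distribution requires no other change to the code.

```python
from llama_stack_client import LlamaStackClient

# Point the client at any Llama Stack distribution: a laptop, a GPU box,
# or a hosted endpoint. The application code below stays the same.
client = LlamaStackClient(base_url="http://localhost:8321")

# Model ID is illustrative; use one registered with your distribution.
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",
    messages=[{"role": "user", "content": "What is Llama Stack?"}],
)
print(response.completion_message.content)
```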
### Production-Ready Building Blocks
- Pre-built safety guardrails and content filtering (see the sketch after this list)
- Built-in RAG and agent capabilities
- Comprehensive evaluation toolkit
- Full observability and monitoring
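For example, the safety API exposes pre-configured shields through a single call. The following is a minimal sketch, assuming the same `llama-stack-client` setup as above; the shield ID is illustrative and depends on what your distribution registers.

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Run a configured safety shield over a message before it reaches the model.
result = client.safety.run_shield(
    shield_id="meta-llama/Llama-Guard-3-8B",  # illustrative; set per distribution
    messages=[{"role": "user", "content": "User input to screen goes here."}],
    params={},
)
if result.violation:
    print("Blocked:", result.violation.user_message)
else:
    print("Input passed the shield.")
```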
### True Provider Independence
- Swap providers without application changes, as sketched below
- Mix and match best-in-class implementations
- Federation and fallback support
- No vendor lock-in
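A minimal sketch of what provider independence looks like in application code, assuming two distributions are reachable at the (illustrative) URLs below. The providers behind each endpoint are server-side configuration, so the calling code never changes.

```python
from llama_stack_client import LlamaStackClient

def summarize(client: LlamaStackClient, model_id: str, text: str) -> str:
    """The application logic: identical regardless of which provider serves it."""
    response = client.inference.chat_completion(
        model_id=model_id,
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.completion_message.content

# A local distribution and a hosted one; both URLs are illustrative.
local = LlamaStackClient(base_url="http://localhost:8321")
hosted = LlamaStackClient(base_url="https://my-hosted-stack.example.com")

for client in (local, hosted):
    print(summarize(client, "meta-llama/Llama-3.1-8B-Instruct",
                    "Llama Stack decouples applications from providers."))
```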
## Our Philosophy
- **Service-Oriented**: REST APIs enforce clean interfaces and enable seamless transitions across different environments.
- **Composability**: Every component is independent but works together seamlessly.
- **Production Ready**: Built for real-world applications, not just demos.
- **Turnkey Solutions**: Easy-to-deploy, built-in solutions for popular deployment scenarios.
- **Llama First**: Explicit focus on Meta's Llama models and the partner ecosystem.
With Llama Stack, you can focus on building your application while we handle the infrastructure complexity, essential capabilities, and provider integrations.