llama-stack-mirror/docs/source/introduction/index.md
Hardik Shah — Update Documentation (#838), 2025-01-22

# Why Llama Stack?

Building production AI applications today requires solving multiple challenges:

## Infrastructure Complexity

- Running large language models efficiently requires specialized infrastructure.
- Different deployment scenarios (local development, cloud, edge) need different solutions.
- Moving from development to production often requires significant rework.

## Essential Capabilities

- Safety guardrails and content filtering are necessary in an enterprise setting.
- Model inference alone is not enough: knowledge retrieval and RAG capabilities are required.
- Nearly any application needs composable multi-step workflows.
- Without monitoring, observability, and evaluation, you are operating in the dark.

## Lack of Flexibility and Choice

- Directly integrating with multiple providers creates tight coupling.
- Different providers have different APIs and abstractions.
- Changing providers requires significant code changes.

## Our Solution: A Universal Stack

*(Image: Llama Stack, 400px wide)*

Llama Stack addresses these challenges through a service-oriented, API-first approach:

### Develop Anywhere, Deploy Everywhere

- Start locally with CPU-only setups
- Move to GPU acceleration when needed
- Deploy to cloud or edge without code changes
- Same APIs and developer experience everywhere

### Production-Ready Building Blocks

- Pre-built safety guardrails and content filtering
- Built-in RAG and agent capabilities
- Comprehensive evaluation toolkit
- Full observability and monitoring

### True Provider Independence

- Swap providers without application changes
- Mix and match best-in-class implementations
- Federation and fallback support
- No vendor lock-in
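The provider-independence idea is essentially dependency inversion: application code targets a stable interface while configuration selects the backend. A minimal Python sketch of that pattern (hypothetical names throughout — this is not the actual llama-stack implementation) looks like:

```python
from typing import Protocol


class InferenceProvider(Protocol):
    """Stable interface the application codes against (illustrative only)."""

    def chat(self, prompt: str) -> str: ...


class LocalProvider:
    """Stand-in for a local (e.g. CPU-only) backend."""

    def chat(self, prompt: str) -> str:
        return f"[local] echo: {prompt}"


class CloudProvider:
    """Stand-in for a hosted backend with the same interface."""

    def chat(self, prompt: str) -> str:
        return f"[cloud] echo: {prompt}"


# Provider selection lives in configuration, not in application code.
PROVIDERS = {"local": LocalProvider, "cloud": CloudProvider}


def build_provider(name: str) -> InferenceProvider:
    return PROVIDERS[name]()


# Application code is identical whichever provider the config names.
provider = build_provider("local")
print(provider.chat("hello"))  # prints "[local] echo: hello"
```

Swapping `"local"` for `"cloud"` changes the backend without touching the call sites — the same property Llama Stack's provider registry gives you at the level of REST APIs.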

## Our Philosophy

- **Service-Oriented**: REST APIs enforce clean interfaces and enable seamless transitions across different environments.
- **Composability**: Every component is independent but works together with the others seamlessly.
- **Production Ready**: Built for real-world applications, not just demos.
- **Turnkey Solutions**: Easy-to-deploy, built-in solutions for popular deployment scenarios.
- **Llama First**: Explicit focus on Meta's Llama models and the partner ecosystem.

With Llama Stack, you can focus on building your application while we handle the infrastructure complexity, essential capabilities, and provider integrations.