Bumps [actions/cache](https://github.com/actions/cache) from 4.3.0 to 5.0.1.

**Release notes** (sourced from [actions/cache's releases](https://github.com/actions/cache/releases)):

> [!IMPORTANT]
> `actions/cache@v5` runs on the Node.js 24 runtime and requires a minimum Actions Runner version of `2.327.1`. If you are using self-hosted runners, ensure they are updated before upgrading.

v5.0.1:

- fix: update `@actions/cache` for Node.js 24 punycode deprecation by [@salmanmkc](https://github.com/salmanmkc) in [actions/cache#1685](https://redirect.github.com/actions/cache/pull/1685)
- prepare release v5.0.1 by [@salmanmkc](https://github.com/salmanmkc) in [actions/cache#1686](https://redirect.github.com/actions/cache/pull/1686)

Full changelog: https://github.com/actions/cache/compare/v5...v5.0.1

v5.0.0:

- Upgrade to use node24 by [@salmanmkc](https://github.com/salmanmkc) in [actions/cache#1630](https://redirect.github.com/actions/cache/pull/1630)
- Prepare v5.0.0 release by [@salmanmkc](https://github.com/salmanmkc) in [actions/cache#1684](https://redirect.github.com/actions/cache/pull/1684)

Full changelog: https://github.com/actions/cache/compare/v4.3.0...v5.0.0

**Changelog** (sourced from [actions/cache's RELEASES.md](https://github.com/actions/cache/blob/main/RELEASES.md)):

- 5.0.1: Update `@azure/storage-blob` to `^12.29.1` via `@actions/cache@5.0.1` ([#1685](https://redirect.github.com/actions/cache/pull/1685))
- 5.0.0: Runs on the Node.js 24 runtime and requires a minimum Actions Runner version of `2.327.1`; update self-hosted runners before upgrading
- 4.3.0: Bump `@actions/cache` to [v4.1.0](https://redirect.github.com/actions/toolkit/pull/2132)
- 4.2.4: Bump `@actions/cache` to v4.0.5
- 4.2.3: Bump `@actions/cache` to v4.0.3 (obfuscates SAS token in debug logs for cache entries)
- 4.2.2: Bump `@actions/cache` to v4.0.2
- 4.2.1: Bump `@actions/cache` to v4.0.1
- 4.2.0: TL;DR: the cache backend service has been rewritten from the ground up for improved performance and reliability, and actions/cache now integrates with the new cache service (v2) APIs. The new service will gradually roll out as of **February 1st, 2025**, and the legacy service will be sunset on the same date; the changes in this release are **fully backward compatible**. Some versions of this action are being deprecated: upgrade to `v4` or `v3` before **February 1st, 2025** (if you pin SHAs, use the SHAs of `v4.2.0` or `v3.4.0`). If you do not upgrade, all workflow runs using a deprecated version of actions/cache will fail; upgrading to the recommended versions will not break your workflows.
- 4.1.2: … (truncated)

**Commits**: … (truncated)
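For context, this is what consuming the bump looks like in a workflow. A minimal, hypothetical sketch (the job layout, cache path, and key are illustrative and not taken from this repository's workflows):

```yaml
# Hypothetical workflow step pinning the new major version. actions/cache@v5
# runs on Node.js 24 and needs Actions Runner >= 2.327.1 on self-hosted runners.
name: ci
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v5   # previously actions/cache@v4.3.0
        with:
          path: ~/.cache/example   # illustrative cache path
          key: ${{ runner.os }}-example-${{ hashFiles('**/lockfile') }}
          restore-keys: |
            ${{ runner.os }}-example-
```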
# Llama Stack
Quick Start | Documentation | Colab Notebook | Discord
## 🚀 One-Line Installer 🚀
To try Llama Stack locally, run:
```bash
curl -LsSf https://github.com/llamastack/llama-stack/raw/main/scripts/install.sh | bash
```
## Overview
Llama Stack defines and standardizes the core building blocks that simplify AI application development. It provides a unified set of APIs with implementations from leading service providers. More specifically, it provides:
- Unified API layer for Inference, RAG, Agents, Tools, Safety, and Evals (see the client sketch after this list).
- Plugin architecture to support the rich ecosystem of different API implementations in various environments, including local development, on-premises, cloud, and mobile.
- Prepackaged verified distributions which offer a one-stop solution for developers to get started quickly and reliably in any environment.
- Multiple developer interfaces like CLI and SDKs for Python, TypeScript, iOS, and Android.
- Standalone applications as examples for how to build production-grade AI applications with Llama Stack.
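To make the unified API layer concrete, here is a minimal inference sketch using the `llama-stack-client` Python SDK. The server URL (Llama Stack's common default port 8321) and the model identifier are assumptions for illustration, and method names can differ between SDK releases:

```python
# A minimal sketch of calling the unified Inference API from Python.
# Assumptions: a Llama Stack server is running on localhost:8321 and a
# model with this identifier is registered; adjust both for your setup.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-3B-Instruct",  # placeholder model id
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.completion_message.content)
```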
### Llama Stack Benefits
- Flexibility: Developers can choose their preferred infrastructure without changing APIs and enjoy flexible deployment choices.
- Consistent Experience: With its unified APIs, Llama Stack makes it easier to build, test, and deploy AI applications with consistent application behavior.
- Robust Ecosystem: Llama Stack is integrated with distribution partners (cloud providers, hardware vendors, and AI-focused companies) that offer tailored infrastructure, software, and services for deploying Llama models.
For more information, see the Benefits of Llama Stack documentation.
## API Providers
Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack. Please check out our documentation for the full list.
| API Provider | Environments | Agents | Inference | VectorIO | Safety | Post Training | Eval | DatasetIO |
|---|---|---|---|---|---|---|---|---|
| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| SambaNova | Hosted | | ✅ | | ✅ | | | |
| Cerebras | Hosted | | ✅ | | | | | |
| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | |
| AWS Bedrock | Hosted | | ✅ | | ✅ | | | |
| Together | Hosted | ✅ | ✅ | | ✅ | | | |
| Groq | Hosted | | ✅ | | | | | |
| Ollama | Single Node | | ✅ | | | | | |
| TGI | Hosted/Single Node | | ✅ | | | | | |
| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | |
| ChromaDB | Hosted/Single Node | | | ✅ | | | | |
| Milvus | Hosted/Single Node | | | ✅ | | | | |
| Qdrant | Hosted/Single Node | | | ✅ | | | | |
| Weaviate | Hosted/Single Node | | | ✅ | | | | |
| SQLite-vec | Single Node | | | ✅ | | | | |
| PG Vector | Single Node | | | ✅ | | | | |
| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | |
| vLLM | Single Node | | ✅ | | | | | |
| OpenAI | Hosted | | ✅ | | | | | |
| Anthropic | Hosted | | ✅ | | | | | |
| Gemini | Hosted | | ✅ | | | | | |
| WatsonX | Hosted | | ✅ | | | | | |
| HuggingFace | Single Node | | | | | ✅ | | ✅ |
| TorchTune | Single Node | | | | | ✅ | | |
| NVIDIA NEMO | Hosted | | ✅ | ✅ | | ✅ | ✅ | ✅ |
| NVIDIA | Hosted | | | | | ✅ | ✅ | ✅ |
> **Note**: Additional providers are available through external packages. See the External Providers documentation.
## Distributions
A Llama Stack Distribution (or "distro") is a pre-configured bundle of provider implementations, one for each API component. Distributions make it easy to get started with a specific deployment scenario: for example, you can begin with a local Ollama setup and seamlessly transition to production with Fireworks, without changing your application code (see the sketch below the table). Here are some of the distributions we support:
| Distribution | Llama Stack Docker | Start This Distribution |
|---|---|---|
| Starter Distribution | llamastack/distribution-starter | Guide |
| Meta Reference | llamastack/distribution-meta-reference-gpu | Guide |
| PostgreSQL | llamastack/distribution-postgres-demo | |
For full documentation on the Llama Stack distributions see the Distributions Overview page.
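As noted above, switching distributions should not require application changes; only the endpoint the client talks to differs. A hedged sketch (the environment variable name and URLs are illustrative, not a Llama Stack convention):

```python
# The application code is identical across distributions; only the endpoint
# changes. LLAMA_STACK_URL is an illustrative variable name.
import os

from llama_stack_client import LlamaStackClient

base_url = os.environ.get("LLAMA_STACK_URL", "http://localhost:8321")
client = LlamaStackClient(base_url=base_url)

# LLAMA_STACK_URL=http://localhost:8321    -> local distro (e.g. Ollama-backed)
# LLAMA_STACK_URL=https://prod.example.com -> hosted distro (e.g. Fireworks-backed)
```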
## Documentation
Please check out our Documentation page for more details.
- CLI references
  - llama (server-side) CLI Reference: guide for using the `llama` CLI to work with Llama models (download, study prompts) and to build/start a Llama Stack distribution.
  - llama (client-side) CLI Reference: guide for using the `llama-stack-client` CLI, which allows you to query information about the distribution (see the programmatic sketch after this list).
- Getting Started
  - Quick guide to start a Llama Stack server.
  - Jupyter notebook to walk through how to use simple text and vision inference `llama_stack_client` APIs.
  - The complete Llama Stack lesson Colab notebook of the new Llama 3.2 course on Deeplearning.ai.
  - A Zero-to-Hero Guide that guides you through all the key components of Llama Stack, with code samples.
- Contributing
  - Adding a new API Provider: a walkthrough of how to add a new API provider.
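As a programmatic complement to the `llama-stack-client` CLI mentioned above, the same distribution metadata can be queried from Python; a hedged sketch (the local URL is an assumption, and listing methods may differ across SDK versions):

```python
# Query a running distribution for the models it serves; roughly analogous
# to `llama-stack-client models list` on the command line.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server

for model in client.models.list():
    print(model.identifier)
```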
## Llama Stack Client SDKs
Check out our client SDKs for connecting to a Llama Stack server in your preferred language.
| Language | Client SDK |
|---|---|
| Python | llama-stack-client-python |
| Swift | llama-stack-client-swift |
| TypeScript | llama-stack-client-typescript |
| Kotlin | llama-stack-client-kotlin |
You can find more example scripts with client SDKs to talk with the Llama Stack server in our llama-stack-apps repo.
## 🌟 GitHub Star History
## ✨ Contributors
Thanks to all of our amazing contributors!