mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

ehhuang c5e2e269e2 feat(api): introduce /rerank (#2940)	2025-08-21 18:23:16 -07:00

# What does this PR do?

Context: https://github.com/meta-llama/llama-stack/issues/2937

The API design is inspired by existing offerings, but not exactly the same:

* `top_n` as the parameter to control number of results, instead of `top_k`, since `n` is conventional to control number
* `truncation` bool instead of `max_token_per_doc`, since we should just handle the truncation automatically depending on model capability, instead of user setting the context length manually.
* `data` field in the response, to be consistent with other OpenAI APIs (though they don't have a rerank API). Also, it is one less name to learn in the API.

## Test Plan

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
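The three design choices in the commit message above (`top_n`, a boolean `truncation`, and results returned under `data`) can be illustrated with a small, self-contained sketch. This is not the Llama Stack implementation or its client API: the `rerank` function, the `RerankItem` fields (`index`, `relevance_score`), the `max_chars` limit standing in for "model capability", and the word-overlap scorer are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RerankItem:
    # Hypothetical field names, chosen for this sketch only.
    index: int              # position of the document in the input list
    relevance_score: float  # higher means more relevant


@dataclass
class RerankResponse:
    # Results are returned under `data`, mirroring the commit message.
    data: List[RerankItem]


def overlap_score(query: str, document: str) -> float:
    """Toy scorer: number of distinct query words that appear in the document."""
    query_words = set(query.lower().split())
    return float(len(query_words & set(document.lower().split())))


def rerank(
    query: str,
    documents: List[str],
    score_fn: Callable[[str, str], float],
    top_n: Optional[int] = None,
    truncation: bool = True,
    max_chars: int = 512,  # assumption: stands in for the model's context limit
) -> RerankResponse:
    """Score every document against the query and return the best `top_n`."""
    items = []
    for i, doc in enumerate(documents):
        # With truncation on, the caller never sets a per-document length;
        # the limit is decided on the serving side.
        text = doc[:max_chars] if truncation else doc
        items.append(RerankItem(index=i, relevance_score=score_fn(query, text)))

    # Stable sort, highest score first; ties keep input order.
    items.sort(key=lambda item: -item.relevance_score)
    if top_n is not None:
        items = items[:top_n]
    return RerankResponse(data=items)


docs = ["how to bake bread", "llama stack docs build guide", "llama facts"]
response = rerank("llama stack docs", docs, overlap_score, top_n=2)
for item in response.data:
    print(item.index, item.relevance_score, docs[item.index])
```

In this sketch the caller only says how many results it wants (`top_n`) and whether over-long documents may be clipped (`truncation`); the actual length limit is a server-side detail, which is the trade-off the commit message argues for.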
_static	feat(api): introduce /rerank (#2940)	2025-08-21 18:23:16 -07:00
notebooks	chore: rename templates to distributions (#3035)	2025-08-04 11:34:17 -07:00
openapi_generator	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
resources	Several documentation fixes and fix link to API reference	2025-02-04 14:00:43 -08:00
source	feat: Remove initialize() Method from LlamaStackAsLibrary (#2979)	2025-08-21 15:59:04 -07:00
zero_to_hero_guide	chore: rename templates to distributions (#3035)	2025-08-04 11:34:17 -07:00
conftest.py	fix: sleep after notebook test	2025-03-23 14:03:35 -07:00
contbuild.sh	Fix broken links with docs	2024-11-22 20:42:17 -08:00
dog.jpg	Support for Llama3.2 models and Swift SDK (#98)	2024-09-25 10:29:58 -07:00
getting_started.ipynb	chore: rename templates to distributions (#3035)	2025-08-04 11:34:17 -07:00
getting_started_llama4.ipynb	chore: rename templates to distributions (#3035)	2025-08-04 11:34:17 -07:00
getting_started_llama_api.ipynb	chore: rename templates to distributions (#3035)	2025-08-04 11:34:17 -07:00
license_header.txt	Initial commit	2024-07-23 08:32:33 -07:00
make.bat	feat(pre-commit): enhance pre-commit hooks with additional checks (#2014)	2025-04-30 11:35:49 -07:00
Makefile	first version of readthedocs (#278)	2024-10-22 10:15:58 +05:30
original_rfc.md	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
quick_start.ipynb	chore: rename templates to distributions (#3035)	2025-08-04 11:34:17 -07:00
README.md	feat: add auto-generated CI documentation pre-commit hook (#2890)	2025-07-25 17:57:01 +02:00

README.md

Llama Stack Documentation

Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our ReadTheDocs page.

Render locally

From the llama-stack root directory, run the following command to render the docs locally:

uv run --group docs sphinx-autobuild docs/source docs/build/html --write-all

Once the build finishes, open the docs in your browser at http://localhost:8000.
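Because the command uses the relative paths `docs/source` and `docs/build/html`, it only works from the repository root. A minimal sketch of a wrapper that checks this before launching the build is shown below; the helper names `looks_like_repo_root` and `serve_docs` are hypothetical and not part of the repository, and the only check made is that a `docs/source` directory exists.

```python
import subprocess
from pathlib import Path
from typing import Optional

# The README command, split into argv form.
DOCS_COMMAND = [
    "uv", "run", "--group", "docs",
    "sphinx-autobuild", "docs/source", "docs/build/html", "--write-all",
]


def looks_like_repo_root(path: Path) -> bool:
    """Heuristic: the repo root is a directory that contains docs/source."""
    return (path / "docs" / "source").is_dir()


def serve_docs(root: Optional[Path] = None) -> None:
    """Run the docs build from `root` (default: current directory)."""
    root = Path.cwd() if root is None else root
    if not looks_like_repo_root(root):
        raise FileNotFoundError(
            f"{root} has no docs/source directory; run from the llama-stack root"
        )
    # Blocks until the autobuild server is stopped (for example with Ctrl+C).
    subprocess.run(DOCS_COMMAND, cwd=root, check=True)
```

Calling `serve_docs()` from the wrong directory then fails immediately with a clear message instead of a Sphinx path error.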

Content

Try out Llama Stack's capabilities through our detailed Jupyter notebooks:

Building AI Applications Notebook - A comprehensive guide to building production-ready AI applications using Llama Stack
Benchmark Evaluations Notebook - Detailed performance evaluations and benchmarking results
Zero-to-Hero Guide - Step-by-step guide for getting started with Llama Stack