mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 12:07:34 +00:00

Charlie Doern 49b729b30a feat: api level request metrics via middleware

Add RequestMetricsMiddleware, which tracks key metrics for each request the LLS server receives:

1. llama_stack_requests_total: tracks the total number of requests the server has processed
2. llama_stack_request_duration_seconds: tracks the duration of each request
3. llama_stack_concurrent_requests: tracks the requests the server is processing concurrently

Using a middleware allows this to be done at the server level without adding custom handling to each router, as the inference router has today for its API-specific metrics.

Also add some unit tests for this functionality.

resolves #2597

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-08-03 13:14:25 -04:00
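
As a rough illustration of the idea in this commit message, the sketch below shows how a single ASGI middleware could collect the three metrics at the server level. It is not the code from the commit: the class name RequestMetricsSketch, the in-memory counters, the (method, path, status) key, and the /v1/health path are all assumptions made for the example, and a real implementation would presumably export the values through a metrics backend rather than keep them in Python attributes.

```python
import asyncio
import time
from collections import defaultdict


class RequestMetricsSketch:
    """Hypothetical pure-ASGI middleware recording three request metrics in memory."""

    def __init__(self, app):
        self.app = app
        # Stand-in for llama_stack_requests_total, keyed by (method, path, status).
        self.requests_total = defaultdict(int)
        # Stand-in for llama_stack_request_duration_seconds: one entry per request.
        self.durations = []
        # Stand-in for llama_stack_concurrent_requests: requests currently in flight.
        self.concurrent = 0

    async def __call__(self, scope, receive, send):
        # Only HTTP requests are measured; other ASGI scopes pass straight through.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        status = {"code": 500}  # assumed until the app starts a response

        async def send_wrapper(message):
            # Capture the status code as the response headers go out.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        self.concurrent += 1
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            # Runs on success and on error, so the gauge always returns to its prior value.
            self.concurrent -= 1
            self.durations.append(time.perf_counter() - start)
            self.requests_total[(scope["method"], scope["path"], status["code"])] += 1


async def demo_app(scope, receive, send):
    """Minimal ASGI app used only to exercise the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def drive(middleware, path):
    """Send one fake GET request through the middleware."""
    scope = {"type": "http", "method": "GET", "path": path}

    async def receive():
        return {"type": "http.request", "body": b""}

    async def send(message):
        pass  # discard the response; only the metrics matter here

    await middleware(scope, receive, send)


metrics = RequestMetricsSketch(demo_app)
asyncio.run(drive(metrics, "/v1/health"))
```

Because the wrapper sits around the whole application, every route is measured without per-router code, which is the design point the commit message makes.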
_static	fix: remove unused DPO parameters from schema and tests (#2988)	2025-07-31 09:11:08 -07:00
notebooks	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
openapi_generator	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
resources	Several documentation fixes and fix link to API reference	2025-02-04 14:00:43 -08:00
source	feat: api level request metrics via middleware	2025-08-03 13:14:25 -04:00
zero_to_hero_guide	refactor: remove Conda support from Llama Stack (#2969)	2025-08-02 15:52:59 -07:00
conftest.py	fix: sleep after notebook test	2025-03-23 14:03:35 -07:00
contbuild.sh	Fix broken links with docs	2024-11-22 20:42:17 -08:00
dog.jpg	Support for Llama3.2 models and Swift SDK (#98)	2024-09-25 10:29:58 -07:00
getting_started.ipynb	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
getting_started_llama4.ipynb	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
getting_started_llama_api.ipynb	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
license_header.txt	Initial commit	2024-07-23 08:32:33 -07:00
make.bat	feat(pre-commit): enhance pre-commit hooks with additional checks (#2014)	2025-04-30 11:35:49 -07:00
Makefile	first version of readthedocs (#278)	2024-10-22 10:15:58 +05:30
original_rfc.md	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
quick_start.ipynb	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
README.md	feat: add auto-generated CI documentation pre-commit hook (#2890)	2025-07-25 17:57:01 +02:00

README.md

Llama Stack Documentation

Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our ReadTheDocs page.

Render locally

From the llama-stack root directory, run the following command to render the docs locally:

uv run --group docs sphinx-autobuild docs/source docs/build/html --write-all

You can open up the docs in your browser at http://localhost:8000

Content

Try out Llama Stack's capabilities through our detailed Jupyter notebooks:

Building AI Applications Notebook - A comprehensive guide to building production-ready AI applications using Llama Stack
Benchmark Evaluations Notebook - Detailed performance evaluations and benchmarking results
Zero-to-Hero Guide - Step-by-step guide for getting started with Llama Stack