mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 18:00:36 +00:00

Ashwin Bharambe 7c63aebd64 feat(responses)!: add reasoning and annotation added events (#3793)	2025-10-11 16:47:14 -07:00

Implements missing streaming events from OpenAI Responses API spec:

- reasoning text/summary events for o1/o3 models,
- refusal events for safety moderation
- annotation events for citations,
- and file search streaming events.

Added optional reasoning_content field to chat completion chunks to support non-standard provider extensions.

NOTE: OpenAI does _not_ fill reasoning_content when users use the chat_completion APIs. This means there is no way for us to implement Responses (with reasoning) by using OpenAI chat completions! We'd need to transparently punt to OpenAI's responses endpoints if we wish to do that. For others though (vLLM, etc.) we can use it.

## Test Plan

File search streaming test passes:

```
./scripts/integration-tests.sh --stack-config server:ci-tests \
  --suite responses --setup gpt --inference-mode replay --pattern test_response_file_search_streaming_events
```

Need more complex setup and validation for reasoning tests (need a vLLM powered OSS model maybe gpt-oss which can return reasoning_content). I will do that in a followup PR.

docs	feat: use SecretStr for inference provider auth credentials (#3724)	2025-10-10 07:32:50 -07:00
notebooks	fix: update dangling references to llama download command (#3763)	2025-10-09 18:35:02 -07:00
openapi_generator	chore: refactor (chat)completions endpoints to use shared params struct (#3761)	2025-10-10 15:46:34 -07:00
src	docs: api separation (#3630)	2025-10-01 10:13:31 -07:00
static	feat(responses)!: add reasoning and annotation added events (#3793)	2025-10-11 16:47:14 -07:00
supplementary	docs: adding supplementary markdown content to API specs (#3632)	2025-10-01 10:15:30 -07:00
zero_to_hero_guide	chore!: remove --env from `llama stack run` (#3711)	2025-10-07 20:58:15 -07:00
docusaurus.config.ts	docs: add favicon and mobile styling (#3650)	2025-10-02 10:42:54 +02:00
dog.jpg	Support for Llama3.2 models and Swift SDK (#98)	2024-09-25 10:29:58 -07:00
getting_started.ipynb	chore!: remove --env from `llama stack run` (#3711)	2025-10-07 20:58:15 -07:00
getting_started_llama4.ipynb	chore!: remove model mgmt from CLI for Hugging Face CLI (#3700)	2025-10-09 16:50:33 -07:00
getting_started_llama_api.ipynb	chore!: remove --env from `llama stack run` (#3711)	2025-10-07 20:58:15 -07:00
license_header.txt	Initial commit	2024-07-23 08:32:33 -07:00
original_rfc.md	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
package-lock.json	docs: docusaurus setup (#3541)	2025-09-24 14:11:30 -07:00
package.json	docs: docusaurus setup (#3541)	2025-09-24 14:11:30 -07:00
quick_start.ipynb	chore!: remove --env from `llama stack run` (#3711)	2025-10-07 20:58:15 -07:00
README.md	docs: docusaurus setup (#3541)	2025-09-24 14:11:30 -07:00
sidebars.ts	docs: Update docs navbar config (#3653)	2025-10-02 16:48:38 +02:00
tsconfig.json	docs: docusaurus setup (#3541)	2025-09-24 14:11:30 -07:00

README.md

Llama Stack Documentation

This directory collects guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our GitHub page.

Render locally

From the llama-stack docs/ directory, run the following commands to render the docs locally:

npm install
npm run gen-api-docs all
npm run build
npm run serve

Once the server is running, open the docs in your browser at http://localhost:3000.
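
If you run these steps often, they could be wrapped in a small shell helper. The following is only a sketch: the function names `run` and `render_docs` and the `DRY_RUN` variable are hypothetical and not part of the llama-stack repository. It chains the four commands above so that a failure in one step stops the rest.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the llama-stack repository.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run the documented steps in order, stopping at the first failure.
render_docs() {
  run npm install &&
  run npm run gen-api-docs all &&
  run npm run build &&
  run npm run serve
}
```

After sourcing the file from the docs/ directory, `render_docs` runs the steps, and `DRY_RUN=1 render_docs` only prints the four commands, each prefixed with `+ `, without executing them.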

Content

Try out Llama Stack's capabilities through our detailed Jupyter notebooks:

Building AI Applications Notebook - A comprehensive guide to building production-ready AI applications using Llama Stack
Benchmark Evaluations Notebook - Detailed performance evaluations and benchmarking results
Zero-to-Hero Guide - Step-by-step guide for getting started with Llama Stack