
Llama Stack Documentation

Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our GitHub page.

Render locally

From the llama-stack docs/ directory, run the following commands to render the docs locally:

npm install
npm run gen-api-docs all
npm run build
npm run serve

Once the server is running, open the docs in your browser at http://localhost:3000
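
The four commands above are meant to run in order, and a later step is not useful if an earlier one fails. As a minimal sketch of that sequencing, assuming `npm` is on your `PATH` and the script is started from the repository root, the steps could be wrapped like this. The names `RENDER_STEPS`, `run_steps`, and the `runner` parameter are hypothetical helpers for illustration and are not part of Llama Stack.

```python
import subprocess
from typing import Callable, Sequence

# The four commands from the README, in the order they are listed.
RENDER_STEPS: list[list[str]] = [
    ["npm", "install"],
    ["npm", "run", "gen-api-docs", "all"],
    ["npm", "run", "build"],
    ["npm", "run", "serve"],
]


def run_steps(
    steps: Sequence[Sequence[str]],
    docs_dir: str = "docs",
    runner: Callable = subprocess.run,
) -> int:
    """Run each command inside docs_dir and stop at the first failure.

    Returns the number of steps that finished with exit code 0.
    `runner` defaults to subprocess.run; it is injectable (an assumption
    made here purely so the sequencing can be exercised without npm).
    """
    completed = 0
    for cmd in steps:
        result = runner(list(cmd), cwd=docs_dir)
        if result.returncode != 0:
            break  # do not attempt later steps after a failure
        completed += 1
    return completed


# Example call (not executed here; the final "serve" step is expected
# to keep running until it is interrupted):
#   run_steps(RENDER_STEPS, docs_dir="docs")
```

Running the commands by hand, as shown in the README, is equivalent; the wrapper only makes the stop-on-first-failure behavior explicit.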

Content

Try out Llama Stack's capabilities through our detailed Jupyter notebooks: