docs: Reorganize documentation on the webpage (#2651)

# What does this PR do?

Reorganizes the Llama Stack webpage into more concise index pages, introduces more of a workflow, and reduces repetition of content.

The new nav structure is so far based on #2637. Further discussion: https://github.com/meta-llama/llama-stack/discussions/2585

**Preview:**

![Screenshot 2025-07-09 at 2 31 53 PM](https://github.com/user-attachments/assets/4c1f3845-b328-4f12-9f20-3f09375007af)

You can also build a full preview locally.

**Feedback**

Looking for feedback on page titles and on the new structure in general.

**Follow up documentation**

I plan on reducing some sections and standardizing some terminology in a follow-up PR. More discussion on that: https://github.com/meta-llama/llama-stack/discussions/2585
2025-07-18 10:52:28 +00:00 · 2025-07-15 17:19:35 -04:00 · 2025-07-15 17:19:35 -04:00 · b096794959
commit b096794959
parent e1755d1ed2
34 changed files with 487 additions and 249 deletions
--- a/docs/source/getting_started/detailed_tutorial.md
+++ b/docs/source/getting_started/detailed_tutorial.md
@@ -1,4 +1,4 @@
-# Detailed Tutorial
+## Detailed Tutorial

 In this guide, we'll walk through how you can use the Llama Stack (server and client SDK) to test a simple agent.
 A Llama Stack agent is a simple integrated system that can perform tasks by combining a Llama model for reasoning with
@@ -10,7 +10,7 @@ Llama Stack is a stateful service with REST APIs to support seamless transition
 In this guide, we'll walk through how to build a RAG agent locally using Llama Stack with [Ollama](https://ollama.com/)
 as the inference [provider](../providers/index.md#inference) for a Llama Model.

-## Step 1: Installation and Setup
+### Step 1: Installation and Setup

 Install Ollama by following the instructions on the [Ollama website](https://ollama.com/download), then
 download Llama 3.2 3B model, and then start the Ollama service.
@@ -45,7 +45,7 @@ Setup your virtual environment.
 uv sync --python 3.12
 source .venv/bin/activate
 ```
-## Step 2:  Run Llama Stack
+### Step 2:  Run Llama Stack
 Llama Stack is a server that exposes multiple APIs, you connect with it using the Llama Stack client SDK.

 ::::{tab-set}
@@ -132,7 +132,7 @@ Now you can use the Llama Stack client to run inference and build agents!
 You can reuse the server setup or use the [Llama Stack Client](https://github.com/meta-llama/llama-stack-client-python/).
 Note that the client package is already included in the `llama-stack` package.

-## Step 3: Run Client CLI
+### Step 3: Run Client CLI

 Open a new terminal and navigate to the same directory you started the server from. Then set up a new or activate your
 existing server virtual environment.
@@ -232,7 +232,7 @@ OpenAIChatCompletion(
 )
 ```

-## Step 4: Run the Demos
+### Step 4: Run the Demos

 Note that these demos show the [Python Client SDK](../references/python_sdk_reference/index.md).
 Other SDKs are also available, please refer to the [Client SDK](../index.md#client-sdks) list for the complete options.
@@ -242,7 +242,7 @@ Other SDKs are also available, please refer to the [Client SDK](../index.md#clie
 :::{tab-item} Basic Inference
 Now you can run inference using the Llama Stack client SDK.

-### i. Create the Script
+#### i. Create the Script

 Create a file `inference.py` and add the following code:
 ```python
@@ -269,7 +269,7 @@ response = client.chat.completions.create(
 print(response)
 ```

-### ii. Run the Script
+#### ii. Run the Script
 Let's run the script using `uv`
 ```bash
 uv run python inference.py
@@ -283,7 +283,7 @@ OpenAIChatCompletion(id='chatcmpl-30cd0f28-a2ad-4b6d-934b-13707fc60ebf', choices

 :::{tab-item} Build a Simple Agent
 Next we can move beyond simple inference and build an agent that can perform tasks using the Llama Stack server.
-### i. Create the Script
+#### i. Create the Script
 Create a file `agent.py` and add the following code:

 ```python
@@ -455,7 +455,7 @@ uv run python agent.py

 For our last demo, we can build a RAG agent that can answer questions about the Torchtune project using the documents
 in a vector database.
-### i. Create the Script
+#### i. Create the Script
 Create a file `rag_agent.py` and add the following code:

 ```python
@@ -533,7 +533,7 @@ for t in turns:
    for event in AgentEventLogger().log(stream):
        event.print()
 ```
-### ii. Run the Script
+#### ii. Run the Script
 Let's run the script using `uv`
 ```bash
 uv run python rag_agent.py