Update cli_reference.md

Made it easier to follow along with numbered steps
raghotham 2024-09-04 18:50:10 -07:00 committed by GitHub
parent bfee50aa83
commit 225cd75074


@@ -5,7 +5,7 @@ The `llama` CLI tool helps you setup and use the Llama toolchain & agentic syste
### Subcommands
1. `download`: The `llama` CLI supports downloading models from Meta or Hugging Face (see the example below this list).
2. `model`: Lists available models and their properties.
3. `stack`: Allows you to build and run a Llama Stack server. You can read more about this [here](https://github.com/meta-llama/llama-stack/blob/api_updates_1/docs/cli_reference.md#step-3-building-configuring-and-running-llama-stack-servers).
3. `stack`: Allows you to build and run a Llama Stack server. You can read more about this [here](/docs/cli_reference.md#step-3-building-configuring-and-running-llama-stack-servers).
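For example, a `download` invocation looks roughly like the following. This is a hedged sketch: the model ID and flags shown here are assumptions and may differ between versions, so check `llama download --help` for the options your install actually supports.
```
# Illustrative only -- flags and model IDs are assumptions; verify with `llama download --help`.
# Download from Meta (requires a signed URL obtained from llama.meta.com):
llama download --source meta --model-id Meta-Llama3.1-8B-Instruct --meta-url <SIGNED_URL>

# Or download from Hugging Face (requires a token with access to the repo):
llama download --source huggingface --model-id Meta-Llama3.1-8B-Instruct --hf-token <HF_TOKEN>
```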
### Sample Usage
@@ -13,7 +13,7 @@ The `llama` CLI tool helps you setup and use the Llama toolchain & agentic syste
llama --help
```
<pre style="font-family: monospace;">
usage: llama [-h] {download,model,stack,api} ...
usage: llama [-h] {download,model,stack} ...
Welcome to the Llama CLI
@@ -21,7 +21,7 @@ options:
-h, --help show this help message and exit
subcommands:
{download,model,stack,api}
{download,model,stack}
</pre>
## Step 1. Get the models
@@ -236,28 +236,13 @@ These commands can help understand the model interface and how prompts / message
**NOTE**: Outputs in the terminal are color-printed to show special tokens.
## Step 3: Building, Configuring and Running Llama Stack servers
## Step 3: Listing, Building, and Configuring Llama Stack Distributions
An agentic app has several components, including model inference, tool execution, and system safety shields. Running all these components is made simpler (we hope!) with Llama Stack Distributions.
The Llama Stack is a collection of REST APIs. An API is _implemented_ by a Provider. An assembly of Providers together provides the implementation for the Stack -- this package is called a Distribution.
As an example, by running a simple command `llama stack run`, you can bring up a server serving the following endpoints, among others:
```
POST /inference/chat_completion
POST /inference/completion
POST /safety/run_shields
POST /agentic_system/create
POST /agentic_system/session/create
POST /agentic_system/turn/create
POST /agentic_system/delete
```
The agentic app can now simply point to this server to execute all its needed components.
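To make this concrete, here is a hedged sketch of what calling one of these endpoints could look like once a server is up (see Step 4). The request body shown is an assumption and may not match the exact schema of your version; the model name is also illustrative.
```
# Sketch only: assumes a server on localhost:5000 and a JSON body with
# "model" and "messages" fields -- consult the API schema for the exact shape.
curl -X POST http://localhost:5000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Meta-Llama3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello! What can you do?"}],
    "stream": false
  }'
```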
Let's build, configure, and start a Llama Stack server specified via a "Distribution ID" to understand more!
### Step 3.1: List available distributions
Let's start by listing the available distributions:
```
llama stack list-distributions
```
@@ -305,9 +290,7 @@ +--------------------------------+---------------------------------------+-----
As you can see above, each “distribution” details the “providers” it is composed of. For example, `local` uses the “meta-reference” provider for inference, while `local-ollama` relies on a different provider (Ollama). Similarly, you can use Fireworks or Together.AI for running inference as well.
To install a distribution, we run a simple command providing two inputs:
- **Distribution Id** of the distribution that we want to install (as obtained from the `list-distributions` command)
- A **Name** for the specific build and configuration of this distribution.
### Step 3.2: Build a distribution
Let's imagine you are working with an 8B-Instruct model. The following command will build a package (in the form of a Conda environment) _and_ configure it. As part of the configuration, you will be asked for some inputs (model_id, max_seq_len, etc.). Since we are working with an 8B model, we will name our build `8b-instruct` to help us remember the config.
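As a hedged sketch, the build invocation could look like the following, assuming a syntax of `llama stack build <distribution-id> --name <build-name>`; verify the exact arguments with `llama stack build --help`.
```
# Assumed syntax: distribution ID "local" (from list-distributions) plus a build name.
llama stack build local --name 8b-instruct
```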
@@ -330,6 +313,7 @@ Successfully setup conda environment. Configuring build...
YAML configuration has been written to ~/.llama/builds/local/conda/8b-instruct.yaml
```
### Step 3.3: Configure a distribution
You can re-configure this distribution by running:
```
@@ -372,7 +356,9 @@ Note that all configurations as well as models are stored in `~/.llama`
## Step 4: Starting a Llama Stack Distribution and Testing it
Now let's start the Llama Stack server.
### Step 4.1: Starting a distribution
Now let's start the Llama Stack Distribution server.
You need the YAML configuration file that was written out at the end of the `llama stack build` step.
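As a hedged sketch, starting the server could look like this, assuming `llama stack run` accepts the YAML written by the build step plus a port; check `llama stack run --help` for the exact arguments in your version.
```
# Assumed invocation: point "run" at the YAML produced by the build step and pick a port.
llama stack run ~/.llama/builds/local/conda/8b-instruct.yaml --port 5000
```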
@@ -421,6 +407,8 @@ INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit)
This server is running a Llama model locally.
### Step 4.2: Test the distribution
Let's test it with a client.
```
cd /path/to/llama-stack