chore: update doc (#3857)

# What does this PR do?
Follows https://github.com/llamastack/llama-stack/pull/3839; updates the remaining docs and notebooks to the new dependency-install workflow.
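
For reference, the pattern applied throughout the changed docs and notebooks, shown with the `starter` distribution as an example (the other distributions in the diff follow the same substitution):

```bash
# Before: build a venv image for the distribution
llama stack build --distro starter --image-type venv

# After: install the distribution's dependencies, then start the server
llama stack list-deps starter | xargs -L1 uv pip install
llama stack run starter
```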

## Test Plan
Commit 359df3a37c (parent 21772de5d3), authored by ehhuang on 2025-10-20 10:33:21 -07:00, committed by GitHub
25 changed files with 6380 additions and 6378 deletions


@@ -51,8 +51,8 @@ device: cpu
 You can access the HuggingFace trainer via the `starter` distribution:
 ```bash
-llama stack build --distro starter --image-type venv
-llama stack run ~/.llama/distributions/starter/starter-run.yaml
+llama stack list-deps starter | xargs -L1 uv pip install
+llama stack run starter
 ```
 ### Usage Example


@@ -175,8 +175,7 @@ llama-stack-client benchmarks register \
 **1. Start the Llama Stack API Server**
 ```bash
-# Build and run a distribution (example: together)
-llama stack build --distro together --image-type venv
+llama stack list-deps together | xargs -L1 uv pip install
 llama stack run together
 ```
@@ -209,7 +208,7 @@ The playground works with any Llama Stack distribution. Popular options include:
 <TabItem value="together" label="Together AI">
 ```bash
-llama stack build --distro together --image-type venv
+llama stack list-deps together | xargs -L1 uv pip install
 llama stack run together
 ```
@@ -222,7 +221,7 @@ llama stack run together
 <TabItem value="ollama" label="Ollama (Local)">
 ```bash
-llama stack build --distro ollama --image-type venv
+llama stack list-deps ollama | xargs -L1 uv pip install
 llama stack run ollama
 ```
@@ -235,7 +234,7 @@ llama stack run ollama
 <TabItem value="meta-reference" label="Meta Reference">
 ```bash
-llama stack build --distro meta-reference --image-type venv
+llama stack list-deps meta-reference | xargs -L1 uv pip install
 llama stack run meta-reference
 ```


@@ -20,7 +20,8 @@ RAG enables your applications to reference and recall information from external
 In one terminal, start the Llama Stack server:
 ```bash
-uv run llama stack build --distro starter --image-type venv --run
+llama stack list-deps starter | xargs -L1 uv pip install
+llama stack run starter
 ```
 ### 2. Connect with OpenAI Client


@@ -67,7 +67,7 @@ def get_base_url(self) -> str:
 ## Testing the Provider
-Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, you should install dependencies via `llama stack build --distro together`.
+Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, install its dependencies with `llama stack list-deps together | xargs -L1 uv pip install`.
 ### 1. Integration Testing


@@ -12,7 +12,7 @@ This avoids the overhead of setting up a server.
 ```bash
 # setup
 uv pip install llama-stack
-llama stack build --distro starter --image-type venv
+llama stack list-deps starter | xargs -L1 uv pip install
 ```
 ```python


@@ -59,7 +59,7 @@ Start a Llama Stack server on localhost. Here is an example of how you can do th
 uv venv starter --python 3.12
 source starter/bin/activate # On Windows: starter\Scripts\activate
 pip install --no-cache llama-stack==0.2.2
-llama stack build --distro starter --image-type venv
+llama stack list-deps starter | xargs -L1 uv pip install
 export FIREWORKS_API_KEY=<SOME_KEY>
 llama stack run starter --port 5050
 ```


@@ -166,10 +166,10 @@ docker run \
 ### Via venv
-Make sure you have done `pip install llama-stack` and have the Llama Stack CLI available.
+Install the distribution dependencies before launching:
 ```bash
-llama stack build --distro dell --image-type venv
+llama stack list-deps dell | xargs -L1 uv pip install
 INFERENCE_MODEL=$INFERENCE_MODEL \
 DEH_URL=$DEH_URL \
 CHROMA_URL=$CHROMA_URL \


@@ -81,10 +81,10 @@ docker run \
 ### Via venv
-Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
+Make sure you have the Llama Stack CLI available.
 ```bash
-llama stack build --distro meta-reference-gpu --image-type venv
+llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install
 INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
 llama stack run distributions/meta-reference-gpu/run.yaml \
 --port 8321


@@ -136,11 +136,11 @@ docker run \
 ### Via venv
-If you've set up your local development environment, you can also build the image using your local virtual environment.
+If you've set up your local development environment, you can also install the distribution dependencies using your local virtual environment.
 ```bash
 INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
-llama stack build --distro nvidia --image-type venv
+llama stack list-deps nvidia | xargs -L1 uv pip install
 NVIDIA_API_KEY=$NVIDIA_API_KEY \
 INFERENCE_MODEL=$INFERENCE_MODEL \
 llama stack run ./run.yaml \


@@ -240,6 +240,6 @@ additional_pip_packages:
 - sqlalchemy[asyncio]
 ```
-No other steps are required other than `llama stack build` and `llama stack run`. The build process will use `module` to install all of the provider dependencies, retrieve the spec, etc.
+No other steps are required beyond installing dependencies with `llama stack list-deps <distro> | xargs -L1 uv pip install` and then running `llama stack run`. The CLI will use `module` to install the provider dependencies, retrieve the spec, etc.
 The provider will now be available in Llama Stack with the type `remote::ramalama`.


@@ -123,7 +123,8 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server with the together inference provider\n",
-"!uv run --with llama-stack llama stack build --distro together\n",
+"!uv run --with llama-stack llama stack list-deps together | xargs -L1 uv pip install\n",
+"!uv run --with llama-stack llama stack run together\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",


@@ -233,7 +233,8 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server\n",
-"!uv run --with llama-stack llama stack build --distro meta-reference-gpu\n",
+"!uv run --with llama-stack llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install\n",
+"!uv run --with llama-stack llama stack run meta-reference-gpu\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",


@@ -223,7 +223,8 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server\n",
-"!uv run --with llama-stack llama stack build --distro llama_api\n",
+"!uv run --with llama-stack llama stack list-deps llama_api | xargs -L1 uv pip install\n",
+"!uv run --with llama-stack llama stack run llama_api\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",


@@ -2864,7 +2864,7 @@
 }
 ],
 "source": [
-"!llama stack build --distro experimental-post-training --image-type venv --image-name __system__"
+"!llama stack list-deps experimental-post-training | xargs -L1 uv pip install"
 ]
 },
 {


@@ -38,7 +38,7 @@
 "source": [
 "# NBVAL_SKIP\n",
 "!pip install -U llama-stack\n",
-"!UV_SYSTEM_PYTHON=1 llama stack build --distro fireworks --image-type venv"
+"llama stack list-deps fireworks | xargs -L1 uv pip install\n"
 ]
 },
 {


@@ -57,7 +57,7 @@
 "outputs": [],
 "source": [
 "# NBVAL_SKIP\n",
-"!UV_SYSTEM_PYTHON=1 llama stack build --distro together --image-type venv"
+"!uv run llama stack list-deps together | xargs -L1 uv pip install\n"
 ]
 },
 {


@@ -136,7 +136,8 @@
 " \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
 " process = subprocess.Popen(\n",
-" \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
+" \"uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\",\n",
+" \"uv run --with llama-stack llama stack run starter\",\n",
 " shell=True,\n",
 " stdout=log_file,\n",
 " stderr=log_file,\n",
@@ -172,7 +173,7 @@
 "\n",
 "def kill_llama_stack_server():\n",
 " # Kill any existing llama stack server processes using pkill command\n",
-" os.system(\"pkill -f llama_stack.core.server.server\")"
+" os.system(\"pkill -f llama_stack.core.server.server\")\n"
 ]
 },
 {


@@ -105,7 +105,8 @@
 " \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
 " process = subprocess.Popen(\n",
-" \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
+" \"uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\",\n",
+" \"uv run --with llama-stack llama stack run starter\",\n",
 " shell=True,\n",
 " stdout=log_file,\n",
 " stderr=log_file,\n",


@@ -92,7 +92,7 @@
 "metadata": {},
 "source": [
 "```bash\n",
-"LLAMA_STACK_DIR=$(pwd) llama stack build --distro nvidia --image-type venv\n",
+"uv run --with llama-stack llama stack list-deps nvidia | xargs -L1 uv pip install\n",
 "```"
 ]
 },


@@ -81,7 +81,7 @@
 "metadata": {},
 "source": [
 "```bash\n",
-"LLAMA_STACK_DIR=$(pwd) llama stack build --distro nvidia --image-type venv\n",
+"uv run --with llama-stack llama stack list-deps nvidia | xargs -L1 uv pip install\n",
 "```"
 ]
 },


@@ -145,7 +145,7 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server with the ollama inference provider\n",
-"!uv run --with llama-stack llama stack build --distro starter\n",
+"!uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",


@@ -47,11 +47,11 @@ function QuickStart() {
 <pre><code>{`# Install uv and start Ollama
 ollama run llama3.2:3b --keepalive 60m
+# Install server dependencies
+uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install
 # Run Llama Stack server
-OLLAMA_URL=http://localhost:11434 \\
-uv run --with llama-stack \\
-llama stack build --distro starter \\
---image-type venv --run
+OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter
 # Try the Python SDK
 from llama_stack_client import LlamaStackClient


@@ -78,17 +78,14 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next
 ## Build, Configure, and Run Llama Stack
-1. **Build the Llama Stack**:
-Build the Llama Stack using the `starter` template:
+1. **Install dependencies**:
 ```bash
-uv run --with llama-stack llama stack build --distro starter --image-type venv
+llama stack list-deps starter | xargs -L1 uv pip install
 ```
-**Expected Output:**
+2. **Start the distribution**:
 ```bash
-...
-Build Successful!
-You can find the newly-built template here: ~/.llama/distributions/starter/starter-run.yaml
-You can run the new Llama Stack Distro via: uv run --with llama-stack llama stack run starter
+llama stack run starter
 ```
 3. **Set the ENV variables by exporting them to the terminal**:


@@ -70,10 +70,10 @@ docker run \
 ### Via venv
-Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
+Make sure you have the Llama Stack CLI available.
 ```bash
-llama stack build --distro {{ name }} --image-type venv
+llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install
 INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
 llama stack run distributions/{{ name }}/run.yaml \
 --port 8321


@@ -126,11 +126,11 @@ docker run \
 ### Via venv
-If you've set up your local development environment, you can also build the image using your local virtual environment.
+If you've set up your local development environment, you can also install the distribution dependencies using your local virtual environment.
 ```bash
 INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
-llama stack build --distro nvidia --image-type venv
+llama stack list-deps nvidia | xargs -L1 uv pip install
 NVIDIA_API_KEY=$NVIDIA_API_KEY \
 INFERENCE_MODEL=$INFERENCE_MODEL \
 llama stack run ./run.yaml \