Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-04 10:10:36 +00:00)
chore: update doc (#3857)
# What does this PR do?

Follows https://github.com/llamastack/llama-stack/pull/3839

## Test Plan
This commit is contained in:
parent 21772de5d3
commit 359df3a37c
25 changed files with 6380 additions and 6378 deletions
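Every hunk below makes the same substitution: the one-shot `llama stack build --distro <name> --image-type venv` step is replaced by an explicit dependency install followed by `llama stack run`. In sketch form, assuming the `llama` CLI is already available and that `list-deps` prints one requirement per line (which is what `xargs -L1` relies on):

```bash
# Install the distribution's dependencies, one `uv pip install` per
# requirement line emitted by `list-deps` (hence xargs -L1), then run.
llama stack list-deps <distro> | xargs -L1 uv pip install
llama stack run <distro>
```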

@@ -51,8 +51,8 @@ device: cpu
 You can access the HuggingFace trainer via the `starter` distribution:

 ```bash
-llama stack build --distro starter --image-type venv
-llama stack run ~/.llama/distributions/starter/starter-run.yaml
+llama stack list-deps starter | xargs -L1 uv pip install
+llama stack run starter
 ```

 ### Usage Example

@@ -175,8 +175,7 @@ llama-stack-client benchmarks register \
 **1. Start the Llama Stack API Server**

 ```bash
-# Build and run a distribution (example: together)
-llama stack build --distro together --image-type venv
+llama stack list-deps together | xargs -L1 uv pip install
 llama stack run together
 ```

@@ -209,7 +208,7 @@ The playground works with any Llama Stack distribution. Popular options include:
 <TabItem value="together" label="Together AI">

 ```bash
-llama stack build --distro together --image-type venv
+llama stack list-deps together | xargs -L1 uv pip install
 llama stack run together
 ```

@@ -222,7 +221,7 @@ llama stack run together
 <TabItem value="ollama" label="Ollama (Local)">

 ```bash
-llama stack build --distro ollama --image-type venv
+llama stack list-deps ollama | xargs -L1 uv pip install
 llama stack run ollama
 ```

@@ -235,7 +234,7 @@ llama stack run ollama
 <TabItem value="meta-reference" label="Meta Reference">

 ```bash
-llama stack build --distro meta-reference --image-type venv
+llama stack list-deps meta-reference | xargs -L1 uv pip install
 llama stack run meta-reference
 ```

@@ -20,7 +20,8 @@ RAG enables your applications to reference and recall information from external
 In one terminal, start the Llama Stack server:

 ```bash
-uv run llama stack build --distro starter --image-type venv --run
+llama stack list-deps starter | xargs -L1 uv pip install
+llama stack run starter
 ```

 ### 2. Connect with OpenAI Client
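Once the server from this hunk is up, a quick smoke test before wiring up the OpenAI client might look like the following. This is a sketch: it assumes the default port 8321 used elsewhere in these docs and the server's `/v1/models` listing endpoint.

```bash
# List the models the freshly started server exposes (sketch; assumes
# port 8321 and the /v1/models endpoint).
curl http://localhost:8321/v1/models
```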

@@ -67,7 +67,7 @@ def get_base_url(self) -> str:

 ## Testing the Provider

-Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, you should install dependencies via `llama stack build --distro together`.
+Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, install its dependencies with `llama stack list-deps together | xargs -L1 uv pip install`.

 ### 1. Integration Testing

@@ -12,7 +12,7 @@ This avoids the overhead of setting up a server.
 ```bash
 # setup
 uv pip install llama-stack
-llama stack build --distro starter --image-type venv
+llama stack list-deps starter | xargs -L1 uv pip install
 ```

 ```python

@@ -59,7 +59,7 @@ Start a Llama Stack server on localhost. Here is an example of how you can do th
 uv venv starter --python 3.12
 source starter/bin/activate # On Windows: starter\Scripts\activate
 pip install --no-cache llama-stack==0.2.2
-llama stack build --distro starter --image-type venv
+llama stack list-deps starter | xargs -L1 uv pip install
 export FIREWORKS_API_KEY=<SOME_KEY>
 llama stack run starter --port 5050
 ```

@@ -166,10 +166,10 @@ docker run \

 ### Via venv

-Make sure you have done `pip install llama-stack` and have the Llama Stack CLI available.
+Install the distribution dependencies before launching:

 ```bash
-llama stack build --distro dell --image-type venv
+llama stack list-deps dell | xargs -L1 uv pip install
 INFERENCE_MODEL=$INFERENCE_MODEL \
 DEH_URL=$DEH_URL \
 CHROMA_URL=$CHROMA_URL \

@@ -81,10 +81,10 @@ docker run \

 ### Via venv

-Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
+Make sure you have the Llama Stack CLI available.

 ```bash
-llama stack build --distro meta-reference-gpu --image-type venv
+llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install
 INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
 llama stack run distributions/meta-reference-gpu/run.yaml \
 --port 8321

@@ -136,11 +136,11 @@ docker run \

 ### Via venv

-If you've set up your local development environment, you can also build the image using your local virtual environment.
+If you've set up your local development environment, you can also install the distribution dependencies using your local virtual environment.

 ```bash
 INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
-llama stack build --distro nvidia --image-type venv
+llama stack list-deps nvidia | xargs -L1 uv pip install
 NVIDIA_API_KEY=$NVIDIA_API_KEY \
 INFERENCE_MODEL=$INFERENCE_MODEL \
 llama stack run ./run.yaml \

@@ -240,6 +240,6 @@ additional_pip_packages:
 - sqlalchemy[asyncio]
 ```

-No other steps are required other than `llama stack build` and `llama stack run`. The build process will use `module` to install all of the provider dependencies, retrieve the spec, etc.
+No other steps are required beyond installing dependencies with `llama stack list-deps <distro> | xargs -L1 uv pip install` and then running `llama stack run`. The CLI will use `module` to install the provider dependencies, retrieve the spec, etc.

 The provider will now be available in Llama Stack with the type `remote::ramalama`.
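Spelled out, the external-provider setup the new text describes reduces to the same two-step pattern (a sketch; `<distro>` is the placeholder the doc itself uses, standing for whichever distribution bundles `remote::ramalama`):

```bash
# Dependency install is driven by the provider's `module` under the
# hood; from the user's side it is just the standard two commands.
llama stack list-deps <distro> | xargs -L1 uv pip install
llama stack run <distro>
```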

@@ -123,7 +123,8 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server with the together inference provider\n",
-"!uv run --with llama-stack llama stack build --distro together\n",
+"!uv run --with llama-stack llama stack list-deps together | xargs -L1 uv pip install\n",
+"!uv run --with llama-stack llama stack run together\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",

@@ -233,7 +233,8 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server\n",
-"!uv run --with llama-stack llama stack build --distro meta-reference-gpu\n",
+"!uv run --with llama-stack llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install\n",
+"!uv run --with llama-stack llama stack run meta-reference-gpu\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",

@@ -223,7 +223,8 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server\n",
-"!uv run --with llama-stack llama stack build --distro llama_api\n",
+"!uv run --with llama-stack llama stack list-deps llama_api | xargs -L1 uv pip install\n",
+"!uv run --with llama-stack llama stack run llama_api\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",

@@ -2864,7 +2864,7 @@
 }
 ],
 "source": [
-"!llama stack build --distro experimental-post-training --image-type venv --image-name __system__"
+"!llama stack list-deps experimental-post-training | xargs -L1 uv pip install"
 ]
 },
 {

@@ -38,7 +38,7 @@
 "source": [
 "# NBVAL_SKIP\n",
 "!pip install -U llama-stack\n",
-"!UV_SYSTEM_PYTHON=1 llama stack build --distro fireworks --image-type venv"
+"llama stack list-deps fireworks | xargs -L1 uv pip install\n"
 ]
 },
 {

@@ -57,7 +57,7 @@
 "outputs": [],
 "source": [
 "# NBVAL_SKIP\n",
-"!UV_SYSTEM_PYTHON=1 llama stack build --distro together --image-type venv"
+"!uv run llama stack list-deps together | xargs -L1 uv pip install\n"
 ]
 },
 {

@@ -136,7 +136,8 @@
 " \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
 " process = subprocess.Popen(\n",
-" \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
+" \"uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\",\n",
+" \"uv run --with llama-stack llama stack run starter\",\n",
 " shell=True,\n",
 " stdout=log_file,\n",
 " stderr=log_file,\n",
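The notebook cell above drives the server through `subprocess.Popen`; as plain shell, the equivalent sequence would be roughly the following sketch, where the log redirection mirrors the cell's `log_file` handling:

```bash
# Install dependencies, then start the server in the background with
# its output captured, mirroring the notebook's Popen + log_file setup.
uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install
uv run --with llama-stack llama stack run starter > llama_stack_server.log 2>&1 &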

@@ -172,7 +173,7 @@
 "\n",
 "def kill_llama_stack_server():\n",
 " # Kill any existing llama stack server processes using pkill command\n",
-" os.system(\"pkill -f llama_stack.core.server.server\")"
+" os.system(\"pkill -f llama_stack.core.server.server\")\n"
 ]
 },
 {
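The only change in this hunk is a trailing newline; for completeness, the shutdown the helper performs is just the `pkill` from the cell itself:

```bash
# Stop any running Llama Stack server processes (as in the cell above).
pkill -f llama_stack.core.server.server
```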

@@ -105,7 +105,8 @@
 " \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
 " process = subprocess.Popen(\n",
-" \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
+" \"uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\",\n",
+" \"uv run --with llama-stack llama stack run starter\",\n",
 " shell=True,\n",
 " stdout=log_file,\n",
 " stderr=log_file,\n",

@@ -92,7 +92,7 @@
 "metadata": {},
 "source": [
 "```bash\n",
-"LLAMA_STACK_DIR=$(pwd) llama stack build --distro nvidia --image-type venv\n",
+"uv run --with llama-stack llama stack list-deps nvidia | xargs -L1 uv pip install\n",
 "```"
 ]
 },

@@ -81,7 +81,7 @@
 "metadata": {},
 "source": [
 "```bash\n",
-"LLAMA_STACK_DIR=$(pwd) llama stack build --distro nvidia --image-type venv\n",
+"uv run --with llama-stack llama stack list-deps nvidia | xargs -L1 uv pip install\n",
 "```"
 ]
 },

@@ -145,7 +145,7 @@
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server with the ollama inference provider\n",
-"!uv run --with llama-stack llama stack build --distro starter\n",
+"!uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",

@@ -47,11 +47,11 @@ function QuickStart() {
 <pre><code>{`# Install uv and start Ollama
 ollama run llama3.2:3b --keepalive 60m

+# Install server dependencies
+uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install
+
 # Run Llama Stack server
-OLLAMA_URL=http://localhost:11434 \\
-uv run --with llama-stack \\
-llama stack build --distro starter \\
---image-type venv --run
+OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter

 # Try the Python SDK
 from llama_stack_client import LlamaStackClient
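Condensed, the quickstart this landing-page snippet now advertises is (a sketch assembled straight from the added lines above):

```bash
# Start the local model, install server dependencies, then run the
# server pointed at the local Ollama instance.
ollama run llama3.2:3b --keepalive 60m
uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install
OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter
```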

@@ -78,17 +78,14 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next

 ## Build, Configure, and Run Llama Stack

-1. **Build the Llama Stack**:
-
-   Build the Llama Stack using the `starter` template:
+1. **Install dependencies**:
 ```bash
-uv run --with llama-stack llama stack build --distro starter --image-type venv
+llama stack list-deps starter | xargs -L1 uv pip install
 ```
-   **Expected Output:**
+2. **Start the distribution**:
 ```bash
-...
-Build Successful!
-You can find the newly-built template here: ~/.llama/distributions/starter/starter-run.yaml
-You can run the new Llama Stack Distro via: uv run --with llama-stack llama stack run starter
+llama stack run starter
 ```

 3. **Set the ENV variables by exporting them to the terminal**:
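Step 3 leaves the variables unspecified; drawing on values that appear elsewhere in these docs, the export step might look like the following sketch (illustrative values only, not taken from this diff):

```bash
# Illustrative exports; substitute the variables your distribution
# actually reads (these names appear elsewhere in these docs).
export OLLAMA_URL=http://localhost:11434
export INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct
```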

@@ -70,10 +70,10 @@ docker run \

 ### Via venv

-Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
+Make sure you have the Llama Stack CLI available.

 ```bash
-llama stack build --distro {{ name }} --image-type venv
+llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install
 INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
 llama stack run distributions/{{ name }}/run.yaml \
 --port 8321

@@ -126,11 +126,11 @@ docker run \

 ### Via venv

-If you've set up your local development environment, you can also build the image using your local virtual environment.
+If you've set up your local development environment, you can also install the distribution dependencies using your local virtual environment.

 ```bash
 INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
-llama stack build --distro nvidia --image-type venv
+llama stack list-deps nvidia | xargs -L1 uv pip install
 NVIDIA_API_KEY=$NVIDIA_API_KEY \
 INFERENCE_MODEL=$INFERENCE_MODEL \
 llama stack run ./run.yaml \