mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-04 02:03:44 +00:00
chore: update doc (#3857)
# What does this PR do? follows https://github.com/llamastack/llama-stack/pull/3839 ## Test Plan
This commit is contained in:
parent
21772de5d3
commit
359df3a37c
25 changed files with 6380 additions and 6378 deletions
|
|
@ -51,8 +51,8 @@ device: cpu
|
|||
You can access the HuggingFace trainer via the `starter` distribution:
|
||||
|
||||
```bash
|
||||
llama stack build --distro starter --image-type venv
|
||||
llama stack run ~/.llama/distributions/starter/starter-run.yaml
|
||||
llama stack list-deps starter | xargs -L1 uv pip install
|
||||
llama stack run starter
|
||||
```
|
||||
|
||||
### Usage Example
|
||||
|
|
|
|||
|
|
@ -175,8 +175,7 @@ llama-stack-client benchmarks register \
|
|||
**1. Start the Llama Stack API Server**
|
||||
|
||||
```bash
|
||||
# Build and run a distribution (example: together)
|
||||
llama stack build --distro together --image-type venv
|
||||
llama stack list-deps together | xargs -L1 uv pip install
|
||||
llama stack run together
|
||||
```
|
||||
|
||||
|
|
@ -209,7 +208,7 @@ The playground works with any Llama Stack distribution. Popular options include:
|
|||
<TabItem value="together" label="Together AI">
|
||||
|
||||
```bash
|
||||
llama stack build --distro together --image-type venv
|
||||
llama stack list-deps together | xargs -L1 uv pip install
|
||||
llama stack run together
|
||||
```
|
||||
|
||||
|
|
@ -222,7 +221,7 @@ llama stack run together
|
|||
<TabItem value="ollama" label="Ollama (Local)">
|
||||
|
||||
```bash
|
||||
llama stack build --distro ollama --image-type venv
|
||||
llama stack list-deps ollama | xargs -L1 uv pip install
|
||||
llama stack run ollama
|
||||
```
|
||||
|
||||
|
|
@ -235,7 +234,7 @@ llama stack run ollama
|
|||
<TabItem value="meta-reference" label="Meta Reference">
|
||||
|
||||
```bash
|
||||
llama stack build --distro meta-reference --image-type venv
|
||||
llama stack list-deps meta-reference | xargs -L1 uv pip install
|
||||
llama stack run meta-reference
|
||||
```
|
||||
|
||||
|
|
|
|||
|
|
@ -20,7 +20,8 @@ RAG enables your applications to reference and recall information from external
|
|||
In one terminal, start the Llama Stack server:
|
||||
|
||||
```bash
|
||||
uv run llama stack build --distro starter --image-type venv --run
|
||||
llama stack list-deps starter | xargs -L1 uv pip install
|
||||
llama stack run starter
|
||||
```
|
||||
|
||||
### 2. Connect with OpenAI Client
|
||||
|
|
|
|||
|
|
@ -67,7 +67,7 @@ def get_base_url(self) -> str:
|
|||
|
||||
## Testing the Provider
|
||||
|
||||
Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, you should install dependencies via `llama stack build --distro together`.
|
||||
Before running tests, you must have required dependencies installed. This depends on the providers or distributions you are testing. For example, if you are testing the `together` distribution, install its dependencies with `llama stack list-deps together | xargs -L1 uv pip install`.
|
||||
|
||||
### 1. Integration Testing
|
||||
|
||||
|
|
|
|||
|
|
@ -12,7 +12,7 @@ This avoids the overhead of setting up a server.
|
|||
```bash
|
||||
# setup
|
||||
uv pip install llama-stack
|
||||
llama stack build --distro starter --image-type venv
|
||||
llama stack list-deps starter | xargs -L1 uv pip install
|
||||
```
|
||||
|
||||
```python
|
||||
|
|
|
|||
|
|
@ -59,7 +59,7 @@ Start a Llama Stack server on localhost. Here is an example of how you can do th
|
|||
uv venv starter --python 3.12
|
||||
source starter/bin/activate # On Windows: starter\Scripts\activate
|
||||
pip install --no-cache llama-stack==0.2.2
|
||||
llama stack build --distro starter --image-type venv
|
||||
llama stack list-deps starter | xargs -L1 uv pip install
|
||||
export FIREWORKS_API_KEY=<SOME_KEY>
|
||||
llama stack run starter --port 5050
|
||||
```
|
||||
|
|
|
|||
|
|
@ -166,10 +166,10 @@ docker run \
|
|||
|
||||
### Via venv
|
||||
|
||||
Make sure you have done `pip install llama-stack` and have the Llama Stack CLI available.
|
||||
Install the distribution dependencies before launching:
|
||||
|
||||
```bash
|
||||
llama stack build --distro dell --image-type venv
|
||||
llama stack list-deps dell | xargs -L1 uv pip install
|
||||
INFERENCE_MODEL=$INFERENCE_MODEL \
|
||||
DEH_URL=$DEH_URL \
|
||||
CHROMA_URL=$CHROMA_URL \
|
||||
|
|
|
|||
|
|
@ -81,10 +81,10 @@ docker run \
|
|||
|
||||
### Via venv
|
||||
|
||||
Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
|
||||
Make sure you have the Llama Stack CLI available.
|
||||
|
||||
```bash
|
||||
llama stack build --distro meta-reference-gpu --image-type venv
|
||||
llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install
|
||||
INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
|
||||
llama stack run distributions/meta-reference-gpu/run.yaml \
|
||||
--port 8321
|
||||
|
|
|
|||
|
|
@ -136,11 +136,11 @@ docker run \
|
|||
|
||||
### Via venv
|
||||
|
||||
If you've set up your local development environment, you can also build the image using your local virtual environment.
|
||||
If you've set up your local development environment, you can also install the distribution dependencies using your local virtual environment.
|
||||
|
||||
```bash
|
||||
INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
|
||||
llama stack build --distro nvidia --image-type venv
|
||||
llama stack list-deps nvidia | xargs -L1 uv pip install
|
||||
NVIDIA_API_KEY=$NVIDIA_API_KEY \
|
||||
INFERENCE_MODEL=$INFERENCE_MODEL \
|
||||
llama stack run ./run.yaml \
|
||||
|
|
|
|||
|
|
@ -240,6 +240,6 @@ additional_pip_packages:
|
|||
- sqlalchemy[asyncio]
|
||||
```
|
||||
|
||||
No other steps are required other than `llama stack build` and `llama stack run`. The build process will use `module` to install all of the provider dependencies, retrieve the spec, etc.
|
||||
No other steps are required beyond installing dependencies with `llama stack list-deps <distro> | xargs -L1 uv pip install` and then running `llama stack run`. The CLI will use `module` to install the provider dependencies, retrieve the spec, etc.
|
||||
|
||||
The provider will now be available in Llama Stack with the type `remote::ramalama`.
|
||||
|
|
|
|||
|
|
@ -123,7 +123,8 @@
|
|||
" del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
|
||||
"\n",
|
||||
"# this command installs all the dependencies needed for the llama stack server with the together inference provider\n",
|
||||
"!uv run --with llama-stack llama stack build --distro together\n",
|
||||
"!uv run --with llama-stack llama stack list-deps together | xargs -L1 uv pip install\n",
|
||||
"!uv run --with llama-stack llama stack run together\n",
|
||||
"\n",
|
||||
"def run_llama_stack_server_background():\n",
|
||||
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
|
||||
|
|
|
|||
|
|
@ -233,7 +233,8 @@
|
|||
" del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
|
||||
"\n",
|
||||
"# this command installs all the dependencies needed for the llama stack server\n",
|
||||
"!uv run --with llama-stack llama stack build --distro meta-reference-gpu\n",
|
||||
"!uv run --with llama-stack llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install\n",
|
||||
"!uv run --with llama-stack llama stack run meta-reference-gpu\n",
|
||||
"\n",
|
||||
"def run_llama_stack_server_background():\n",
|
||||
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
|
||||
|
|
|
|||
|
|
@ -223,7 +223,8 @@
|
|||
" del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
|
||||
"\n",
|
||||
"# this command installs all the dependencies needed for the llama stack server\n",
|
||||
"!uv run --with llama-stack llama stack build --distro llama_api\n",
|
||||
"!uv run --with llama-stack llama stack list-deps llama_api | xargs -L1 uv pip install\n",
|
||||
"!uv run --with llama-stack llama stack run llama_api\n",
|
||||
"\n",
|
||||
"def run_llama_stack_server_background():\n",
|
||||
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
|
||||
|
|
|
|||
|
|
@ -2864,7 +2864,7 @@
|
|||
}
|
||||
],
|
||||
"source": [
|
||||
"!llama stack build --distro experimental-post-training --image-type venv --image-name __system__"
|
||||
"!llama stack list-deps experimental-post-training | xargs -L1 uv pip install"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
|
|
|||
|
|
@ -38,7 +38,7 @@
|
|||
"source": [
|
||||
"# NBVAL_SKIP\n",
|
||||
"!pip install -U llama-stack\n",
|
||||
"!UV_SYSTEM_PYTHON=1 llama stack build --distro fireworks --image-type venv"
|
||||
"llama stack list-deps fireworks | xargs -L1 uv pip install\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
|
|
|||
|
|
@ -57,7 +57,7 @@
|
|||
"outputs": [],
|
||||
"source": [
|
||||
"# NBVAL_SKIP\n",
|
||||
"!UV_SYSTEM_PYTHON=1 llama stack build --distro together --image-type venv"
|
||||
"!uv run llama stack list-deps together | xargs -L1 uv pip install\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
|
|
|||
|
|
@ -136,7 +136,8 @@
|
|||
" \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
|
||||
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
|
||||
" process = subprocess.Popen(\n",
|
||||
" \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
|
||||
" \"uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\",\n",
|
||||
" \"uv run --with llama-stack llama stack run starter\",\n",
|
||||
" shell=True,\n",
|
||||
" stdout=log_file,\n",
|
||||
" stderr=log_file,\n",
|
||||
|
|
@ -172,7 +173,7 @@
|
|||
"\n",
|
||||
"def kill_llama_stack_server():\n",
|
||||
" # Kill any existing llama stack server processes using pkill command\n",
|
||||
" os.system(\"pkill -f llama_stack.core.server.server\")"
|
||||
" os.system(\"pkill -f llama_stack.core.server.server\")\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
|
|
|||
|
|
@ -105,7 +105,8 @@
|
|||
" \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
|
||||
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
|
||||
" process = subprocess.Popen(\n",
|
||||
" \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
|
||||
" \"uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\",\n",
|
||||
" \"uv run --with llama-stack llama stack run starter\",\n",
|
||||
" shell=True,\n",
|
||||
" stdout=log_file,\n",
|
||||
" stderr=log_file,\n",
|
||||
|
|
|
|||
|
|
@ -92,7 +92,7 @@
|
|||
"metadata": {},
|
||||
"source": [
|
||||
"```bash\n",
|
||||
"LLAMA_STACK_DIR=$(pwd) llama stack build --distro nvidia --image-type venv\n",
|
||||
"uv run --with llama-stack llama stack list-deps nvidia | xargs -L1 uv pip install\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
|
|
|
|||
|
|
@ -81,7 +81,7 @@
|
|||
"metadata": {},
|
||||
"source": [
|
||||
"```bash\n",
|
||||
"LLAMA_STACK_DIR=$(pwd) llama stack build --distro nvidia --image-type venv\n",
|
||||
"uv run --with llama-stack llama stack list-deps nvidia | xargs -L1 uv pip install\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
|
|
|
|||
|
|
@ -145,7 +145,7 @@
|
|||
" del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
|
||||
"\n",
|
||||
"# this command installs all the dependencies needed for the llama stack server with the ollama inference provider\n",
|
||||
"!uv run --with llama-stack llama stack build --distro starter\n",
|
||||
"!uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install\n",
|
||||
"\n",
|
||||
"def run_llama_stack_server_background():\n",
|
||||
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
|
||||
|
|
|
|||
|
|
@ -47,11 +47,11 @@ function QuickStart() {
|
|||
<pre><code>{`# Install uv and start Ollama
|
||||
ollama run llama3.2:3b --keepalive 60m
|
||||
|
||||
# Install server dependencies
|
||||
uv run --with llama-stack llama stack list-deps starter | xargs -L1 uv pip install
|
||||
|
||||
# Run Llama Stack server
|
||||
OLLAMA_URL=http://localhost:11434 \\
|
||||
uv run --with llama-stack \\
|
||||
llama stack build --distro starter \\
|
||||
--image-type venv --run
|
||||
OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter
|
||||
|
||||
# Try the Python SDK
|
||||
from llama_stack_client import LlamaStackClient
|
||||
|
|
|
|||
|
|
@ -78,17 +78,14 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next
|
|||
|
||||
## Build, Configure, and Run Llama Stack
|
||||
|
||||
1. **Build the Llama Stack**:
|
||||
Build the Llama Stack using the `starter` template:
|
||||
1. **Install dependencies**:
|
||||
```bash
|
||||
uv run --with llama-stack llama stack build --distro starter --image-type venv
|
||||
llama stack list-deps starter | xargs -L1 uv pip install
|
||||
```
|
||||
**Expected Output:**
|
||||
|
||||
2. **Start the distribution**:
|
||||
```bash
|
||||
...
|
||||
Build Successful!
|
||||
You can find the newly-built template here: ~/.llama/distributions/starter/starter-run.yaml
|
||||
You can run the new Llama Stack Distro via: uv run --with llama-stack llama stack run starter
|
||||
llama stack run starter
|
||||
```
|
||||
|
||||
3. **Set the ENV variables by exporting them to the terminal**:
|
||||
|
|
|
|||
|
|
@ -70,10 +70,10 @@ docker run \
|
|||
|
||||
### Via venv
|
||||
|
||||
Make sure you have done `uv pip install llama-stack` and have the Llama Stack CLI available.
|
||||
Make sure you have the Llama Stack CLI available.
|
||||
|
||||
```bash
|
||||
llama stack build --distro {{ name }} --image-type venv
|
||||
llama stack list-deps meta-reference-gpu | xargs -L1 uv pip install
|
||||
INFERENCE_MODEL=meta-llama/Llama-3.2-3B-Instruct \
|
||||
llama stack run distributions/{{ name }}/run.yaml \
|
||||
--port 8321
|
||||
|
|
|
|||
|
|
@ -126,11 +126,11 @@ docker run \
|
|||
|
||||
### Via venv
|
||||
|
||||
If you've set up your local development environment, you can also build the image using your local virtual environment.
|
||||
If you've set up your local development environment, you can also install the distribution dependencies using your local virtual environment.
|
||||
|
||||
```bash
|
||||
INFERENCE_MODEL=meta-llama/Llama-3.1-8B-Instruct
|
||||
llama stack build --distro nvidia --image-type venv
|
||||
llama stack list-deps nvidia | xargs -L1 uv pip install
|
||||
NVIDIA_API_KEY=$NVIDIA_API_KEY \
|
||||
INFERENCE_MODEL=$INFERENCE_MODEL \
|
||||
llama stack run ./run.yaml \
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue