mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-07-29 15:23:51 +00:00
feat(starter)!: simplify starter distro; litellm model registry changes (#2916)
This commit is contained in:
parent
3344d8a9e5
commit
9583f468f8
64 changed files with 2027 additions and 4092 deletions
|
@ -59,7 +59,7 @@ Now let's build and run the Llama Stack config for Ollama.
|
|||
We use `starter` as template. By default all providers are disabled, this requires enable ollama by passing environment variables.
|
||||
|
||||
```bash
|
||||
ENABLE_OLLAMA=ollama OLLAMA_INFERENCE_MODEL="llama3.2:3b" llama stack build --template starter --image-type venv --run
|
||||
llama stack build --template starter --image-type venv --run
|
||||
```
|
||||
:::
|
||||
:::{tab-item} Using `conda`
|
||||
|
@ -70,7 +70,7 @@ which defines the providers and their settings.
|
|||
Now let's build and run the Llama Stack config for Ollama.
|
||||
|
||||
```bash
|
||||
ENABLE_OLLAMA=ollama INFERENCE_MODEL="llama3.2:3b" llama stack build --template starter --image-type conda --run
|
||||
llama stack build --template starter --image-type conda --run
|
||||
```
|
||||
:::
|
||||
:::{tab-item} Using a Container
|
||||
|
@ -80,8 +80,6 @@ component that works with different inference providers out of the box. For this
|
|||
configurations, please check out [this guide](../distributions/building_distro.md).
|
||||
First lets setup some environment variables and create a local directory to mount into the container’s file system.
|
||||
```bash
|
||||
export INFERENCE_MODEL="llama3.2:3b"
|
||||
export ENABLE_OLLAMA=ollama
|
||||
export LLAMA_STACK_PORT=8321
|
||||
mkdir -p ~/.llama
|
||||
```
|
||||
|
@ -94,7 +92,6 @@ docker run -it \
|
|||
-v ~/.llama:/root/.llama \
|
||||
llamastack/distribution-starter \
|
||||
--port $LLAMA_STACK_PORT \
|
||||
--env INFERENCE_MODEL=$INFERENCE_MODEL \
|
||||
--env OLLAMA_URL=http://host.docker.internal:11434
|
||||
```
|
||||
Note to start the container with Podman, you can do the same but replace `docker` at the start of the command with
|
||||
|
@ -116,7 +113,6 @@ docker run -it \
|
|||
--network=host \
|
||||
llamastack/distribution-starter \
|
||||
--port $LLAMA_STACK_PORT \
|
||||
--env INFERENCE_MODEL=$INFERENCE_MODEL \
|
||||
--env OLLAMA_URL=http://localhost:11434
|
||||
```
|
||||
:::
|
||||
|
|
|
@ -19,7 +19,7 @@ ollama run llama3.2:3b --keepalive 60m
|
|||
#### Step 2: Run the Llama Stack server
|
||||
We will use `uv` to run the Llama Stack server.
|
||||
```bash
|
||||
ENABLE_OLLAMA=ollama OLLAMA_INFERENCE_MODEL=llama3.2:3b uv run --with llama-stack llama stack build --template starter --image-type venv --run
|
||||
uv run --with llama-stack llama stack build --template starter --image-type venv --run
|
||||
```
|
||||
#### Step 3: Run the demo
|
||||
Now open up a new terminal and copy the following script into a file named `demo_script.py`.
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue