Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-08-12 13:00:39 +00:00
refactor: remove Conda support from Llama Stack (#2969)
# What does this PR do?

This PR removes Conda support from Llama Stack.

Closes #2539

## Test Plan
parent f2eee4e417
commit a749d5f4a4

44 changed files with 159 additions and 311 deletions
````diff
@@ -47,20 +47,20 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next
 ## Install Dependencies and Set Up Environment
 
-1. **Create a Conda Environment**:
-   Create a new Conda environment with Python 3.12:
+1. **Install uv**:
+   Install [uv](https://docs.astral.sh/uv/) for managing dependencies:
    ```bash
-   conda create -n ollama python=3.12
-   ```
-   Activate the environment:
-   ```bash
-   conda activate ollama
+   # macOS and Linux
+   curl -LsSf https://astral.sh/uv/install.sh | sh
+
+   # Windows
+   powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
    ```
 
 2. **Install ChromaDB**:
-   Install `chromadb` using `pip`:
+   Install `chromadb` using `uv`:
    ```bash
-   pip install chromadb
+   uv pip install chromadb
    ```
 
 3. **Run ChromaDB**:
````
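A quick way to sanity-check the new setup, a sketch assuming the installs above completed and `uv` is on your PATH:

```bash
# Confirm uv is available and chromadb is visible in the active environment
uv --version
uv pip show chromadb
```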
````diff
@@ -69,28 +69,21 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next
    chroma run --host localhost --port 8000 --path ./my_chroma_data
    ```
 
-4. **Install Llama Stack**:
-   Open a new terminal and install `llama-stack`:
-   ```bash
-   conda activate ollama
-   pip install -U llama-stack
-   ```
-
 ---
 
 ## Build, Configure, and Run Llama Stack
 
 1. **Build the Llama Stack**:
-   Build the Llama Stack using the `ollama` template:
+   Build the Llama Stack using the `starter` template:
    ```bash
-   llama stack build --template starter --image-type conda
+   uv run --with llama-stack llama stack build --template starter --image-type venv
    ```
    **Expected Output:**
    ```bash
    ...
    Build Successful!
-   You can find the newly-built template here: ~/.llama/distributions/ollama/ollama-run.yaml
-   You can run the new Llama Stack Distro via: llama stack run ~/.llama/distributions/ollama/ollama-run.yaml --image-type conda
+   You can find the newly-built template here: ~/.llama/distributions/starter/starter-run.yaml
+   You can run the new Llama Stack Distro via: uv run --with llama-stack llama stack run starter --image-type venv
    ```
 
 3. **Set the ENV variables by exporting them to the terminal**:
````
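One reason the separate "Install Llama Stack" step can be dropped: `uv run --with llama-stack` resolves the package into a cached, per-invocation environment, so nothing needs to be installed or activated beforehand. If you prefer a persistent environment instead, a sketch using standard uv commands (not part of this diff):

```bash
# One-time setup as an alternative to passing --with on every call
uv venv .venv
source .venv/bin/activate
uv pip install -U llama-stack
llama stack build --template starter --image-type venv
```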
````diff
@@ -102,12 +95,13 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next
    ```
 
 3. **Run the Llama Stack**:
-   Run the stack with command shared by the API from earlier:
+   Run the stack using uv:
    ```bash
-   llama stack run ollama
-      --port $LLAMA_STACK_PORT
-      --env INFERENCE_MODEL=$INFERENCE_MODEL
-      --env SAFETY_MODEL=$SAFETY_MODEL
+   uv run --with llama-stack llama stack run starter \
+      --image-type venv \
+      --port $LLAMA_STACK_PORT \
+      --env INFERENCE_MODEL=$INFERENCE_MODEL \
+      --env SAFETY_MODEL=$SAFETY_MODEL \
       --env OLLAMA_URL=$OLLAMA_URL
    ```
    Note: Every time you run a new model with `ollama run`, you will need to restart the llama stack. Otherwise it won't see the new model.
````
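The "Set the ENV variables" step these `--env` flags refer to falls outside this hunk. For illustration only, the exports might look like the following; the model IDs are placeholders, and 11434 is Ollama's default port:

```bash
export LLAMA_STACK_PORT=8321                 # same port the client configures below
export INFERENCE_MODEL="llama3.2:3b"         # placeholder: the model you pulled with ollama
export SAFETY_MODEL="llama-guard3:1b"        # placeholder safety model
export OLLAMA_URL="http://localhost:11434"   # Ollama's default local endpoint
```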
````diff
@@ -120,7 +114,7 @@ After setting up the server, open a new terminal window and configure the llama-
 
 1. Configure the CLI to point to the llama-stack server.
    ```bash
-   llama-stack-client configure --endpoint http://localhost:8321
+   uv run --with llama-stack-client llama-stack-client configure --endpoint http://localhost:8321
    ```
    **Expected Output:**
    ```bash
````
````diff
@@ -128,7 +122,7 @@ After setting up the server, open a new terminal window and configure the llama-
    ```
 2. Test the CLI by running inference:
    ```bash
-   llama-stack-client inference chat-completion --message "Write me a 2-sentence poem about the moon"
+   uv run --with llama-stack-client llama-stack-client inference chat-completion --message "Write me a 2-sentence poem about the moon"
    ```
    **Expected Output:**
    ```bash
````
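Since `uv run --with llama-stack-client` builds the client environment per invocation, installing the client once into an active environment is a reasonable alternative for repeated use, assuming the same uv-managed venv as above:

```bash
uv pip install llama-stack-client
llama-stack-client configure --endpoint http://localhost:8321
llama-stack-client inference chat-completion --message "Write me a 2-sentence poem about the moon"
```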
````diff
@@ -170,7 +164,7 @@ curl http://localhost:$LLAMA_STACK_PORT/alpha/inference/chat-completion
 EOF
 ```
 
-You can check the available models with the command `llama-stack-client models list`.
+You can check the available models with the command `uv run --with llama-stack-client llama-stack-client models list`.
 
 **Expected Output:**
 ```json
````
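To cross-check from the Ollama side which models are actually being served (assuming the stock `ollama` CLI and its default URL, neither of which this diff touches):

```bash
ollama list                               # models pulled locally
curl -s http://localhost:11434/api/tags   # the same list via Ollama's REST API
```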
````diff
@@ -191,18 +185,12 @@ You can check the available models with the command `llama-stack-client models l
 
 You can also interact with the Llama Stack server using a simple Python script. Below is an example:
 
-### 1. Activate Conda Environment
-
-```bash
-conda activate ollama
-```
-
-### 2. Create Python Script (`test_llama_stack.py`)
+### 1. Create Python Script (`test_llama_stack.py`)
 ```bash
 touch test_llama_stack.py
 ```
 
-### 3. Create a Chat Completion Request in Python
+### 2. Create a Chat Completion Request in Python
 
 In `test_llama_stack.py`, write the following code:
 
````
````diff
@@ -233,10 +221,10 @@ response = client.inference.chat_completion(
 print(response.completion_message.content)
 ```
 
-### 4. Run the Python Script
+### 3. Run the Python Script
 
 ```bash
-python test_llama_stack.py
+uv run --with llama-stack-client python test_llama_stack.py
 ```
 
 **Expected Output:**
````
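The body of `test_llama_stack.py` is elided from this diff apart from the fragments visible in the hunk header. A minimal sketch of what such a script could look like, based on those fragments and the `llama-stack-client` package; the model ID is a placeholder:

```bash
cat > test_llama_stack.py <<'EOF'
# Sketch only; the script in the repo's guide may differ.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

response = client.inference.chat_completion(
    model_id="llama3.2:3b",  # placeholder: match your INFERENCE_MODEL
    messages=[{"role": "user", "content": "Write me a 2-sentence poem about the moon"}],
)
print(response.completion_message.content)
EOF

uv run --with llama-stack-client python test_llama_stack.py
```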