Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-08-02 00:34:44 +00:00)

Update README.md

This commit is contained in:
parent 2fc1c16d58
commit b8cf988f42

1 changed file with 9 additions and 7 deletions
@@ -39,7 +39,7 @@ If you're looking for more specific topics like tool calling or agent setup, we

 1. **Download Ollama App**:
    - Go to [https://ollama.com/download](https://ollama.com/download).
-   - Download and unzip `Ollama-darwin.zip`.
+   - Follow instructions based on the OS you are on. For example, if you are on a Mac, download and unzip `Ollama-darwin.zip`.
    - Run the `Ollama` application.

 1. **Download the Ollama CLI**:
@@ -88,7 +88,7 @@ If you're looking for more specific topics like tool calling or agent setup, we
 4. **Install Llama Stack**:
    - Open a new terminal and install `llama-stack`:
      ```bash
-     conda activate hack
+     conda activate ollama
      pip install llama-stack==0.0.53
      ```
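The hunk above switches the conda environment name from `hack` to `ollama` but does not show the environment being created. A minimal sketch of that earlier step, assuming conda is installed; the Python version is illustrative:

```bash
# Assumed prerequisite: create and activate the `ollama` conda environment
# before installing the pinned llama-stack release.
conda create -n ollama python=3.10 -y
conda activate ollama
pip install llama-stack==0.0.53
```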
@@ -113,7 +113,7 @@ Build Successful! Next steps:
 2. **Set the ENV variables by exporting them to the terminal**:
    ```bash
    export OLLAMA_URL="http://localhost:11434"
-   export LLAMA_STACK_PORT=5001
+   export LLAMA_STACK_PORT=5051
    export INFERENCE_MODEL="meta-llama/Llama-3.2-3B-Instruct"
    export SAFETY_MODEL="meta-llama/Llama-Guard-3-1B"
    ```
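Since the later `curl` check depends on `LLAMA_STACK_PORT`, a quick sanity check that the exports took effect in the current shell (sketch only):

```bash
# All four values should print non-empty in the shell that will start the server.
echo "$OLLAMA_URL $LLAMA_STACK_PORT $INFERENCE_MODEL $SAFETY_MODEL"
```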
@@ -125,10 +125,10 @@ export SAFETY_MODEL="meta-llama/Llama-Guard-3-1B"
   --port $LLAMA_STACK_PORT \
   --env INFERENCE_MODEL=$INFERENCE_MODEL \
   --env SAFETY_MODEL=$SAFETY_MODEL \
-  --env OLLAMA_URL=http://localhost:11434
+  --env OLLAMA_URL=$OLLAMA_URL
   ```

-Note: Everytime you run a new model with `ollama run`, you will need to restart the llama stack. Otherwise it won't see the new model
+Note: Everytime you run a new model with `ollama run`, you will need to restart the llama stack. Otherwise it won't see the new model.

 The server will start and listen on `http://localhost:5051`.
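The flags in this hunk are only the tail of the launch command. A minimal sketch of the full invocation, assuming the distribution built in the earlier steps is named `ollama` (the exact first argument depends on how `llama stack build` was run):

```bash
# Sketch of the complete launch command the flags above belong to; the
# `ollama` distribution name is an assumption based on the surrounding guide.
llama stack run ollama \
  --port $LLAMA_STACK_PORT \
  --env INFERENCE_MODEL=$INFERENCE_MODEL \
  --env SAFETY_MODEL=$SAFETY_MODEL \
  --env OLLAMA_URL=$OLLAMA_URL
```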
@@ -139,7 +139,7 @@ The server will start and listen on `http://localhost:5051`.
 After setting up the server, open a new terminal window and verify it's working by sending a `POST` request using `curl`:

 ```bash
-curl http://localhost:5051/inference/chat_completion \
+curl http://localhost:$LLAMA_STACK_PORT/inference/chat_completion \
 -H "Content-Type: application/json" \
 -d '{
     "model": "Llama3.2-3B-Instruct",
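The hunk cuts off inside the request body. A sketch of what a complete request could look like, assuming the endpoint accepts an OpenAI-style `messages` array; the prompt text is illustrative, not taken from the README:

```bash
# Illustrative request; everything beyond the "model" field is an assumption
# about the chat_completion payload shape.
curl http://localhost:$LLAMA_STACK_PORT/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "Llama3.2-3B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a two-sentence poem about the moon."}
    ]
}'
```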
@@ -176,7 +176,7 @@ You can also interact with the Llama Stack server using a simple Python script.
 The `llama-stack-client` library offers a robust and efficient python methods for interacting with the Llama Stack server.

 ```bash
-conda activate your-llama-stack-conda-env
+conda activate ollama
 ```

 Note, the client library gets installed by default if you install the server library
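To confirm the note in this hunk (the client library ships alongside the server package), a quick check inside the activated environment; sketch only:

```bash
# Should print package metadata if llama-stack-client was pulled in with llama-stack.
pip show llama-stack-client
```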
@@ -188,6 +188,8 @@ touch test_llama_stack.py

 ### 3. Create a Chat Completion Request in Python

+In `test_llama_stack.py`, write the following code:
+
 ```python
 from llama_stack_client import LlamaStackClient

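The hunk ends right after the import, so here is a minimal sketch of what the rest of `test_llama_stack.py` might contain, assuming the client exposes an `inference.chat_completion` method that takes a model identifier and a `messages` list (parameter names vary across `llama-stack-client` versions, and the port matches `LLAMA_STACK_PORT` above):

```python
from llama_stack_client import LlamaStackClient

# Point the client at the locally running Llama Stack server.
client = LlamaStackClient(base_url="http://localhost:5051")

# Assumed call shape: the parameter may be `model` or `model_id` depending on
# the client version, so treat this as illustrative rather than exact.
response = client.inference.chat_completion(
    model="Llama3.2-3B-Instruct",
    messages=[{"role": "user", "content": "Write me a two-sentence poem about the moon."}],
)
print(response)
```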