mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-08-01 16:24:44 +00:00)
update quick start to have working instructions
remove the note about the order; update the link to the Ollama supported models
This commit is contained in:
parent f1b9578f8d
commit 6bf6c79bd6

1 changed file with 16 additions and 4 deletions
@@ -22,14 +22,22 @@ If you're looking for more specific topics like tool calling or agent setup, we
   - Download and unzip `Ollama-darwin.zip`.
   - Run the `Ollama` application.

1. **Download the Ollama CLI**:
   - Ensure you have the `ollama` command line tool by downloading and installing it from the same website.

1. **Start the Ollama server**:
   - Open the terminal and run:
     ```bash
     ollama serve
     ```

1. **Run the model**:
   - Open the terminal and run:
     ```bash
     ollama run llama3.2:3b-instruct-fp16
     ```
   **Note**: The models Llama Stack currently supports are listed [here](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/inference/ollama/ollama.py#L43). A quick way to verify the Ollama side of the setup is sketched right after this list.
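Before moving on, you can sanity-check the steps above. A minimal sketch, assuming Ollama's default port `11434` and the model pulled above:

```bash
# Confirm the ollama CLI is on your PATH
ollama --version

# List the models Ollama has pulled locally; llama3.2:3b-instruct-fp16
# should show up here after the `ollama run` step above
ollama list

# Confirm the Ollama server is responding (11434 is Ollama's default port)
curl http://localhost:11434/api/tags
```
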
---

@@ -84,6 +92,8 @@ If you're looking for more specific topics like tool calling or agent setup, we
```bash
llama stack run /path/to/your/distro/llamastack-ollama/ollama-run.yaml --port 5050
```

Note:
1. Every time you run a new model with `ollama run`, you will need to restart the Llama Stack server; otherwise it won't see the new model (see the sketch below).

The server will start and listen on `http://localhost:5050`.
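When you do switch models, the restart amounts to re-running commands already shown in this guide; a minimal sketch (the path and port are the same placeholders as above):

```bash
# Pull and run the new model in Ollama
ollama run llama3.2:3b-instruct-fp16

# Stop the running `llama stack run` process (Ctrl+C in its terminal),
# then start it again so it picks up the new model
llama stack run /path/to/your/distro/llamastack-ollama/ollama-run.yaml --port 5050
```
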
@@ -97,7 +107,7 @@ After setting up the server, open a new terminal window and verify it's working
curl http://localhost:5050/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "Llama3.2-3B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2-sentence poem about the moon"}
@@ -106,6 +116,8 @@ curl http://localhost:5050/inference/chat_completion \
}'
```

You can check the available models with the command `llama-stack-client models list`.
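The `llama-stack-client` CLI is a separate install from the server. A minimal sketch of pointing it at the server started above, assuming the pip package name `llama-stack-client`:

```bash
# Install the client CLI (assumed package name)
pip install llama-stack-client

# Point the client at the local server, then list the models it serves
llama-stack-client configure --endpoint http://localhost:5050
llama-stack-client models list
```
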
**Expected Output:**
```json
{