forked from phoenix-oss/llama-stack-mirror
fix: update getting started guide to use ollama pull (#1855)
# What does this PR do?
Download the getting-started-with-Ollama model instead of downloading and running it. Directly running it was only necessary before https://github.com/meta-llama/llama-stack/pull/1854.

## Test Plan
Run the code on the page.
This commit is contained in:
parent
3a9be58523
commit
a2cf299906
1 changed file with 3 additions and 3 deletions
@@ -6,13 +6,13 @@ Llama Stack is a stateful service with REST APIs to support seamless transition
 In this guide, we'll walk through how to build a RAG agent locally using Llama Stack with [Ollama](https://ollama.com/) to run inference on a Llama Model.
 
-### 1. Start Ollama
+### 1. Download a Llama model with Ollama
 
 ```bash
-ollama run llama3.2:3b --keepalive 60m
+ollama pull llama3.2:3b-instruct-fp16
 ```
 
-By default, Ollama keeps the model loaded in memory for 5 minutes, which can be too short. We set the `--keepalive` flag to 60 minutes to ensure the model remains loaded for some time.
+This will instruct the Ollama service to download the Llama 3.2 3B Instruct model, which we'll use in the rest of this guide.
 
 ```{admonition} Note
 :class: tip
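For context, the before/after of this change can be sketched as the following commands, assuming a local Ollama install (the model tags come from the diff above; `ollama list` is an optional check, not part of the guide):

```shell
# Before (old guide step): download AND run the model interactively,
# keeping it loaded in memory for 60 minutes — required before
# https://github.com/meta-llama/llama-stack/pull/1854
#   ollama run llama3.2:3b --keepalive 60m

# After (new guide step): only download the model; Ollama loads it
# on demand when Llama Stack issues the first inference request
ollama pull llama3.2:3b-instruct-fp16

# Optional: confirm the model now appears in the local model store
ollama list
```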