diff --git a/docs/source/getting_started/index.md b/docs/source/getting_started/index.md
index 13cfd4d2f..0135c8454 100644
--- a/docs/source/getting_started/index.md
+++ b/docs/source/getting_started/index.md
@@ -8,13 +8,13 @@ In Llama Stack, we provide a server exposing multiple APIs. These APIs are backe
 
 Ollama is an LLM runtime that allows you to run Llama models locally.
 
-### 1. Start Ollama
+### 1. Download a Llama model with Ollama
 
 ```bash
-ollama run llama3.2:3b-instruct-fp16 --keepalive 60m
+ollama pull llama3.2:3b-instruct-fp16
 ```
 
-By default, Ollama keeps the model loaded in memory for 5 minutes which can be too short. We set the `--keepalive` flag to 60 minutes to ensure the model remains loaded for sometime.
+This will instruct the Ollama service to download the Llama 3.2 3B Instruct model, which we'll use in the rest of this guide.
 
 ```{admonition} Note
 :class: tip