## Step 1: Installation and Setup
### i. Install and Set Up Ollama for Inference
|
||||
|
||||
Install Ollama by following the instructions on the [Ollama website](https://ollama.com/download).
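If you are on Linux, the download page also provides a one-line install script; the commands below are a convenience sketch of that flow (on macOS and Windows, download the installer from the same page instead):

```bash
# Linux: run the install script from the Ollama download page
curl -fsSL https://ollama.com/install.sh | sh

# confirm the CLI is available
ollama --version
```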
Then download a Llama model with Ollama:
```bash
ollama pull llama3.2:3b-instruct-fp16
```
This will instruct the Ollama service to download the Llama 3.2 3B Instruct model, which we'll use in the rest of this guide.
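If you'd like to verify the download before moving on, `ollama list` shows the models available locally; the new tag should appear in its output:

```bash
# optional sanity check: llama3.2:3b-instruct-fp16 should be listed
ollama list
```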
Then, to start Ollama, run:
```bash
ollama run llama3.2:3b-instruct-fp16 --keepalive 60m
```
By default, Ollama keeps the model loaded in memory for only 5 minutes, which can be too short. We set the `--keepalive` flag to 60 minutes to ensure the model remains loaded for some time.
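If you want to confirm the model is resident (and see how long it will stay loaded), you can open a second terminal and check with `ollama ps`; this is an optional sanity check rather than part of the setup:

```bash
# optional: list models currently loaded in memory and when they will be unloaded
ollama ps
```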
### ii. Install `uv` to Manage Your Python Packages
Install [uv](https://docs.astral.sh/uv/) to set up your virtual environment.
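As a minimal sketch (assuming a macOS/Linux shell), you can install `uv` with its standalone installer, or with `pip install uv`, and then create and activate a virtual environment for the rest of the guide:

```bash
# install uv (pip install uv also works)
curl -LsSf https://astral.sh/uv/install.sh | sh

# create a virtual environment in .venv and activate it
uv venv
source .venv/bin/activate
```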