From c2195a0b5cb13943add1112d8457b68dd3718c1d Mon Sep 17 00:00:00 2001
From: Xi Yan
Date: Wed, 30 Oct 2024 10:55:02 -0700
Subject: [PATCH] tab

---
 docs/source/getting_started/index.md | 56 ++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/docs/source/getting_started/index.md b/docs/source/getting_started/index.md
index bab57014a..6d6e953e8 100644
--- a/docs/source/getting_started/index.md
+++ b/docs/source/getting_started/index.md
@@ -120,6 +120,7 @@ docker compose down
 ::::
 
 **Via Conda**
+
 ::::{tab-set}
 
 :::{tab-item} meta-reference-gpu
@@ -150,7 +151,62 @@ llama stack run ./gpu/run.yaml
 
 ##### 1.2 (Optional) Serving Model
 
+::::{tab-set}
+:::{tab-item} meta-reference-gpu
+You may change the `config.model` in `run.yaml` to update the model being served by the distribution. Make sure you have the model checkpoint downloaded to `~/.llama`.
+```
+inference:
+  - provider_id: meta0
+    provider_type: meta-reference
+    config:
+      model: Llama3.2-11B-Vision-Instruct
+      quantization: null
+      torch_seed: null
+      max_seq_len: 4096
+      max_batch_size: 1
+```
+
+Run `llama model list` to see the models available for download, and `llama model download` to download the checkpoints.
+:::
+
+:::{tab-item} ollama
+You can use ollama to manage model downloads.
+
+```
+ollama pull llama3.1:8b-instruct-fp16
+ollama pull llama3.1:70b-instruct-fp16
+```
+
+> [!NOTE]
+> Please check [OLLAMA_SUPPORTED_MODELS](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/adapters/inference/ollama/ollama.py) for the list of supported Ollama models.
+
+
+To serve a new model with `ollama`:
+```
+ollama run <model_name>
+```
+
+To make sure the model is being served correctly, run `ollama ps` to list the models ollama is currently serving:
+```
+$ ollama ps
+
+NAME                         ID              SIZE     PROCESSOR    UNTIL
+llama3.1:8b-instruct-fp16    4aacac419454    17 GB    100% GPU     4 minutes from now
+```
+
+To verify that the model served by ollama is correctly connected to the Llama Stack server:
+```
+$ llama-stack-client models list
++----------------------+----------------------+---------------+-----------------------------------------------+
+| identifier           | llama_model          | provider_id   | metadata                                      |
++======================+======================+===============+===============================================+
+| Llama3.1-8B-Instruct | Llama3.1-8B-Instruct | ollama0       | {'ollama_model': 'llama3.1:8b-instruct-fp16'} |
++----------------------+----------------------+---------------+-----------------------------------------------+
+```
+:::
+
+::::
 
 ## Step 2. Build Your Llama Stack App
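
For readers following the ollama tab, the same registration check can also be done programmatically. Below is a minimal sketch using the `llama_stack_client` Python package (the library behind the `llama-stack-client` CLI); the base URL and port are assumptions and should match however `llama stack run` was configured.

```python
# Minimal sketch: confirm the ollama-served model is registered with the
# Llama Stack server, mirroring the `llama-stack-client models list` check.
# Assumes `pip install llama-stack-client`; localhost:5000 is an assumption
# and should match the port your running distribution listens on.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# Each entry should include the model identifier, its provider_id
# (e.g. "ollama0"), and provider metadata such as the ollama model tag.
for model in client.models.list():
    print(model)
```

If the ollama-backed entry is missing, check that ollama is still serving the model (`ollama ps`) and that the distribution's `run.yaml` points at the ollama provider.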