refactor structure

Xi Yan 2024-10-29 14:04:41 -07:00
parent 9ddc28eca7
commit 42104361a3
13 changed files with 293 additions and 562 deletions

@@ -92,6 +92,19 @@ llama stack run ./gpu/run.yaml
### Model Serving
#### Downloading model via Ollama
You can use Ollama to manage model downloads.
```
ollama pull llama3.1:8b-instruct-fp16
ollama pull llama3.1:70b-instruct-fp16
```
> [!NOTE]
> See [OLLAMA_SUPPORTED_MODELS](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/adapters/inference/ollama/ollama.py) for the list of supported Ollama models.
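
Before serving, it can help to confirm that the pulls finished. The sketch below assumes that `ollama list` prints a header row followed by one row per local model with the model tag in the first column; `model_pulled` is a hypothetical helper name, not part of Ollama or llama-stack.

```shell
# Hypothetical helper (not part of Ollama): succeeds if the given tag
# appears in the first column of `ollama list`, skipping the header row.
model_pulled() {
  ollama list | awk 'NR > 1 { print $1 }' | grep -Fxq "$1"
}

# Check the two tags pulled above.
for tag in llama3.1:8b-instruct-fp16 llama3.1:70b-instruct-fp16; do
  if model_pulled "$tag"; then
    echo "$tag: available"
  else
    echo "$tag: missing, run 'ollama pull $tag'"
  fi
done
```

A tag reported as missing can be fetched again with `ollama pull`.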

To serve a new model with `ollama`, run:
```
ollama run <model_name>