mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-16 06:53:47 +00:00
refactor structure
This commit is contained in:
parent
9ddc28eca7
commit
42104361a3
13 changed files with 293 additions and 562 deletions
|
@ -92,6 +92,19 @@ llama stack run ./gpu/run.yaml
|
|||
|
||||
### Model Serving
|
||||
|
||||
#### Downloading model via Ollama
|
||||
|
||||
You can use ollama for managing model downloads.
|
||||
|
||||
```
|
||||
ollama pull llama3.1:8b-instruct-fp16
|
||||
ollama pull llama3.1:70b-instruct-fp16
|
||||
```
|
||||
|
||||
> [!NOTE]
|
||||
> Please check the [OLLAMA_SUPPORTED_MODELS](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/adapters/inference/ollama/ollama.py) for the supported Ollama models.
|
||||
|
||||
|
||||
To serve a new model with `ollama`
|
||||
```
|
||||
ollama run <model_name>
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue