mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-11 13:44:38 +00:00
chore!: remove model mgmt from CLI for Hugging Face CLI (#3700)
This change removes the `llama model` and `llama download` subcommands from the CLI, replacing them with recommendations to use the Hugging Face CLI instead. Rationale for this change: - The model management functionality was largely duplicating what Hugging Face CLI already provides, leading to unnecessary maintenance overhead (except the download source from Meta?) - Maintaining our own implementation required fixing bugs and keeping up with changes in model repositories and download mechanisms - The Hugging Face CLI is more mature, widely adopted, and better maintained - This allows us to focus on the core Llama Stack functionality rather than reimplementing model management tools Changes made: - Removed all model-related CLI commands and their implementations - Updated documentation to recommend using `huggingface-cli` for model downloads - Removed Meta-specific download logic and statements - Simplified the CLI to focus solely on stack management operations Users should now use: - `huggingface-cli download` for downloading models - `huggingface-cli scan-cache` for listing downloaded models This is a breaking change as it removes previously available CLI commands. Signed-off-by: Sébastien Han <seb@redhat.com>
This commit is contained in:
parent
841d0c3583
commit
7ee0ee7843
21 changed files with 63 additions and 1612 deletions
|
@ -25,7 +25,7 @@ pip install -U llama_stack
|
|||
|
||||
MODEL="Llama-4-Scout-17B-16E-Instruct"
|
||||
# get meta url from llama.com
|
||||
llama model download --source meta --model-id $MODEL --meta-url <META_URL>
|
||||
huggingface-cli download meta-llama/$MODEL --local-dir ~/.llama/$MODEL
|
||||
|
||||
# start a llama stack server
|
||||
INFERENCE_MODEL=meta-llama/$MODEL llama stack build --run --template meta-reference-gpu
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue