llama-stack-mirror/llama_stack/distributions
Sébastien Han 7ee0ee7843
chore!: remove model mgmt from CLI for Hugging Face CLI (#3700)
This change removes the `llama model` and `llama download` subcommands
from the CLI, replacing them with recommendations to use the Hugging
Face CLI instead.

Rationale for this change:
- The model management functionality largely duplicated what the
Hugging Face CLI already provides, leading to unnecessary maintenance
overhead (the one possible exception being downloads directly from Meta)
- Maintaining our own implementation required fixing bugs and keeping up
with changes in model repositories and download mechanisms
- The Hugging Face CLI is more mature, widely adopted, and better
maintained
- This allows us to focus on the core Llama Stack functionality rather
than reimplementing model management tools

Changes made:
- Removed all model-related CLI commands and their implementations
- Updated documentation to recommend using `huggingface-cli` for model
downloads
- Removed Meta-specific download logic and statements
- Simplified the CLI to focus solely on stack management operations

Users should now use:
- `huggingface-cli download` for downloading models
- `huggingface-cli scan-cache` for listing downloaded models
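
As a rough illustration of the replacement workflow, the sketch below builds the two `huggingface-cli` invocations listed above from Python and runs them only if the CLI is on `PATH`. The helper names and the repo id `meta-llama/Llama-3.2-1B-Instruct` are illustrative assumptions, not part of this change; substitute the model you need.

```python
import shutil
import subprocess
from typing import List, Optional


def build_download_command(repo_id: str, local_dir: Optional[str] = None) -> List[str]:
    """Return the argv for `huggingface-cli download` without running it."""
    cmd = ["huggingface-cli", "download", repo_id]
    if local_dir is not None:
        # --local-dir places the files in a chosen directory instead of the cache.
        cmd += ["--local-dir", local_dir]
    return cmd


def build_scan_cache_command() -> List[str]:
    """Return the argv for listing models already in the local cache."""
    return ["huggingface-cli", "scan-cache"]


def run_if_available(cmd: List[str]) -> Optional[int]:
    """Run cmd only if its executable is on PATH; return the exit code, else None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Hypothetical repo id, used here only as an example.
    download = build_download_command("meta-llama/Llama-3.2-1B-Instruct")
    print(" ".join(download))
    print(" ".join(build_scan_cache_command()))
```

Gated models may additionally require authenticating with the Hugging Face CLI before the download succeeds.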

This is a breaking change as it removes previously available CLI
commands.

Signed-off-by: Sébastien Han <seb@redhat.com>
2025-10-09 16:50:33 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| ci-tests | fix(logging): disable console telemetry sink by default (#3623) | 2025-09-30 14:58:05 -07:00 |
| dell | chore!: remove --env from llama stack run (#3711) | 2025-10-07 20:58:15 -07:00 |
| meta-reference-gpu | chore!: remove model mgmt from CLI for Hugging Face CLI (#3700) | 2025-10-09 16:50:33 -07:00 |
| nvidia | chore!: remove --env from llama stack run (#3711) | 2025-10-07 20:58:15 -07:00 |
| open-benchmark | fix(logging): disable console telemetry sink by default (#3623) | 2025-09-30 14:58:05 -07:00 |
| postgres-demo | chore: update postgres_demo with new config (#3045) | 2025-08-06 07:48:40 -07:00 |
| starter | fix(logging): disable console telemetry sink by default (#3623) | 2025-09-30 14:58:05 -07:00 |
| starter-gpu | fix(logging): disable console telemetry sink by default (#3623) | 2025-09-30 14:58:05 -07:00 |
| watsonx | fix: Update watsonx.ai provider to use LiteLLM mixin and list all models (#3674) | 2025-10-08 07:29:43 -04:00 |
| __init__.py | chore: rename templates to distributions (#3035) | 2025-08-04 11:34:17 -07:00 |
| template.py | chore: rename templates to distributions (#3035) | 2025-08-04 11:34:17 -07:00 |