mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-10 21:34:36 +00:00
fix: update dangling references to llama download command (#3763)
Some checks failed
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 0s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4s
Test Llama Stack Build / generate-matrix (push) Successful in 3s
Test Llama Stack Build / build-single-provider (push) Failing after 3s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 3s
Python Package Build Test / build (3.13) (push) Failing after 1s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 3s
Vector IO Integration Tests / test-matrix (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 3s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Test Llama Stack Build / build (push) Failing after 3s
API Conformance Tests / check-schema-compatibility (push) Successful in 10s
Unit Tests / unit-tests (3.13) (push) Failing after 3s
Unit Tests / unit-tests (3.12) (push) Failing after 5s
UI Tests / ui-tests (22) (push) Successful in 40s
Pre-commit / pre-commit (push) Successful in 2m14s
## Summary

After removing model management CLI in #3700, this PR updates remaining references to the old `llama download` command to use `huggingface-cli download` instead.

## Changes

- Updated error messages in `meta_reference/common.py` to recommend `huggingface-cli download`
- Updated error messages in `torchtune/recipes/lora_finetuning_single_device.py` to use `huggingface-cli download`
- Updated post-training notebook to use `huggingface-cli download` instead of `llama download`
- Fixed typo: "you model" -> "your model"

## Test Plan

- Verified error messages provide correct guidance for users
- Checked that notebook instructions are up-to-date with current tooling
parent 8fe4a216b5
commit ebae0385bb
3 changed files with 6369 additions and 6410 deletions
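
The recipe change in this commit builds its download hint from `model.huggingface_repo`, falling back to `meta-llama/<descriptor>` when no repository is set. A minimal sketch of that pattern follows. It assumes a hypothetical `ModelInfo` dataclass standing in for the real model object and a hypothetical `download_hint` helper; neither name exists in the repository.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelInfo:
    """Hypothetical stand-in for the model object used by the recipe."""

    name: str
    huggingface_repo: Optional[str] = None

    def descriptor(self) -> str:
        # The real descriptor may be derived differently; this sketch
        # simply returns the name.
        return self.name


def download_hint(model: ModelInfo, org: str = "meta-llama") -> str:
    """Build a `huggingface-cli download` suggestion for a missing checkpoint."""
    # Prefer the explicit repository; otherwise guess one from the descriptor.
    hf_repo = model.huggingface_repo or f"{org}/{model.descriptor()}"
    return (
        f"huggingface-cli download {hf_repo} "
        f"--local-dir ~/.llama/{model.descriptor()}"
    )


if __name__ == "__main__":
    explicit = ModelInfo("Llama3.2-3B-Instruct", "meta-llama/Llama-3.2-3B-Instruct")
    fallback = ModelInfo("Llama3.2-3B-Instruct")
    print(download_hint(explicit))
    print(download_hint(fallback))
```

When the descriptor and the Hugging Face repository name differ, as they do for `Llama3.2-3B-Instruct` versus `Llama-3.2-3B-Instruct`, the fallback branch may suggest a repository that does not exist, so an explicit `huggingface_repo` is the safer input.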
@@ -4236,24 +4236,7 @@
 "metadata": {
 "id": "RWa220T5sjbR"
 },
-"source": [
-"# 2. Start Post Training\n",
-"Currenty, Llama stack post training APIs support [Supervised Fine-tune](https://cameronrwolfe.substack.com/p/understanding-and-using-supervised) which is a straightfoard and effective way to boost model performance on specific tasks.\n",
-"\n",
-"We start from [LoRA finetune algorithm](https://pytorch.org/torchtune/main/tutorials/lora_finetune.html#what-is-lora) that can significantly reduce finetune GPU memory usage as well as needs less data\n",
-"\n",
-"\n",
-"#### 2.0. Download the base model\n",
-"Download the Llama model that will be used with [the downloading model CLI](https://llama-stack.readthedocs.io/en/latest/references/llama_cli_reference/download_models.html).\n",
-"\n",
-"Since ollama takes huggingface safetensor format checkpoint, we need to output the finetuned checkpoint in hugging face format. We download the model checkpoint from huggingface source.\n",
-"\n",
-"> You need to get a huggingface token from [here](https://huggingface.co/) and replace the \"HF_TOKEN\"\n",
-"\n",
-"\n",
-"\n",
-"\n"
-]
+"source": "# 2. Start Post Training\nCurrently, Llama stack post training APIs support [Supervised Fine-tune](https://cameronrwolfe.substack.com/p/understanding-and-using-supervised) which is a straightforward and effective way to boost model performance on specific tasks.\n\nWe start from [LoRA finetune algorithm](https://pytorch.org/torchtune/main/tutorials/lora_finetune.html#what-is-lora) that can significantly reduce finetune GPU memory usage as well as needs less data\n\n\n#### 2.0. Download the base model\nDownload the Llama model using the [Hugging Face CLI](https://huggingface.co/docs/huggingface_hub/guides/cli).\n\nSince ollama takes huggingface safetensor format checkpoint, we need to output the finetuned checkpoint in hugging face format. We download the model checkpoint from huggingface source.\n\n> You need to authenticate with Hugging Face by getting your token from [here](https://huggingface.co/settings/tokens) and running `huggingface-cli login`"
 },
 {
 "cell_type": "code",
@@ -4266,33 +4249,8 @@
 "id": "yF50MtwcsogU",
 "outputId": "92ba3b3a-63a0-4ab8-c8cd-5437365128fc"
 },
-"outputs": [
-{
-"name": "stdout",
-"output_type": "stream",
-"text": [
-".gitattributes: 100% 1.52k/1.52k [00:00<00:00, 12.1MB/s]\n",
-"LICENSE.txt: 100% 7.71k/7.71k [00:00<00:00, 33.3MB/s]\n",
-"README.md: 100% 41.7k/41.7k [00:00<00:00, 56.9MB/s]\n",
-"USE_POLICY.md: 100% 6.02k/6.02k [00:00<00:00, 32.4MB/s]\n",
-"config.json: 100% 878/878 [00:00<00:00, 6.94MB/s]\n",
-"generation_config.json: 100% 189/189 [00:00<00:00, 1.71MB/s]\n",
-"model.safetensors.index.json: 100% 20.9k/20.9k [00:00<00:00, 87.0MB/s]\n",
-"consolidated.00.pth: 100% 6.43G/6.43G [00:18<00:00, 353MB/s]\n",
-"original%2Forig_params.json: 100% 220/220 [00:00<00:00, 1.69MB/s]\n",
-"original%2Fparams.json: 100% 220/220 [00:00<00:00, 1.64MB/s]\n",
-"tokenizer.model: 100% 2.18M/2.18M [00:00<00:00, 44.8MB/s]\n",
-"special_tokens_map.json: 100% 296/296 [00:00<00:00, 2.69MB/s]\n",
-"tokenizer.json: 100% 9.09M/9.09M [00:01<00:00, 8.57MB/s]\n",
-"tokenizer_config.json: 100% 54.5k/54.5k [00:00<00:00, 172MB/s]\n",
-"\n",
-"Successfully downloaded model to /root/.llama/checkpoints/Llama3.2-3B-Instruct\n"
-]
-}
-],
-"source": [
-"!llama download --source huggingface --model-id Llama3.2-3B-Instruct --hf-token \"HF_TOKEN\""
-]
+"outputs": [],
+"source": "!huggingface-cli download meta-llama/Llama-3.2-3B-Instruct --local-dir ~/.llama/Llama-3.2-3B-Instruct"
 },
 {
 "cell_type": "markdown",

@@ -18,7 +18,7 @@ def model_checkpoint_dir(model_id) -> str:
 
     assert checkpoint_dir.exists(), (
         f"Could not find checkpoints in: {model_local_dir(model_id)}. "
-        f"If you try to use the native llama model, Please download model using `llama download --model-id {model_id}`"
-        f"Otherwise, please save you model checkpoint under {model_local_dir(model_id)}"
+        f"If you try to use the native llama model, please download the model using `llama-model download --source meta --model-id {model_id}` (see https://github.com/meta-llama/llama-models). "
+        f"Otherwise, please save your model checkpoint under {model_local_dir(model_id)}"
     )
     return str(checkpoint_dir)

@@ -104,9 +104,10 @@ class LoraFinetuningSingleDevice:
         if not any(p.exists() for p in paths):
             checkpoint_dir = checkpoint_dir / "original"
 
+        hf_repo = model.huggingface_repo or f"meta-llama/{model.descriptor()}"
         assert checkpoint_dir.exists(), (
             f"Could not find checkpoints in: {model_local_dir(model.descriptor())}. "
-            f"Please download model using `llama download --model-id {model.descriptor()}`"
+            f"Please download the model using `huggingface-cli download {hf_repo} --local-dir ~/.llama/{model.descriptor()}`"
         )
         return str(checkpoint_dir)
 