mirror of
				https://github.com/meta-llama/llama-stack.git
				synced 2025-10-24 16:57:21 +00:00 
			
		
		
		
	
		
		
			
				
	
	
	
	
		
		
	
	
	
	
	
	
	
| orphan | 
|---|
| true | 
TorchTune
TorchTune is an inline post-training provider for Llama Stack. It provides a simple and efficient way to fine-tune language models using PyTorch.
Features
- Simple access through the post_training API
- Fully integrated with Llama Stack
- GPU support and single-device capabilities
- Support for LoRA
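To give a sense of why LoRA keeps fine-tuning cheap, the sketch below counts the trainable parameters a LoRA adapter adds to one linear layer. It is plain arithmetic, not TorchTune code: the helper names `lora_param_count` and `lora_scale` and the 4096 x 4096 layer shape are illustrative assumptions.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by a LoRA adapter on a d_out x d_in weight.

    LoRA learns two low-rank factors, A (rank x d_in) and B (d_out x rank),
    while the original weight stays frozen.
    """
    return rank * (d_in + d_out)


def lora_scale(alpha: float, rank: int) -> float:
    """Scaling applied to the low-rank update in the standard LoRA formulation."""
    return alpha / rank


# Hypothetical 4096 x 4096 projection layer (shape chosen for illustration only)
full = 4096 * 4096
added = lora_param_count(4096, 4096, rank=8)
print(f"full layer: {full} params, LoRA adapter: {added} params")
print(f"scale for alpha=16, rank=8: {lora_scale(16, 8)}")
```

The adapter size grows linearly with the rank, so small ranks add only a small fraction of the parameters of the frozen layer.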
Usage
To use TorchTune in your Llama Stack project, follow these steps:
- Configure your Llama Stack project to use this provider.
- Kick off a fine-tuning job using the Llama Stack post_training API.
Setup
You can access the TorchTune trainer by writing your own YAML file that points to the provider:
post_training:
  - provider_id: torchtune
    provider_type: inline::torchtune
    config: {}
You can then build and run your own stack with this provider.
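Before starting the server, it can help to confirm that the provider entry is actually present in the parsed configuration. The sketch below works on a plain dictionary, as you would get from loading the YAML file. It assumes the `post_training` list sits under a top-level `providers` key, and the helper name `find_torchtune_providers` is hypothetical.

```python
def find_torchtune_providers(run_config: dict) -> list[str]:
    """Return the provider_id of every inline::torchtune post_training entry."""
    # Assumption: providers are grouped by API under a top-level "providers" key.
    providers = run_config.get("providers") or {}
    entries = providers.get("post_training") or []
    return [
        entry["provider_id"]
        for entry in entries
        if entry.get("provider_type") == "inline::torchtune"
    ]


# Dictionary equivalent of the YAML fragment above, nested under "providers"
config = {
    "providers": {
        "post_training": [
            {
                "provider_id": "torchtune",
                "provider_type": "inline::torchtune",
                "config": {},
            },
        ]
    }
}
print(find_torchtune_providers(config))
```

An empty result means the stack would start without the TorchTune trainer, which is easier to catch here than after submitting a job.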
Run Training
You can access the provider and the supervised_fine_tune method via the post_training API:
import time
import uuid

from llama_stack_client import LlamaStackClient
from llama_stack_client.types import (
    post_training_supervised_fine_tune_params,
    algorithm_config_param,
)

# Connect to a running Llama Stack server (adjust base_url to your deployment)
client = LlamaStackClient(base_url="http://localhost:8321")
# Example Dataset
client.datasets.register(
    purpose="post-training/messages",
    source={
        "type": "uri",
        "uri": "huggingface://datasets/llamastack/simpleqa?split=train",
    },
    dataset_id="simpleqa",
)
training_config = post_training_supervised_fine_tune_params.TrainingConfig(
    data_config=post_training_supervised_fine_tune_params.TrainingConfigDataConfig(
        batch_size=32,
        data_format="instruct",
        dataset_id="simpleqa",
        shuffle=True,
    ),
    gradient_accumulation_steps=1,
    max_steps_per_epoch=0,
    max_validation_steps=1,
    n_epochs=4,
)
algorithm_config = algorithm_config_param.LoraFinetuningConfig(
    alpha=1,
    apply_lora_to_mlp=True,
    apply_lora_to_output=False,
    lora_attn_modules=["q_proj"],
    rank=1,
    type="LoRA",
)
job_uuid = f"test-job{uuid.uuid4()}"
# Example Model
training_model = "meta-llama/Llama-2-7b-hf"
start_time = time.time()
response = client.post_training.supervised_fine_tune(
    job_uuid=job_uuid,
    logger_config={},
    model=training_model,
    hyperparam_search_config={},
    training_config=training_config,
    algorithm_config=algorithm_config,
    checkpoint_dir="output",
)
print("Job:", job_uuid)
# Poll until the job reaches a terminal state
while True:
    status = client.post_training.job.status(job_uuid=job_uuid)
    if not status:
        print("Job not found")
        break
    print(status)
    # Stop on any terminal state, not only success, so a failed job does not loop forever
    if status.status in ("completed", "failed", "cancelled"):
        break
    print("Waiting for job to complete...")
    time.sleep(5)
end_time = time.time()
print("Job finished in", end_time - start_time, "seconds!")
print("Artifacts:")
print(client.post_training.job.artifacts(job_uuid=job_uuid))
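The polling loop above has no upper bound on how long it waits. A minimal sketch of a bounded alternative follows. It takes any zero-argument callable that returns the current status string, so it does not depend on a particular client API. The helper name `wait_for_job`, the set of terminal states, and the default timeout are assumptions for illustration, not part of the Llama Stack client.

```python
import time
from typing import Callable, Optional


def wait_for_job(
    get_status: Callable[[], Optional[str]],
    timeout_s: float = 3600.0,
    poll_interval_s: float = 5.0,
    terminal_states: tuple = ("completed", "failed", "cancelled"),
) -> Optional[str]:
    """Poll get_status() until it returns a terminal state or the timeout expires.

    Returns the final status string, or None if the job was not found.
    Raises TimeoutError if no terminal state is reached in time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status is None or status in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} seconds")
        time.sleep(poll_interval_s)
```

With the client from the example, `get_status` could wrap `client.post_training.job.status(job_uuid=job_uuid)` and return its `status` field, or `None` when no job is found.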