mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-11 05:38:38 +00:00
docs: provider and distro codegen migration (#3531)
# What does this PR do?

- Updates provider and distro codegen to handle the new format
- Migrates provider and distro files to the new format

## Test Plan

- Manual testing
This commit is contained in:
parent
45da31801c
commit
d23865757f
103 changed files with 1796 additions and 423 deletions
17
docs/docs/providers/post_training/index.mdx
Normal file
@ -0,0 +1,17 @@
---
sidebar_label: Post Training
title: Post_Training
---

# Post_Training

## Overview

This section contains documentation for all available providers for the **post_training** API.

## Providers

- [Huggingface-Gpu](./inline_huggingface-gpu)
- [Torchtune-Cpu](./inline_torchtune-cpu)
- [Torchtune-Gpu](./inline_torchtune-gpu)
- [Remote - Nvidia](./remote_nvidia)
40
docs/docs/providers/post_training/inline_huggingface-cpu.mdx
Normal file
@ -0,0 +1,40 @@
# inline::huggingface-cpu

## Description

HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `<class 'str'>` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
| `chat_template` | `<class 'str'>` | No | <|user|>
{input}
<|assistant|>
{output} | |
| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
| `save_total_limit` | `<class 'int'>` | No | 3 | |
| `logging_steps` | `<class 'int'>` | No | 10 | |
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
| `use_reference_model` | `<class 'bool'>` | No | True | |
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair'` | No | sigmoid | |
| `dpo_output_dir` | `<class 'str'>` | No | | |

## Sample Configuration

```yaml
checkpoint_format: huggingface
distributed_backend: null
device: cpu
dpo_output_dir: ~/.llama/dummy/dpo_output

```
42
docs/docs/providers/post_training/inline_huggingface-gpu.mdx
Normal file
@ -0,0 +1,42 @@
---
description: "HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem."
sidebar_label: Huggingface-Gpu
title: inline::huggingface-gpu
---

# inline::huggingface-gpu

## Description

HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `<class 'str'>` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
| `chat_template` | `<class 'str'>` | No | <|user|><br/>{input}<br/><|assistant|><br/>{output} | |
| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
| `save_total_limit` | `<class 'int'>` | No | 3 | |
| `logging_steps` | `<class 'int'>` | No | 10 | |
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
| `use_reference_model` | `<class 'bool'>` | No | True | |
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair'` | No | sigmoid | |
| `dpo_output_dir` | `<class 'str'>` | No | | |

## Sample Configuration

```yaml
checkpoint_format: huggingface
distributed_backend: null
device: cpu
dpo_output_dir: ~/.llama/dummy/dpo_output
```
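The default `chat_template` in the table above carries two placeholders, `{input}` and `{output}`. As a rough illustration of what one rendered training example could look like, the sketch below fills the template with Python's `str.format`. This is an assumption made for illustration only: the diff does not show how the provider substitutes the placeholders, and `render_example` is a hypothetical helper, not part of llama-stack.

```python
# Default chat_template from the configuration table, reading <br/> as a newline.
DEFAULT_CHAT_TEMPLATE = "<|user|>\n{input}\n<|assistant|>\n{output}"


def render_example(user_text: str, assistant_text: str,
                   template: str = DEFAULT_CHAT_TEMPLATE) -> str:
    """Fill the {input} and {output} placeholders (hypothetical helper)."""
    # Assumption: str.format-style substitution. The provider's actual
    # mechanism is not shown in this diff.
    return template.format(input=user_text, output=assistant_text)


if __name__ == "__main__":
    print(render_example("What is 2 + 2?", "4"))
```

A custom template passed via `template=` would need to keep both placeholder names for this sketch to work.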
40
docs/docs/providers/post_training/inline_huggingface.mdx
Normal file
@ -0,0 +1,40 @@
# inline::huggingface

## Description

HuggingFace-based post-training provider for fine-tuning models using the HuggingFace ecosystem.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `device` | `<class 'str'>` | No | cuda | |
| `distributed_backend` | `Literal['fsdp', 'deepspeed'` | No | | |
| `checkpoint_format` | `Literal['full_state', 'huggingface'` | No | huggingface | |
| `chat_template` | `<class 'str'>` | No | <|user|>
{input}
<|assistant|>
{output} | |
| `model_specific_config` | `<class 'dict'>` | No | {'trust_remote_code': True, 'attn_implementation': 'sdpa'} | |
| `max_seq_length` | `<class 'int'>` | No | 2048 | |
| `gradient_checkpointing` | `<class 'bool'>` | No | False | |
| `save_total_limit` | `<class 'int'>` | No | 3 | |
| `logging_steps` | `<class 'int'>` | No | 10 | |
| `warmup_ratio` | `<class 'float'>` | No | 0.1 | |
| `weight_decay` | `<class 'float'>` | No | 0.01 | |
| `dataloader_num_workers` | `<class 'int'>` | No | 4 | |
| `dataloader_pin_memory` | `<class 'bool'>` | No | True | |
| `dpo_beta` | `<class 'float'>` | No | 0.1 | |
| `use_reference_model` | `<class 'bool'>` | No | True | |
| `dpo_loss_type` | `Literal['sigmoid', 'hinge', 'ipo', 'kto_pair'` | No | sigmoid | |
| `dpo_output_dir` | `<class 'str'>` | No | | |

## Sample Configuration

```yaml
checkpoint_format: huggingface
distributed_backend: null
device: cpu
dpo_output_dir: ~/.llama/dummy/dpo_output

```
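The sample configuration sets only four keys, while the table lists defaults for many more, and the two disagree on `device` (`cpu` in the sample, `cuda` as the default). The sketch below shows how such a sample could combine with the documented defaults, assuming explicitly configured values take precedence over defaults. That precedence rule and the `effective_config` helper are illustrative assumptions, not taken from the provider's code.

```python
from typing import Any, Dict

# A subset of the defaults, copied from the configuration table.
TABLE_DEFAULTS: Dict[str, Any] = {
    "device": "cuda",
    "checkpoint_format": "huggingface",
    "max_seq_length": 2048,
    "gradient_checkpointing": False,
    "dpo_beta": 0.1,
    "dpo_loss_type": "sigmoid",
}

# Values from the sample configuration block.
SAMPLE_CONFIG: Dict[str, Any] = {
    "checkpoint_format": "huggingface",
    "distributed_backend": None,
    "device": "cpu",
    "dpo_output_dir": "~/.llama/dummy/dpo_output",
}


def effective_config(defaults: Dict[str, Any],
                     overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return defaults with explicit values layered on top (hypothetical helper)."""
    merged = dict(defaults)   # copy so the inputs are left untouched
    merged.update(overrides)  # assumption: explicit values win over defaults
    return merged


if __name__ == "__main__":
    for key, value in sorted(effective_config(TABLE_DEFAULTS, SAMPLE_CONFIG).items()):
        print(f"{key}: {value}")
```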
24
docs/docs/providers/post_training/inline_torchtune-cpu.mdx
Normal file
@ -0,0 +1,24 @@
---
description: "TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework."
sidebar_label: Torchtune-Cpu
title: inline::torchtune-cpu
---

# inline::torchtune-cpu

## Description

TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |

## Sample Configuration

```yaml
checkpoint_format: meta
```
24
docs/docs/providers/post_training/inline_torchtune-gpu.mdx
Normal file
@ -0,0 +1,24 @@
---
description: "TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework."
sidebar_label: Torchtune-Gpu
title: inline::torchtune-gpu
---

# inline::torchtune-gpu

## Description

TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |

## Sample Configuration

```yaml
checkpoint_format: meta
```
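The two TorchTune fields in the table can be read as a small typed configuration object. The sketch below is an illustrative stand-in under stated assumptions, not the provider's actual config class: the name `TorchtuneConfigSketch` is hypothetical, and the empty default for `torch_seed` is assumed to mean `None`.

```python
from dataclasses import dataclass
from typing import Literal, Optional, get_args

# The two values listed for checkpoint_format in the table.
CheckpointFormat = Literal["meta", "huggingface"]


@dataclass
class TorchtuneConfigSketch:
    """Illustrative stand-in for the documented fields (not the real class)."""

    torch_seed: Optional[int] = None   # assumption: empty default means None
    checkpoint_format: str = "meta"    # table default: meta

    def __post_init__(self) -> None:
        # Reject values outside the documented Literal choices.
        allowed = get_args(CheckpointFormat)
        if self.checkpoint_format not in allowed:
            raise ValueError(
                f"checkpoint_format must be one of {allowed}, "
                f"got {self.checkpoint_format!r}"
            )


if __name__ == "__main__":
    print(TorchtuneConfigSketch())
    print(TorchtuneConfigSketch(torch_seed=42, checkpoint_format="huggingface"))
```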
20
docs/docs/providers/post_training/inline_torchtune.md
Normal file
@ -0,0 +1,20 @@
# inline::torchtune

## Description

TorchTune-based post-training provider for fine-tuning and optimizing models using Meta's TorchTune framework.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `torch_seed` | `int \| None` | No | | |
| `checkpoint_format` | `Literal['meta', 'huggingface'` | No | meta | |

## Sample Configuration

```yaml
checkpoint_format: meta

```
32
docs/docs/providers/post_training/remote_nvidia.mdx
Normal file
@ -0,0 +1,32 @@
---
description: "NVIDIA's post-training provider for fine-tuning models on NVIDIA's platform."
sidebar_label: Remote - Nvidia
title: remote::nvidia
---

# remote::nvidia

## Description

NVIDIA's post-training provider for fine-tuning models on NVIDIA's platform.

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `api_key` | `str \| None` | No | | The NVIDIA API key. |
| `dataset_namespace` | `str \| None` | No | default | The NVIDIA dataset namespace. |
| `project_id` | `str \| None` | No | test-example-model@v1 | The NVIDIA project ID. |
| `customizer_url` | `str \| None` | No | | Base URL for the NeMo Customizer API |
| `timeout` | `<class 'int'>` | No | 300 | Timeout for the NVIDIA Post Training API |
| `max_retries` | `<class 'int'>` | No | 3 | Maximum number of retries for the NVIDIA Post Training API |
| `output_model_dir` | `<class 'str'>` | No | test-example-model@v1 | Directory to save the output model |

## Sample Configuration

```yaml
api_key: ${env.NVIDIA_API_KEY:=}
dataset_namespace: ${env.NVIDIA_DATASET_NAMESPACE:=default}
project_id: ${env.NVIDIA_PROJECT_ID:=test-project}
customizer_url: ${env.NVIDIA_CUSTOMIZER_URL:=http://nemo.test}
```
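The sample configuration above uses `${env.NAME:=default}` placeholders. The sketch below shows one way such placeholders could be resolved, assuming shell-like semantics: use the environment variable when it is set and non-empty, otherwise fall back to the text after `:=`. It is not llama-stack's implementation. `resolve_env_placeholders` is a hypothetical helper and handles only this one placeholder form.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${env.NAME:=default}; the default may be empty but cannot contain '}'.
_ENV_PATTERN = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):=([^}]*)\}")


def resolve_env_placeholders(text: str,
                             env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${env.NAME:=default} placeholders in text (hypothetical helper)."""
    source = os.environ if env is None else env

    def _substitute(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        value = source.get(name, "")
        # Assumption: an unset or empty variable falls back to the default.
        return value if value else default

    return _ENV_PATTERN.sub(_substitute, text)


if __name__ == "__main__":
    sample = "project_id: ${env.NVIDIA_PROJECT_ID:=test-project}"
    print(resolve_env_placeholders(sample, env={}))
    print(resolve_env_placeholders(sample, env={"NVIDIA_PROJECT_ID": "my-project"}))
```

Passing `env` explicitly keeps the function easy to test. Leaving it out reads from the process environment.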