Change post training run.yaml inference config (#710)

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 18:00:36 +00:00

## Context
Colab notebooks provide limited free access to a T4 GPU.

Making the post training template work end to end (e2e) on a Colab
notebook T4 is critical for early adoption of the stack's post training
APIs. However, we found that the existing LlamaModelParallelGenerator
(https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/inference/meta_reference/inference.py#L82)
in the meta-reference inference implementation isn't compatible with a
T4 machine.

In this PR, we disable create_distributed_process_group for the
inference API in the post training run.yaml config and set up the
distributed env variables in the notebook (see the screenshot) to make
meta-reference inference compatible with the free T4 machine.

<img width="493" alt="Screenshot 2025-01-02 at 3 48 08 PM"
src="https://github.com/user-attachments/assets/dd159f70-4cff-475c-b459-1fc6e2c720ba"
/>
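
The notebook-side setup is only visible in the screenshot, so the following is a minimal sketch of what setting the distributed env variables for a single-process, single-GPU session might look like. The variable names are the ones `torch.distributed` reads for env-based initialization; the values and the helper name `set_single_process_dist_env` are assumptions for illustration, not taken from this PR.

```python
import os
from typing import MutableMapping, Optional

# Assumed values for one process on one GPU (e.g. a Colab T4).
# These are illustrative and not copied from the PR's notebook.
SINGLE_PROCESS_DIST_ENV = {
    "MASTER_ADDR": "localhost",  # rendezvous host
    "MASTER_PORT": "29500",      # hypothetical free port
    "RANK": "0",                 # global rank of this process
    "WORLD_SIZE": "1",           # total number of processes
    "LOCAL_RANK": "0",           # rank of this process on this machine
}


def set_single_process_dist_env(
    env: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Fill in distributed env variables without overwriting existing ones.

    `env` defaults to os.environ; pass a plain dict to preview the result.
    """
    target = os.environ if env is None else env
    for key, value in SINGLE_PROCESS_DIST_ENV.items():
        target.setdefault(key, value)
    return target


if __name__ == "__main__":
    # Preview on a scratch dict instead of mutating the real environment.
    preview = set_single_process_dist_env({})
    for name in sorted(preview):
        print(f"{name}={preview[name]}")
```

In a notebook, calling `set_single_process_dist_env()` with no arguments before starting the stack would apply the values to the real environment. Using `setdefault` keeps any values the runtime has already exported.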

## Test
Tested with the WIP post training showcase Colab notebook:
https://colab.research.google.com/drive/1K4Q2wZq232_Bpy2ud4zL9aRxvCWAwyQs?usp=sharing

Botao Chen

2025-01-03 08:37:48 -08:00

• committed by

GitHub

parent e1f42eb5a5

commit f450a0fd32

No known key found for this signature in database

GPG key ID: B5690EEEBB952194

1 changed file with 1 addition and 0 deletions

									

llama_stack/templates/experimental-post-training/run.yaml
									
										
    @@ -19,6 +19,7 @@ providers:
         config:
           max_seq_len: 4096
           checkpoint_dir: null
    +      create_distributed_process_group: False
       eval:
       - provider_id: meta-reference
         provider_type: inline::meta-reference
