llama-stack-mirror/llama_stack/templates
Botao Chen f450a0fd32
Change post training run.yaml inference config (#710)
## Context
Colab notebooks provide limited free access to a T4 GPU.

Making the post training template work end to end with a Colab T4 notebook is
critical for early adoption of the stack's post training APIs. However, we
found that the existing LlamaModelParallelGenerator
(https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/inference/meta_reference/inference.py#L82)
in the meta-reference inference implementation isn't compatible with a T4
machine.

In this PR, we disable create_distributed_process_group for the
inference API in the post training run.yaml config and set up the
distributed environment variables in the notebook
<img width="493" alt="Screenshot 2025-01-02 at 3 48 08 PM"
src="https://github.com/user-attachments/assets/dd159f70-4cff-475c-b459-1fc6e2c720ba"
/>

so that meta-reference inference runs on the free T4 machine.
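
Illustrative only: the run.yaml fragment below sketches where the flag could sit. The surrounding provider keys and the model name are assumptions about the meta-reference provider config, not copied from the actual template; only `create_distributed_process_group` comes from this PR.

```yaml
providers:
  inference:
    - provider_id: meta-reference-inference   # assumed provider id
      provider_type: inline::meta-reference
      config:
        model: Llama3.2-3B-Instruct           # assumed model
        create_distributed_process_group: false
```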

## Test
Tested with the WIP post training showcase Colab notebook:
https://colab.research.google.com/drive/1K4Q2wZq232_Bpy2ud4zL9aRxvCWAwyQs?usp=sharing
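
With create_distributed_process_group disabled, the notebook still has to provide the usual torch.distributed rendezvous environment variables for a single-process run. A minimal sketch of that setup; the specific values here are assumptions for a single T4, not copied from the notebook or the screenshot above:

```python
import os

# Single-node, single-GPU rendezvous settings (assumed values).
# With only one process, rank 0 is both the master and the sole worker.
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "29500"
os.environ["RANK"] = "0"
os.environ["LOCAL_RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"
```

These must be set before the inference server initializes torch.distributed, which is why the notebook exports them up front.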
2025-01-03 08:37:48 -08:00
bedrock Dont include 3B / 1B models for bedrock since they arent ondemand 2024-12-18 06:30:02 -08:00
cerebras Update Cerebras from Llama 3.1 to 3.3 (#645) 2024-12-17 16:28:24 -08:00
experimental-post-training Change post training run.yaml inference config (#710) 2025-01-03 08:37:48 -08:00
fireworks Add Llama 70B 3.3 to fireworks (#654) 2024-12-19 17:32:49 -08:00
hf-endpoint add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
hf-serverless add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
meta-reference-gpu add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
meta-reference-quantized-gpu add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
ollama add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
remote-vllm add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
tgi add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
together add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
vllm-gpu add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00
__init__.py Auto-generate distro yamls + docs (#468) 2024-11-18 14:57:06 -08:00
template.py add embedding model by default to distribution templates (#617) 2024-12-13 12:48:00 -08:00