llama-stack

forked from phoenix-oss/llama-stack-mirror

Botao Chen 52a21ce78f Free up memory after post training finishes (#770)	2025-01-14 19:19:38 -08:00
__init__.py	Add init files to post training folders (#711)	2025-01-13 20:19:18 -08:00
lora_finetuning_single_device.py	Free up memory after post training finishes (#770)	2025-01-14 19:19:38 -08:00

Free up memory after post training finishes (#770)

## context
Currently, GPU memory remains occupied after training finishes. In this
PR, we explicitly delete the reference and clean up the memory once
training finishes.
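
The cleanup pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FakeModel`, `FakeRecipe`, and `release_training_memory` are hypothetical names, and the objects are small CPU stand-ins. The assumption is that the real recipe holds GPU-backed objects, in which case dropping the references, running the garbage collector, and calling `torch.cuda.empty_cache()` lets PyTorch return its cached blocks to the driver.

```python
import gc
import weakref

try:
    import torch  # optional here; only needed for the CUDA cache step
except ImportError:
    torch = None


class FakeModel:
    """Hypothetical stand-in for a large model or optimizer state."""

    def __init__(self, n):
        self.weights = [0.0] * n


class FakeRecipe:
    """Hypothetical stand-in for a training recipe that owns large objects."""

    def __init__(self):
        self.model = FakeModel(1000)
        self.optimizer_state = FakeModel(1000)

    def cleanup(self):
        # Drop the recipe's references so the objects become collectable.
        del self.model
        del self.optimizer_state


def release_training_memory(recipe):
    """Delete references, collect garbage, then clear the CUDA cache if present."""
    recipe.cleanup()
    gc.collect()
    if torch is not None and torch.cuda.is_available():
        # Returns cached, unused blocks held by PyTorch's allocator.
        torch.cuda.empty_cache()


recipe = FakeRecipe()
model_ref = weakref.ref(recipe.model)  # lets us observe when the object is freed
release_training_memory(recipe)
model_freed = model_ref() is None
```

Note that `torch.cuda.empty_cache()` only releases memory that no live tensor still uses, so the references have to be dropped first for the cache step to have an effect.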

## test
Before the change, after training a llama 3.2 3B model, more than 6 GB of
GPU memory is still occupied.

After the change, after training a llama 3.2 3B model, GPU memory usage
drops to ~1 GB.

<img width="156" alt="Screenshot 2025-01-14 at 6 05 17 PM"
src="https://github.com/user-attachments/assets/45d212b1-a651-49f3-aad9-1c0a27fcebcf"
/>
