Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 18:00:36 +00:00)
## context

Currently, GPU memory remains occupied after training finishes. This PR explicitly deletes the references and cleans up the memory once training completes.

## test

Before the change: after training a Llama 3.2 3B model, >6GB of GPU memory is still occupied.

After the change: after training a Llama 3.2 3B model, GPU memory drops to ~1GB.

<img width="156" alt="Screenshot 2025-01-14 at 6 05 17 PM" src="https://github.com/user-attachments/assets/45d212b1-a651-49f3-aad9-1c0a27fcebcf" />
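As a rough illustration of the cleanup described above (a sketch only, not the code in this PR; the helper name `free_training_memory` and the assumption of a PyTorch model/optimizer are mine), the idea is to drop the Python references to the training objects and then ask PyTorch to release its cached CUDA blocks:

```python
# Sketch of freeing GPU memory after training (illustrative, not the PR's exact code).
import gc

import torch


def free_training_memory(model: torch.nn.Module, optimizer: torch.optim.Optimizer) -> None:
    """Hypothetical helper: release GPU memory once training has finished."""
    # Move parameters off the GPU so no CUDA tensors remain referenced here.
    model.to("cpu")

    # Drop these references so the tensors become garbage-collectable
    # (callers must also release any references they still hold).
    del model
    del optimizer
    gc.collect()

    # Without empty_cache(), PyTorch's caching allocator keeps the freed blocks
    # reserved, so tools like nvidia-smi still report the memory as occupied.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```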