llama-stack/llama_stack/providers/inline
Botao Chen 52a21ce78f
Free up memory after post training finishes (#770)
## Context

Currently, GPU memory remains occupied after training finishes. This PR explicitly deletes the references to the training objects and frees the memory once training completes.

## Test

Before the change, more than 6 GB of GPU memory remained occupied after training a Llama 3.2 3B model.

After the change, GPU memory drops to ~1 GB after training the same model.

<img width="156" alt="Screenshot 2025-01-14 at 6 05 17 PM"
src="https://github.com/user-attachments/assets/45d212b1-a651-49f3-aad9-1c0a27fcebcf"
/>
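The cleanup pattern described above can be sketched as follows. This is an illustrative sketch, not the PR's exact code: the function name `evacuate_gpu_memory` and the usage names `model` and `optimizer` are assumptions, and the `torch` import is guarded so the sketch runs even on hosts without CUDA.

```python
import gc


def evacuate_gpu_memory() -> None:
    """Force a garbage-collection pass, then release cached CUDA blocks
    back to the driver if torch is available.

    Illustrative sketch of the cleanup pattern: simply dropping Python
    references is not enough, because PyTorch keeps freed blocks in its
    caching allocator until empty_cache() is called.
    """
    gc.collect()
    try:
        import torch  # optional dependency: only present on training hosts

        if torch.cuda.is_available():
            torch.cuda.empty_cache()
    except ImportError:
        pass


# Usage (hypothetical): after training, drop every reference to the
# large objects first, then reclaim the memory.
#
#   del model, optimizer
#   evacuate_gpu_memory()
```

Note the ordering matters: the references must be deleted before `gc.collect()` runs, otherwise the tensors are still reachable and their GPU allocations cannot be reclaimed.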
2025-01-14 19:19:38 -08:00
| Directory | Last commit | Date |
| --- | --- | --- |
| agents | Update spec | 2025-01-13 23:16:53 -08:00 |
| datasetio | Add persistence for localfs datasets (#557) | 2025-01-09 17:34:18 -08:00 |
| eval | move DataSchemaValidatorMixin into standalone utils (#720) | 2025-01-06 13:25:09 -08:00 |
| inference | Update spec | 2025-01-13 23:16:53 -08:00 |
| ios/inference | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |
| memory | [remove import *] clean up import *'s (#689) | 2024-12-27 15:45:44 -08:00 |
| post_training | Free up memory after post training finishes (#770) | 2025-01-14 19:19:38 -08:00 |
| safety | Consolidating Safety tests from various places under client-sdk (#699) | 2025-01-13 17:46:24 -08:00 |
| scoring | Add X-LlamaStack-Client-Version, rename ProviderData -> Provider-Data (#735) | 2025-01-09 11:51:36 -08:00 |
| telemetry | Fix telemetry to work on reinstantiating new lib cli (#761) | 2025-01-14 11:31:50 -08:00 |
| tool_runtime | agents to use tools api (#673) | 2025-01-08 19:01:00 -08:00 |
| __init__.py | impls -> inline, adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00 |