Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 18:00:36 +00:00)
## context

Currently, GPU memory remains occupied after training finishes. This PR explicitly deletes the references and cleans up the memory once training completes.

## test

Before the change: after training a Llama 3.2 3B model, >6GB of GPU memory is still occupied.

After the change: after training a Llama 3.2 3B model, GPU memory drops to ~1GB.

<img width="156" alt="Screenshot 2025-01-14 at 6 05 17 PM" src="https://github.com/user-attachments/assets/45d212b1-a651-49f3-aad9-1c0a27fcebcf" />
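As a rough illustration of the cleanup described above (a sketch only, not the code in this PR; the helper name `free_training_memory` and the assumption of a PyTorch model/optimizer are mine), the idea is to drop the Python references to the training objects and then ask PyTorch to release its cached CUDA blocks:

```python
# Sketch of freeing GPU memory after training (illustrative, not the PR's exact code).
import gc

import torch


def free_training_memory(model: torch.nn.Module, optimizer: torch.optim.Optimizer) -> None:
    """Hypothetical helper: release GPU memory once training has finished."""
    # Move parameters off the GPU so no CUDA tensors remain referenced here.
    model.to("cpu")

    # Drop these references so the tensors become garbage-collectable
    # (callers must also release any references they still hold).
    del model
    del optimizer
    gc.collect()

    # Without empty_cache(), PyTorch's caching allocator keeps the freed blocks
    # reserved, so tools like nvidia-smi still report the memory as occupied.
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
```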