llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-07-08 14:54:35 +00:00

Ihar Hrachyshka 2250ab7274 fix: don't attempt to clean gpu memory up when device is cpu (#1191)	2025-02-26 15:12:11 -08:00

This is a follow up to: https://github.com/meta-llama/llama-stack/pull/1140

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

# What does this PR do?

Avoid unnecessary GPU memory clean attempt when the GPU is not used for training.

## Test Plan

With CPU:

```
INFO 2025-02-26 16:43:56,267 torchtune.utils._logging:121: Model checkpoint of size 6.43 GB saved to /Users/ihrachys/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/consolidated.00.pth
INFO 2025-02-26 16:43:56,274 torchtune.utils._logging:132: Adapter checkpoint of size 0.00 GB saved to /Users/ihrachys/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/adapter/adapter.pth
model_file_path /Users/ihrachys/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0
```

With CUDA:

```
INFO 2025-02-26 21:39:24,314 torchtune.utils._logging:121: Model checkpoint of size 6.43 GB saved to /home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/consolidated.00.pth
INFO 2025-02-26 21:39:24,333 torchtune.utils._logging:132: Adapter checkpoint of size 0.00 GB saved to /home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/adapter/adapter.pth
model_file_path /home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0
```
agents	feat: allow specifying specific tool within toolgroup (#1239)	2025-02-26 14:07:05 -08:00
datasetio	build: format codebase imports using ruff linter (#1028)	2025-02-13 10:06:21 -08:00
eval	chore!: deprecate eval/tasks (#1186)	2025-02-20 14:06:21 -08:00
inference	fix: resolve type hint issues and import dependencies (#1176)	2025-02-25 11:06:47 -08:00
ios/inference	chore: removed executorch submodule (#1265)	2025-02-25 21:57:21 -08:00
post_training	fix: don't attempt to clean gpu memory up when device is cpu (#1191)	2025-02-26 15:12:11 -08:00
safety	chore: move all Llama Stack types from llama-models to llama-stack (#1098)	2025-02-14 09:10:59 -08:00
scoring	feat: add aggregation_functions to llm_as_judge_405b_simpleqa (#1164)	2025-02-19 19:42:04 -08:00
telemetry	fix: sqlite conn (#1282)	2025-02-26 14:44:31 -08:00
tool_runtime	feat: remove special handling of builtin::rag tool (#1015)	2025-02-26 13:04:52 -08:00
vector_io	Fix sqlite_vec config defaults	2025-02-20 17:50:33 -08:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00