llama-stack-mirror/llama_stack/providers/inline
Ihar Hrachyshka 2250ab7274
fix: don't attempt to clean gpu memory up when device is cpu (#1191)
This is a follow up to:
https://github.com/meta-llama/llama-stack/pull/1140

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

# What does this PR do?
[Provide a short summary of what this PR does and why. Link to relevant
issues if applicable.]

Avoid unnecessary GPU memory clean attempt when the GPU is not used for
training.

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan

With CPU:

```
INFO 2025-02-26 16:43:56,267 torchtune.utils._logging:121: Model checkpoint of size 6.43 GB saved to /Users/ihrachys/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/consolidated.00.pth
INFO 2025-02-26 16:43:56,274 torchtune.utils._logging:132: Adapter checkpoint of size 0.00 GB saved to /Users/ihrachys/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/adapter/adapter.pth
model_file_path /Users/ihrachys/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0
```

With CUDA:

```
INFO 2025-02-26 21:39:24,314 torchtune.utils._logging:121: Model checkpoint of size 6.43 GB saved to /home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/consolidated.00.pth
INFO 2025-02-26 21:39:24,333 torchtune.utils._logging:132: Adapter checkpoint of size 0.00 GB saved to /home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0/adapter/adapter.pth
model_file_path /home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0
```

[//]: # (## Documentation)

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-02-26 15:12:11 -08:00
..
agents feat: allow specifying specific tool within toolgroup (#1239) 2025-02-26 14:07:05 -08:00
datasetio build: format codebase imports using ruff linter (#1028) 2025-02-13 10:06:21 -08:00
eval chore!: deprecate eval/tasks (#1186) 2025-02-20 14:06:21 -08:00
inference fix: resolve type hint issues and import dependencies (#1176) 2025-02-25 11:06:47 -08:00
ios/inference chore: removed executorch submodule (#1265) 2025-02-25 21:57:21 -08:00
post_training fix: don't attempt to clean gpu memory up when device is cpu (#1191) 2025-02-26 15:12:11 -08:00
safety chore: move all Llama Stack types from llama-models to llama-stack (#1098) 2025-02-14 09:10:59 -08:00
scoring feat: add aggregation_functions to llm_as_judge_405b_simpleqa (#1164) 2025-02-19 19:42:04 -08:00
telemetry fix: sqlite conn (#1282) 2025-02-26 14:44:31 -08:00
tool_runtime feat: remove special handling of builtin::rag tool (#1015) 2025-02-26 13:04:52 -08:00
vector_io Fix sqlite_vec config defaults 2025-02-20 17:50:33 -08:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00