Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-27 21:31:59 +00:00)
Currently this implementation hangs because `trainer.train()` blocks. Rewrite the implementation to kick off the model download, device instantiation, dataset processing, and training in a monitored subprocess. All of these steps must run in the subprocess; otherwise different devices are used, which causes torch errors.

Signed-off-by: Charlie Doern <cdoern@redhat.com>
| Name |
|---|
| common |
| huggingface |
| torchtune |
| __init__.py |
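
The commit message above describes moving every training step into a monitored subprocess so that a blocking call cannot hang the caller. The following is only a minimal sketch of that general pattern using the Python standard library, not the code from this commit: `CHILD_SCRIPT`, `run_monitored`, the step names, and the `fail`/`hang` arguments are all hypothetical stand-ins, and the real provider may use a different mechanism entirely.

```python
import subprocess
import sys

# Hypothetical stand-in for the real job. Every step runs inside the child
# interpreter, so any device it sets up is never shared with the parent.
CHILD_SCRIPT = """
import sys, time
for step in ("download_model", "setup_device", "process_dataset", "train"):
    print(step, flush=True)
mode = sys.argv[1] if len(sys.argv) > 1 else ""
if mode == "hang":
    time.sleep(60)  # simulates a training call that never returns in time
if mode == "fail":
    sys.exit(1)
"""


def run_monitored(args=(), timeout_s=30.0):
    """Run the job in a child process and wait for it with a bounded timeout."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SCRIPT, *args],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The child is stuck: kill it and collect whatever it printed so far.
        proc.kill()
        out, _ = proc.communicate()
        return {"status": "timed_out", "steps": out.split()}
    status = "completed" if proc.returncode == 0 else "failed"
    return {"status": status, "steps": out.split()}
```

Calling `run_monitored()` returns a small status dictionary; passing `args=("hang",)` with a short `timeout_s` shows the parent regaining control after killing a stuck child. Launching a fresh interpreter, rather than calling the training function in-process, is the design choice that keeps all device state confined to the child.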