Mirror of https://github.com/meta-llama/llama-stack.git
Synced 2025-07-08 14:54:35 +00:00
Currently, only the last saved model is reported as a checkpoint and associated with the job UUID. Since the HF Trainer handles checkpoint collection during training, all of the `checkpoint-*` folders need to be added as Checkpoint objects. Adjust the save strategy to be per-epoch to make this easier and to use less storage.

Signed-off-by: Charlie Doern <cdoern@redhat.com>

Name | ||
---|---|---|
agents | ||
datasetio | ||
eval | ||
files/localfs | ||
inference | ||
ios/inference | ||
post_training | ||
safety | ||
scoring | ||
telemetry | ||
tool_runtime | ||
vector_io | ||
__init__.py |
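
The commit description above talks about registering every `checkpoint-*` folder rather than only the final saved model. A minimal sketch of that idea follows, assuming the trainer writes folders named `checkpoint-<step>` into a single output directory. `CheckpointRecord` and `collect_checkpoints` are hypothetical names used for illustration only; they are not the project's actual Checkpoint type or code.

```python
import re
from dataclasses import dataclass
from pathlib import Path


# Hypothetical stand-in for a checkpoint record (assumption, not the real type).
@dataclass
class CheckpointRecord:
    identifier: str
    step: int
    path: str
    job_uuid: str


# Matches folder names such as "checkpoint-500".
_CHECKPOINT_DIR = re.compile(r"^checkpoint-(\d+)$")


def collect_checkpoints(output_dir: str, job_uuid: str) -> list[CheckpointRecord]:
    """Return one record per checkpoint-* folder, ordered by training step."""
    records = []
    for child in Path(output_dir).iterdir():
        match = _CHECKPOINT_DIR.match(child.name)
        # Skip plain files and unrelated folders (logs, merged model, etc.).
        if match is None or not child.is_dir():
            continue
        records.append(
            CheckpointRecord(
                identifier=f"{job_uuid}-{child.name}",  # assumed naming scheme
                step=int(match.group(1)),
                path=str(child),
                job_uuid=job_uuid,
            )
        )
    return sorted(records, key=lambda record: record.step)
```

With a per-epoch save strategy (in Hugging Face `TrainingArguments` this is `save_strategy="epoch"`), the output directory should hold roughly one such folder per epoch, so the returned list stays short and each entry maps to a completed epoch.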