Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-06-28 02:53:30 +00:00
- **refactor: simplify job status extraction a bit**
- **torchtune: save job status on schedule**
- **refactor: get rid of job_list in torchtune job management code**

# What does this PR do?

A failed job is now registered in API, and one can consult its status.

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan

```
$ llama-stack-client post_training status --job-uuid test-jobe244b5b0-5053-4892-a4d9-d8fc8b116e73
JobStatusResponse(checkpoints=[], job_uuid='test-jobe244b5b0-5053-4892-a4d9-d8fc8b116e73', status='failed', completed_at=None, resources_allocated=None, scheduled_at=datetime.datetime(2025, 2, 18, 9, 4, 34, 3252), started_at=datetime.datetime(2025, 2, 18, 9, 4, 34, 10688))
```

[//]: # (## Documentation)

---------

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

| | | |
---|---|---|
.. | ||
agents | ||
datasetio | ||
eval | ||
inference | ||
ios/inference | ||
post_training | ||
safety | ||
scoring | ||
telemetry | ||
tool_runtime | ||
vector_io | ||
__init__.py |