llama-stack-mirror/tests/unit
Ihar Hrachyshka 2433ef218d feat: implement async job scheduler for torchtune
Training jobs now run in a separate thread, and training requests
return a job ID before the job completes. This fixes API timeouts for
any job that takes longer than a minute.
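
The behaviour described above (run the job in its own thread, hand the caller a job ID right away) can be illustrated with a minimal sketch. This is not the scheduler from the patch: `MiniScheduler`, `JobStatus`, `schedule`, `status`, and `wait` are hypothetical names chosen for illustration, and the real code may be structured quite differently.

```python
import threading
import uuid
from enum import Enum
from typing import Callable, Dict, Optional


class JobStatus(Enum):
    SCHEDULED = "scheduled"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class MiniScheduler:
    """Hypothetical stand-in for illustration; not the patch's scheduler."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._status: Dict[str, JobStatus] = {}
        self._threads: Dict[str, threading.Thread] = {}

    def schedule(self, handler: Callable[[], object]) -> str:
        """Start handler in a background thread and return its job ID at once."""
        job_id = uuid.uuid4().hex
        thread = threading.Thread(
            target=self._run, args=(job_id, handler), daemon=True
        )
        with self._lock:
            self._status[job_id] = JobStatus.SCHEDULED
            self._threads[job_id] = thread
        thread.start()
        return job_id  # the caller is not blocked on the job itself

    def _run(self, job_id: str, handler: Callable[[], object]) -> None:
        self._set(job_id, JobStatus.RUNNING)
        try:
            handler()
        except Exception:
            self._set(job_id, JobStatus.FAILED)
        else:
            self._set(job_id, JobStatus.COMPLETED)

    def _set(self, job_id: str, status: JobStatus) -> None:
        with self._lock:
            self._status[job_id] = status

    def status(self, job_id: str) -> JobStatus:
        with self._lock:
            return self._status[job_id]

    def wait(self, job_id: str, timeout: Optional[float] = None) -> None:
        """Block until the job's thread exits (used here only for the demo)."""
        with self._lock:
            thread = self._threads[job_id]
        thread.join(timeout)
```

In this sketch an API handler would call `schedule(...)` and return the job ID in its response, and a client would poll `status(job_id)` later instead of holding the request open for the duration of training.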

Note: the scheduler code is meant to be spun out in the future into a
common provider service that can be reused for different APIs and
providers. It is also expected to back the /jobs API proposed here:

https://github.com/meta-llama/llama-stack/discussions/1238

This is why the scheduler has a somewhat generalized form, which is
expected to simplify its adoption elsewhere in the future.

Note: this patch doesn't attempt to implement missing APIs (e.g. job
cancellation or removal). That work is left to follow-up PRs.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-03-28 12:11:59 -04:00
Name       Last commit                                                                  Date
cli        refactor: tests/unittests -> tests/unit; tests/api -> tests/integration      2025-03-04 09:57:00 -08:00
models     fix: update default tool call system prompt (#1712)                          2025-03-19 22:49:24 -07:00
providers  feat: implement async job scheduler for torchtune                            2025-03-28 12:11:59 -04:00
rag        chore: Get sqlite_vec and vector_store unit tests passing (#1413)            2025-03-05 13:20:13 -05:00
registry   fix: handle registry errors gracefully (#1732)                               2025-03-20 15:24:07 -07:00
server     feat(server): add attribute based access control for resources (#1703)       2025-03-19 21:28:52 -07:00