llama-stack-mirror

phoenix-oss/llama-stack-mirror

Fork 1

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-27 23:31:59 +00:00

Commit graph

Author	SHA1	Message	Date
Sébastien Han	fb3c9be1fd	refactor(api): rename "files" API to "artifacts" The term "artifacts" better represents the purpose of this API, which handles outputs generated by API executions, eventually stored objects that can be of served by any storage interface (file, objects). This aligns better with the industry convention of 'artifacts' (build outputs, process results) rather than generic 'files'. 'files' would be appropriate if the goal was to store and retrieve files purely. Additionally, in our context, artifact is a better term since it will handle: * Data produced by SDG (Synthetic Data Generation) - as input * Output of a trained model - as output Signed-off-by: Sébastien Han <seb@redhat.com>	2025-05-12 21:52:14 +02:00
Ihar Hrachyshka	9e6561a1ec	chore: enable pyupgrade fixes (#1806 ) # What does this PR do? The goal of this PR is code base modernization. Schema reflection code needed a minor adjustment to handle UnionTypes and collections.abc.AsyncIterator. (Both are preferred for latest Python releases.) Note to reviewers: almost all changes here are automatically generated by pyupgrade. Some additional unused imports were cleaned up. The only change worth of note can be found under `docs/openapi_generator` and `llama_stack/strong_typing/schema.py` where reflection code was updated to deal with "newer" types. Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-05-01 14:23:50 -07:00
Ihar Hrachyshka	3ed4316ed5	feat: Implement async job execution for torchtune training (#1437 ) # What does this PR do? Now a separate thread is started to execute training jobs. Training requests now return job ID before the job completes. (Which fixes API timeouts for any jobs that take longer than a minute.) Note: the scheduler code is meant to be spun out in the future into a common provider service that can be reused for different APIs and providers. It is also expected to back the /jobs API proposed here: https://github.com/meta-llama/llama-stack/discussions/1238 Hence its somewhat generalized form which is expected to simplify its adoption elsewhere in the future. Note: this patch doesn't attempt to implement missing APIs (e.g. cancel or job removal). This work will belong to follow-up PRs. [//]: # (If resolving an issue, uncomment and update the line below) [//]: # (Closes #[issue-number]) ## Test Plan [Describe the tests you ran to verify your changes with result summaries. Provide clear instructions so the plan can be easily re-executed.] Added unit tests for the scheduler module. For the API coverage, did manual testing and was able to run a training cycle on GPU. The initial call returned job ID before the training completed, as (now) expected. Artifacts are returned as expected. ``` JobArtifactsResponse(checkpoints=[{'identifier': 'meta-llama/Llama-3.2-3B-Instruct-sft-0', 'created_at': '2025-03-07T22:45:19.892714', 'epoch': 0, 'post_training_job_id': 'test-job2ee77104-2fd3-4a4e-84cf-f83f8b8f1f50', 'path': '/home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0', 'training_metrics': None}], job_uuid='test-job2ee77104-2fd3-4a4e-84cf-f83f8b8f1f50') ``` The integration test is currently disabled for the provider. I will look into how it can be enabled in a different PR / issue context. [//]: # (## Documentation) Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>	2025-04-14 08:59:11 -07:00

Author

SHA1

Message

Date

Sébastien Han

fb3c9be1fd

refactor(api): rename "files" API to "artifacts"

The term "artifacts" better represents the purpose of this API, which
handles outputs generated by API executions, eventually stored objects
that can be of served by any storage interface (file, objects).

This aligns better with the industry convention of 'artifacts' (build
outputs, process results) rather than generic 'files'. 'files' would
be appropriate if the goal was to store and retrieve files purely.

Additionally, in our context, artifact is a better term since it will
handle:

* Data produced by SDG (Synthetic Data Generation) - as input
* Output of a trained model - as output

Signed-off-by: Sébastien Han <seb@redhat.com>

2025-05-12 21:52:14 +02:00

Ihar Hrachyshka

9e6561a1ec

chore: enable pyupgrade fixes (#1806 )

# What does this PR do?

The goal of this PR is code base modernization.

Schema reflection code needed a minor adjustment to handle UnionTypes
and collections.abc.AsyncIterator. (Both are preferred for latest Python
releases.)

Note to reviewers: almost all changes here are automatically generated
by pyupgrade. Some additional unused imports were cleaned up. The only
change worth of note can be found under `docs/openapi_generator` and
`llama_stack/strong_typing/schema.py` where reflection code was updated
to deal with "newer" types.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

2025-05-01 14:23:50 -07:00

Ihar Hrachyshka

3ed4316ed5

feat: Implement async job execution for torchtune training (#1437 )

# What does this PR do?

Now a separate thread is started to execute training jobs. Training
requests now return job ID before the job completes. (Which fixes API
timeouts for any jobs that take longer than a minute.)

Note: the scheduler code is meant to be spun out in the future into a
common provider service that can be reused for different APIs and
providers. It is also expected to back the /jobs API proposed here:

https://github.com/meta-llama/llama-stack/discussions/1238

Hence its somewhat generalized form which is expected to simplify its
adoption elsewhere in the future.

Note: this patch doesn't attempt to implement missing APIs (e.g. cancel
or job removal). This work will belong to follow-up PRs.

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan
[Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.*]

Added unit tests for the scheduler module. For the API coverage, did
manual testing and was able to run a training cycle on GPU. The initial
call returned job ID before the training completed, as (now) expected.
Artifacts are returned as expected.

```
JobArtifactsResponse(checkpoints=[{'identifier': 'meta-llama/Llama-3.2-3B-Instruct-sft-0', 'created_at': '2025-03-07T22:45:19.892714', 'epoch': 0, 'post_training_job_id': 'test-job2ee77104-2fd3-4a4e-84cf-f83f8b8f1f50', 'path': '/home/ec2-user/.llama/checkpoints/meta-llama/Llama-3.2-3B-Instruct-sft-0', 'training_metrics': None}], job_uuid='test-job2ee77104-2fd3-4a4e-84cf-f83f8b8f1f50')
```

The integration test is currently disabled for the provider. I will look
into how it can be enabled in a different PR / issue context.

[//]: # (## Documentation)

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>

2025-04-14 08:59:11 -07:00

3 commits