Commit graph

10 commits

Author SHA1 Message Date
Nehanth Narendrula
b41d696e4f
fix: Post Training Model change in Tests in order to make it less intensive (#2991)
# What does this PR do?

Changed from` ibm-granite/granite-3.3-2b-instruct` to`
HuggingFaceTB/SmolLM2-135M-Instruct` so it as not resource intensive in
CI

Idea came from -
https://github.com/meta-llama/llama-stack/pull/2984#issuecomment-3140400830
2025-07-31 11:22:34 -07:00
Nehanth Narendrula
3a574ef23c
fix: remove unused DPO parameters from schema and tests (#2988)
# What does this PR do?

I removed these DPO parameters from the schema in [this
PR](https://github.com/meta-llama/llama-stack/pull/2804), but I may not
have done it correctly, since they were reintroduced in [this
commit](cb7354a9ce (diff-4e9a8cb358213d6118c4b6ec2a76d0367af06441bf0717e13a775ade75e2061dR15081))—likely
due to a pre-commit hook.

I've made the changes again, and the pre-commit hook automatically
updated the spec sheet.
2025-07-31 09:11:08 -07:00
Charlie Doern
5c33bc1353
fix: post_training ci (#2984)
Some checks failed
Integration Tests / discover-tests (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 5s
Python Package Build Test / build (3.12) (push) Failing after 10s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 4s
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 25s
Test External API and Providers / test-external (venv) (push) Failing after 6s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 24s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 26s
Integration Tests / record-tests (push) Has been skipped
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 28s
Python Package Build Test / build (3.13) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 28s
Integration Tests / run-tests (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 31s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 26s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 29s
Unit Tests / unit-tests (3.13) (push) Failing after 12s
Unit Tests / unit-tests (3.12) (push) Failing after 14s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 27s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 42s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 40s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 45s
Pre-commit / pre-commit (push) Successful in 1m30s
2025-07-31 08:26:06 -07:00
Nehanth Narendrula
cf73146132
feat: Enable DPO training with HuggingFace inline provider (#2825)
Some checks failed
Integration Tests / discover-tests (push) Has been skipped
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 7s
Integration Tests / record-tests (push) Has been skipped
Integration Tests / run-tests (push) Has been skipped
Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 22s
Python Package Build Test / build (3.13) (push) Failing after 16s
Test Llama Stack Build / generate-matrix (push) Successful in 19s
Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 21s
Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 31s
Test Llama Stack Build / build-custom-container-distribution (push) Failing after 32s
Test External API and Providers / test-external (venv) (push) Failing after 32s
Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 36s
Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 39s
Update ReadTheDocs / update-readthedocs (push) Failing after 31s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 42s
Test Llama Stack Build / build-single-provider (push) Failing after 37s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Failing after 35s
Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 37s
Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 40s
Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 42s
Unit Tests / unit-tests (3.12) (push) Failing after 36s
Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 40s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 45s
Test Llama Stack Build / build (push) Failing after 6s
Python Package Build Test / build (3.12) (push) Failing after 1m1s
Unit Tests / unit-tests (3.13) (push) Failing after 1m0s
Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 1m6s
Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 1m8s
Pre-commit / pre-commit (push) Successful in 1m50s
What does this PR do?

This PR adds support for Direct Preference Optimization (DPO) training
via the existing HuggingFace inline provider. It introduces a new DPO
training recipe, config schema updates, dataset integration, and
end-to-end testing to support preference-based fine-tuning with TRL.

Test Plan

Added integration test:

tests/integration/post_training/test_post_training.py::TestPostTraining::test_preference_optimize

Ran tests on both CPU and CUDA environments

---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-83.ec2.internal>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2025-07-30 23:33:36 -07:00
Ashwin Bharambe
81c7d6fa2e
chore(ci): disable post training tests (#2953)
Post training tests need _much_ better thinking before we can re-enable
them to be run on every single PR. Running periodically should be
approached only when it is shown that the tests are reliable and as
light-weight as can be; otherwise, it is just kicking the can down the
road.
2025-07-29 14:20:09 -07:00
Matthew Farrellee
c7dc0f21b4
fix: error on failed job, do not wait for timeout (#2945)
# What does this PR do?

cause post training integration test to error when job fails.

## Test Plan

ci
2025-07-29 11:07:51 -07:00
Charlie Doern
d7cc38e934
fix: remove async test markers (fix pre-commit) (#2808)
# What does this PR do?

some async test markers are in the codebase causing pre-commit to fail
due to #2744

remove these pytest fixtures

## Test Plan
pre-commit passes

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-07-17 21:35:28 -07:00
Charlie Doern
f02f7b28c1
feat: add huggingface post_training impl (#2132)
# What does this PR do?


adds an inline HF SFTTrainer provider. Alongside touchtune -- this is a
super popular option for running training jobs. The config allows a user
to specify some key fields such as a model, chat_template, device, etc

the provider comes with one recipe `finetune_single_device` which works
both with and without LoRA.

any model that is a valid HF identifier can be given and the model will
be pulled.

this has been tested so far with CPU and MPS device types, but should be
compatible with CUDA out of the box

The provider processes the given dataset into the proper format,
establishes the various steps per epoch, steps per save, steps per eval,
sets a sane SFTConfig, and runs n_epochs of training

if checkpoint_dir is none, no model is saved. If there is a checkpoint
dir, a model is saved every `save_steps` and at the end of training.


## Test Plan

re-enabled post_training integration test suite with a singular test
that loads the simpleqa dataset:
https://huggingface.co/datasets/llamastack/simpleqa and a tiny granite
model: https://huggingface.co/ibm-granite/granite-3.3-2b-instruct. The
test now uses the llama stack client and the proper post_training API

runs one step with a batch_size of 1. This test runs on CPU on the
Ubuntu runner so it needs to be a small batch and a single step.

[//]: # (## Documentation)

---------

Signed-off-by: Charlie Doern <cdoern@redhat.com>
2025-05-16 14:41:28 -07:00
Ihar Hrachyshka
9e6561a1ec
chore: enable pyupgrade fixes (#1806)
# What does this PR do?

The goal of this PR is code base modernization.

Schema reflection code needed a minor adjustment to handle UnionTypes
and collections.abc.AsyncIterator. (Both are preferred for latest Python
releases.)

Note to reviewers: almost all changes here are automatically generated
by pyupgrade. Some additional unused imports were cleaned up. The only
change worth of note can be found under `docs/openapi_generator` and
`llama_stack/strong_typing/schema.py` where reflection code was updated
to deal with "newer" types.

Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
2025-05-01 14:23:50 -07:00
Ashwin Bharambe
abfbaf3c1b
refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401)
All of the tests from `llama_stack/providers/tests/` are now moved to
`tests/integration`.

I converted the `tools`, `scoring` and `datasetio` tests to use API.
However, `eval` and `post_training` proved to be a bit challenging to
leaving those. I think `post_training` should be relatively
straightforward also.

As part of this, I noticed that `wolfram_alpha` tool wasn't added to
some of our commonly used distros so I added it. I am going to remove a
lot of code duplication from distros next so while this looks like a
one-off right now, it will go away and be there uniformly for all
distros.
2025-03-04 14:53:47 -08:00