llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-08-17 15:03:15 +00:00

Author	SHA1	Message	Date
Nehanth Narendrula	cf73146132	feat: Enable DPO training with HuggingFace inline provider (#2825) What does this PR do? This PR adds support for Direct Preference Optimization (DPO) training via the existing HuggingFace inline provider. It introduces a new DPO training recipe, config schema updates, dataset integration, and end-to-end testing to support preference-based fine-tuning with TRL. Test Plan Added integration test: tests/integration/post_training/test_post_training.py::TestPostTraining::test_preference_optimize Ran tests on both CPU and CUDA environments --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-83.ec2.internal> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-07-30 23:33:36 -07:00
Charlie Doern	f02f7b28c1	feat: add huggingface post_training impl (#2132) What does this PR do? Adds an inline HF SFTTrainer provider. Alongside torchtune, this is a very popular option for running training jobs. The config lets a user specify key fields such as the model, chat_template, and device. The provider comes with one recipe, `finetune_single_device`, which works both with and without LoRA. Any valid HF model identifier can be given, and the model will be pulled. It has been tested so far with the CPU and MPS device types but should be compatible with CUDA out of the box. The provider processes the given dataset into the proper format, establishes the steps per epoch, steps per save, and steps per eval, sets a sane SFTConfig, and runs n_epochs of training. If checkpoint_dir is None, no model is saved. If there is a checkpoint dir, a model is saved every `save_steps` and at the end of training. Test Plan Re-enabled the post_training integration test suite with a single test that loads the simpleqa dataset (https://huggingface.co/datasets/llamastack/simpleqa) and a tiny granite model (https://huggingface.co/ibm-granite/granite-3.3-2b-instruct). The test now uses the llama stack client and the proper post_training API, and runs one step with a batch_size of 1. The test runs on CPU on the Ubuntu runner, so it needs a small batch and a single step. --------- Signed-off-by: Charlie Doern <cdoern@redhat.com>	2025-05-16 14:41:28 -07:00
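
The f02f7b28c1 commit message above describes a save rule: nothing is saved without a checkpoint dir; otherwise a model is saved every `save_steps` and once more at the end of training. The sketch below illustrates only that rule in plain Python. It is not the provider's code, which hands scheduling to TRL's SFTTrainer. The function name `plan_checkpoints`, its parameters, and the assumption that steps per epoch equal ceil(num_examples / batch_size) (no gradient accumulation) are all hypothetical.

```python
import math
from typing import List, Optional


def plan_checkpoints(
    num_examples: int,
    batch_size: int,
    n_epochs: int,
    save_steps: int,
    checkpoint_dir: Optional[str],
) -> List[int]:
    """Return the 1-based global steps at which a checkpoint would be written.

    Hypothetical helper: mirrors the rule stated in the commit message,
    not the actual provider implementation.
    """
    if min(num_examples, batch_size, n_epochs, save_steps) <= 0:
        raise ValueError("all numeric arguments must be positive")

    # No checkpoint directory means no model is saved at all.
    if checkpoint_dir is None:
        return []

    # Assumption: one optimizer step per batch, last partial batch included.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * n_epochs

    # Periodic saves every `save_steps` steps...
    steps = list(range(save_steps, total_steps + 1, save_steps))

    # ...plus a final save at the end of training if not already covered.
    if not steps or steps[-1] != total_steps:
        steps.append(total_steps)
    return steps


if __name__ == "__main__":
    # 10 examples, batch size 4 -> 3 steps per epoch, 6 steps over 2 epochs.
    print(plan_checkpoints(10, 4, 2, save_steps=4, checkpoint_dir="/tmp/ckpt"))
```

With the single-step, batch-size-1 setup used in the integration test, a rule like this would yield at most one save, at the final step.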

2 commits