llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-07-18 10:52:28 +00:00

Author	SHA1	Message	Date
Botao Chen	f369871083	feat: [New Eval Benchmark] IfEval (#1708) # What does this PR do? In this PR, we added a new open eval benchmark, IfEval, based on the paper https://arxiv.org/abs/2311.07911, to measure a model's instruction-following capability. ## Test Plan Spin up a llama stack server with the open-benchmark template, then run `llama-stack-client --endpoint xxx eval run-benchmark "meta-reference-ifeval" --model-id "meta-llama/Llama-3.3-70B-Instruct" --output-dir "/home/markchen1015/" --num-examples 20` on the client side to get the aggregate eval results.	2025-03-19 16:39:59 -07:00
Yuan Tang	22e560351e	ci: Add scheduled workflow to update changelog (#1503 ) # What does this PR do? This is a follow up from https://github.com/meta-llama/llama-stack/pull/1463. cc @yanxi0830 --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Co-authored-by: Sébastien Han <seb@redhat.com>	2025-03-18 14:39:22 -07:00
Yuan Tang	d609ffce2a	chore: Add links and badges to both unit and integration tests (#1632) # What does this PR do? This makes it easier to see the status of both and to identify failed builds. Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-03-18 14:12:17 -07:00
Sébastien Han	ffe9b3b278	ci(ollama): run more integration tests (#1636 ) # What does this PR do? Run additional tests in a matrix to accelerate the process and clearly identify failing providers. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-03-18 08:54:42 -07:00
Nathan Weinberg	3b35a39b8b	ci: limit PR testing based on modified files (#1644 ) # What does this PR do? rather than have unit and functional tests run on all PRs, we should only have them run on PRs changing relevant files Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-03-17 15:20:29 -07:00
Sébastien Han	24fd06879e	refactor: simplify command execution and remove PTY handling (#1641 ) # What does this PR do? A PTY is unnecessary for interactive mode since `subprocess.run()` already inherits the calling terminal’s stdin, stdout, and stderr, allowing natural interaction. Using a PTY can introduce unwanted side effects like buffering issues and inconsistent signal handling. Standard input/output is sufficient for most interactive programs. This commit simplifies the command execution by: 1. Removing PTY-based execution in favor of direct subprocess handling 2. Consolidating command execution into a single run_command function 3. Improving error handling with specific subprocess error types 4. Adding proper type hints and documentation 5. Maintaining Ctrl+C handling for graceful interruption ## Test Plan ``` llama stack run ``` Signed-off-by: Sébastien Han <seb@redhat.com>	2025-03-17 15:03:14 -07:00
Ihar Hrachyshka	bfc79217a8	chore: Add ./scripts/unit-tests.sh (#1515) # What does this PR do? Useful for local development. Now you can just trigger the script and not care about specific arguments to pass to run unit tests. ## Test Plan ``` $ . ./venv/bin/activate $ ./scripts/run_tests.sh $ echo $? 0 ``` Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com> Co-authored-by: Nathan Weinberg <31703736+nathan-weinberg@users.noreply.github.com>	2025-03-13 20:25:15 -07:00
dependabot[bot]	e101d15f12	build(deps): bump astral-sh/setup-uv from 4 to 5 (#1620 )	2025-03-13 16:40:15 -04:00
Sébastien Han	28aade9a27	ci: add GitHub Action to close stale issues and PRs (#1613 ) # What does this PR do? - Issues/PRs inactive for 60 days are marked as stale - Stale items are closed after 30 additional days of inactivity - Adds appropriate warning and closing messages - Sets daily schedule for stale checks Signed-off-by: Sébastien Han <seb@redhat.com>	2025-03-13 12:09:04 -07:00
Sébastien Han	edfcb02a0e	ci(ollama): add GitHub Actions workflow for integration tests (#1546 ) # What does this PR do? Added a GitHub Action to run inference tests for the Ollama provider. This ensures we have coverage for Ollama integration. --------- Signed-off-by: Sébastien Han <seb@redhat.com> Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>	2025-03-13 12:04:53 -07:00
Sébastien Han	5e54113b19	ci: add dynamic CI job to test templates (#1230) # What does this PR do? Introduced a new CI job that dynamically generates a build matrix based on available templates from `llama_stack/templates/*/build.yaml`. This allows automated testing for all templates without manual intervention. The CI currently builds for venv and containers. ~Will pass once https://github.com/meta-llama/llama-stack/pull/1228 merges.~ Signed-off-by: Sébastien Han <seb@redhat.com>	2025-03-13 10:14:01 -07:00
Nathan Weinberg	2baf200b63	ci: add html report to unit test artifacts (#1576 ) # What does this PR do? additional artifacts make test results more human-readable ## Test Plan Ran locally Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-03-12 19:05:49 -07:00
Nathan Weinberg	ad939c97c3	docs: add unit test badge to README (#1591 ) # What does this PR do? This PR adds a simple unit test badge to the project README It also modifies the workflow to run on merges to main, so that the status reflected in the README is that of main and not pull request branches --------- Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-03-12 15:41:35 -07:00
Nathan Weinberg	00da911167	ci: run unit tests on all supported python versions (#1575 ) # What does this PR do? python unit tests running via GitHub Actions were only running with python 3.10 the project supports all python versions greater than or equal to 3.10 this commit adds 3.11, 3.12, and 3.13 to the test matrix for better coverage and confidence for non-3.10 users ## Test Plan All tests pass locally with python 3.11, 3.12, and 3.13 Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-03-12 09:55:11 -07:00
Nathan Weinberg	275bab1373	test: loosen Python 3.10 version for unit tests (#1547) # What does this PR do? As I brought up in #1515, it shouldn't be necessary to tie the unit test runner to an exact z-stream of Python 3.10. Updated so the unit test runner always uses the latest z-stream of Python 3.10. ## Test Plan ```shell $ uv run -p 3.10 --with-editable . --with-editable ".[dev]" --with-editable ".[unit]" pytest --cov=llama_stack -s -v tests/unit/ --junitxml=pytest-report.xml ``` Signed-off-by: Nathan Weinberg <nweinber@redhat.com>	2025-03-11 11:11:32 -07:00
Courtney Pacheco	ff853ccc38	fix: Use `--with-editable` to capture accurate code coverage reporting (#1532) # What does this PR do? I created a PR earlier today (#1512), but I realized the code coverage reporting isn't correct. Essentially, we need to use `--with-editable` to enable develop/editable mode through `uv`. Using editable mode will create a package.egg-link file, and that allows pytest to accurately capture code coverage. Before, some files had "0%" or "100%" coverage, which isn't accurate: [screenshot] More info on `--with-editable`: https://docs.astral.sh/uv/reference/cli/#uv-run--with-editable ## Test Plan Tested locally: [screenshot] Screenshot from CI: [screenshot] Signed-off-by: Courtney Pacheco <6019922+courtneypacheco@users.noreply.github.com>	2025-03-10 19:30:28 -04:00
Sébastien Han	91b1b92908	build: revamp "test" dependencies from pyproject (#1468 ) # What does this PR do? The `test` section has been updated to include only the essential dependencies needed for running integration tests, which are shared across all providers. If a provider requires additional dependencies, please add them to your environment separately. When using uv to run your tests, you can specify extra dependencies with the `--with` flag. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-03-10 15:43:16 -07:00
Courtney Pacheco	6dbac3beed	chore: Display code coverage for unit tests in PR builds (#1512) # What does this PR do? This PR allows the unit test code coverage percentage to be reported in PR builds. Currently, the output tells the end user which tests passed and which tests failed: [screenshot] If a contributor is creating a new module within Llama Stack and starts writing unit tests for that module, it might be difficult for Llama Stack maintainers to immediately determine the code coverage percentage for that new module. To allow for code coverage reporting in the CI, we simply need to install `pytest-cov` so we can use the `--cov` flag with the existing `pytest` command. Ideally, a bot would report code coverage, but this PR can be a temporary solution. ## Test Plan I ran these changes locally: [screenshot] PR build to confirm the expected behavior: [screenshot] Signed-off-by: Courtney Pacheco <6019922+courtneypacheco@users.noreply.github.com>	2025-03-10 16:27:33 -04:00
Ashwin Bharambe	ba917a9c48	fix: make sure readthedocs is triggered if pyproject.toml is updated	2025-03-08 23:05:10 -08:00
dependabot[bot]	d63e798f6d	build(deps): bump thollander/actions-comment-pull-request from 2 to 3 (#1485 )	2025-03-07 17:31:53 -05:00
dependabot[bot]	9506012736	build(deps): bump actions/upload-artifact from 3 to 4 (#1486 )	2025-03-07 17:31:00 -05:00
Ashwin Bharambe	82e94fe22f	ci: add Github workflow which runs unittests in PR (#1442 )	2025-03-05 21:23:28 -05:00
Sébastien Han	33a64eb5ec	ci: improve GitHub Actions workflow for website builds (#1151) # What does this PR do? Refine the existing update-readthedocs.yml workflow to enhance automation and reliability. Updates include: - Expanding path triggers to cover all documentation files (docs/**) and build artifacts. - Adding steps to set up Python (3.11), install uv, sync dependencies, and build HTML using make html. - Ensuring the ReadTheDocs build trigger only runs on workflow_dispatch events. These improvements help validate website builds in PRs, preventing issues before merging. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-02-20 21:37:37 -08:00
Sébastien Han	371f11a569	build: update uv lock to sync package versions (#1026) # What does this PR do? Updated `uv.lock` to reflect the latest versions of `llama-models`, `llama-stack`, and `llama-stack-client` (bumped to 0.1.2). This ensures dependency consistency and avoids potential issues with outdated package references. Added `uv-sync` hook from `uv-pre-commit` repository to ensure synchronization of dependencies. Signed-off-by: Sébastien Han <seb@redhat.com>	2025-02-10 11:42:30 -05:00
Yuan Tang	c97e05f75e	test: Split inference tests to text and vision (#1008 ) # What does this PR do? This PR splits the inference tests into text and vision to make testing on vLLM provider easier as mentioned in https://github.com/meta-llama/llama-stack/pull/951 since serving multiple models (e.g. Llama-3.2-11B-Vision-Instruct and Llama-3.1-8B-Instruct) on a single port using the OpenAI API is [not supported yet](https://docs.vllm.ai/en/v0.5.5/serving/faq.html) so it's a bit tricky to test both at the same time. ## Test Plan All previously passing tests related to text still pass: `LLAMA_STACK_BASE_URL=http://localhost:5002 pytest -v tests/client-sdk/inference/test_text_inference.py` All vision tests passed via `LLAMA_STACK_BASE_URL=http://localhost:5002 pytest -v tests/client-sdk/inference/test_vision_inference.py`. Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-07 09:35:49 -08:00
Yuan Tang	dd1265bea7	ci: Add semantic PR title check (#979 ) This adds a new workflow to check semantic PR titles to match the [Conventional Commits spec](https://www.conventionalcommits.org/). This will make it easier to browse commit history and enable automation in the future. --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>	2025-02-06 12:22:34 -08:00
Ashwin Bharambe	981bb52b59	Quote the token properly	2025-02-04 11:44:29 -08:00
Ashwin Bharambe	5005939494	Use a secret again for the workflow	2025-02-04 11:42:47 -08:00
Ashwin Bharambe	7392daddee	Try a new webhook	2025-02-04 11:36:54 -08:00
Ashwin Bharambe	2987fb37c3	fixes?	2025-02-04 11:34:27 -08:00
Ashwin Bharambe	766b11f1f8	Debug workflow	2025-02-04 11:09:16 -08:00
Ashwin Bharambe	5233666143	Debug workflow	2025-02-04 11:07:04 -08:00
Ashwin Bharambe	b35930a7e5	rename	2025-02-04 11:02:45 -08:00
Ashwin Bharambe	ea538e4b32	Add a workflow to trigger readthedocs rebuild	2025-02-04 11:02:06 -08:00
Ashwin Bharambe	1bb74d95ad	Delete CI workflows from here since they have moved to llama-stack-ops	2025-02-02 10:22:48 -08:00
Ashwin Bharambe	5b1e69e58e	Use `uv pip install` instead of `pip install` (#921 ) ## What does this PR do? See issue: #747 -- `uv` is just plain better. This PR does the bare minimum of replacing `pip install` by `uv pip install` and ensuring `uv` exists in the environment. ## Test Plan First: create new conda, `uv pip install -e .` on `llama-stack` -- all is good. Next: run `llama stack build --template together` followed by `llama stack run together` -- all good Next: run `llama stack build --template together --image-name yoyo` followed by `llama stack run together --image-name yoyo` -- all good Next: fresh conda and `uv pip install -e .` and `llama stack build --template together --image-type venv` -- all good. Docker: `llama stack build --template together --image-type container` works!	2025-01-31 22:29:41 -08:00
Sixian Yi	6f9023d948	create a github action for triggering client-sdk tests on new pull-request (#850) # What does this PR do? Create a new github action that runs integration tests on the fireworks and together distros upon a new PR. Key features: 1) Run inference client-sdk tests on the fireworks and together distros, loading each distro as a library 2) Pull changes from the latest github repos (llama-models) and (llama-stack-client-python) 3) Output a test summary Next steps: - Expand the ci test action to the (llama-models) and (llama-stack-client-python) repos to make sure the changes there do not break the imports in llama-stack ## Test Plan See [the job run triggered by this PR](`1292666319`)	2025-01-29 21:26:04 -08:00
Ashwin Bharambe	d111bad2f2	Update GH action so it correctly queries for test.pypi, etc. (#875 ) The previous curl command was wrong and did not actually check for version correctly (status code was always 200 regardless of what you retrieved.) Also added tagging latest. cc @wukaixingxp	2025-01-24 11:56:29 -08:00
Dinesh Yeduguru	d0be9288a3	Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb (#854 ) Llama_Stack_Building_AI_Applications.ipynb -> getting_started.ipynb	2025-01-23 12:04:06 -08:00
Xi Yan	74f6af8bbe	[CICD] add simple test step for docker build workflow, fix prefix bug (#821) # What does this PR do? Main Thing - Add a simple test step before publishing docker image in workflow Side Fix - Docker push action fails recently due to extra prefix introduced. E.g. see: https://github.com/meta-llama/llama-stack/pull/802#issuecomment-2599507062 cc @terrytangyuan ## Test Plan 1. Release a TestPyPi version on this code: 0.0.63.dev51206766 `3581203331` ``` # 1. build docker image TEST_PYPI_VERSION=0.0.63.dev51206766 llama stack build --template fireworks # 2. test the docker image cd distributions/fireworks && docker compose up ``` 4. Test the full build + test docker flow using TestPyPi from (1): `1284218494`	2025-01-18 15:16:05 -08:00
Yuan Tang	5379eca9fd	Fix incorrect image type in publish-to-docker workflow (#819 )	2025-01-17 21:33:03 -08:00
Xi Yan	c2a072911d	fix eval notebook & add test to workflow (#803 )	2025-01-16 23:11:21 -08:00
Hardik Shah	821ac674ab	Add notebook testing to nightly build job (#785 ) # What does this PR do? Adds testing of the notebook to the nightly build job ## Test Plan Here is a sample run -- `1281588919` --------- Co-authored-by: Hardik Shah <hjshah@fb.com>	2025-01-16 11:24:50 -08:00
Xi Yan	32d3abe964	[CICD] Github workflow for publishing Docker images (#764) # What does this PR do? - Add Github workflow for publishing docker images. - Manual Inputs - We can use a (1) TestPyPi version / (2) build via released PyPi version Notes - Keep this workflow manually triggered as we don't want to publish nightly docker images Additional Changes - Resolve issue with running llama stack build in non-terminal device ``` File "/home/runner/.local/lib/python3.12/site-packages/llama_stack/distribution/utils/exec.py", line 25, in run_with_pty old_settings = termios.tcgetattr(sys.stdin) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ termios.error: (25, 'Inappropriate ioctl for device') ``` - Modified build_container.sh to work in non-terminal environment ## Test Plan - Triggered workflow: `3562217878` - Tested published docker image - /tools API endpoints are served so that docker is correctly using the TestPyPi package - Published tagged images: https://hub.docker.com/repositories/llamastack	2025-01-15 09:01:33 -08:00
Xi Yan	ace8dd6087	[CI/CD] more robust re-try for downloading testpypi package (#749) # What does this PR do? - Context: Our current `sleep 10` may not be enough time for the uploaded testpypi package to become downloadable. - Solution: Add re-try logic for at most 1 minute to download testpypi package and test the downloaded package. ## Test Plan - Triggered workflow: `3554549062`	2025-01-13 17:53:38 -08:00
Xi Yan	6d85284abd	[CICD] github workflow to push nightly package to testpypi (#734) # What does this PR do? - Set up github workflow to push nightly package to testpypi ## How it works / Test Plan 1. Get the version for the release package based on how the push happens. 2. Trigger workflows in llama-stack-client & llama-models to build a package using the version: - llama-stack workflow: `1270242557` - llama-stack-client workflow: `1270242767` - llama-models workflow: `1270242774` 3. Wait for the workflows to finish. 4. After the client and models package workflows finish and their packages are pushed, update the llama-stack package version & requirements. Then push a package for llama-stack. 5. Simple tests on published package ## Verify the updated package ``` pip install --index-url https://pypi.org/simple/ --extra-index-url https://test.pypi.org/simple/ llama-stack==0.0.64.dev20250110 llama stack build --template fireworks --image-type conda llama stack run fireworks ``` --------- Signed-off-by: Yuan Tang <terrytangyuan@gmail.com> Co-authored-by: Yuan Tang <terrytangyuan@gmail.com>	2025-01-10 17:01:51 -08:00
Chacksu	144abd2e71	Introduce GitHub Actions Workflow for Llama Stack Tests (#523 ) # What does this PR do? Initial implementation of GitHub Actions workflow for automated testing of Llama Stack. ## Key Features - Automatically runs tests on pull requests and manual dispatch - Provides support for GPU required model tests - Reports test results and uploads summaries	2024-12-04 15:42:55 -08:00
Yuan Tang	a2b87ed0cb	Switch to pre-commit/action (#239 )	2024-10-11 11:09:11 -07:00
Yuan Tang	05282d1234	Enable pre-commit on main branch (#237 )	2024-10-11 10:03:59 -07:00
Russell Bryant	eba9d1ea14	ci: Run pre-commit checks in CI (#176 ) Run the pre-commit checks in a github workflow to validate that a PR or a direct push to the repo does not introduce new errors.	2024-10-10 11:21:59 -07:00


100 commits