# What does this PR do?
When multiple commits are pushed to a PR, multiple CI builds are
triggered. This PR ensures that only one build runs at a time for
each PR, reducing CI load.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
This PR adds IfEval, a new open eval benchmark based on the paper
https://arxiv.org/abs/2311.07911, to measure a model's
instruction-following capability.
## Test Plan
1. Spin up a llama stack server with the open-benchmark template.
2. On the client side, run `llama-stack-client --endpoint xxx eval run-benchmark
"meta-reference-ifeval" --model-id "meta-llama/Llama-3.3-70B-Instruct"
--output-dir "/home/markchen1015/" --num-examples 20` and check the
aggregate eval results.
# What does this PR do?
This is a follow-up to
https://github.com/meta-llama/llama-stack/pull/1463. cc @yanxi0830
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Co-authored-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This makes it easier to see the status of both and to identify failed
builds.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
Run additional tests in a matrix to accelerate the process and clearly
identify failing providers.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This PR adds dependabot updates for Python dependencies. In addition:
* Consistent weekly schedule on a specific day
* Specific commit messages
* `open-pull-requests-limit` is set intentionally to avoid upgrading
dependencies that will likely cause regressions. We want to keep the
focus here on security updates only.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
# What does this PR do?
Rather than running unit and functional tests on all PRs, we should
only run them on PRs that change relevant files.
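The idea can be illustrated with a minimal Python sketch. The patterns and the `should_run_tests` helper below are hypothetical and are not taken from the workflow itself; GitHub's own path-filter syntax also differs from `fnmatch`, so treat this only as a model of the decision being made.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files that affect unit and functional tests.
RELEVANT_PATTERNS = ("llama_stack/*", "tests/unit/*", "pyproject.toml")


def should_run_tests(changed_files, patterns=RELEVANT_PATTERNS):
    """Return True if any changed file matches one of the relevant patterns."""
    # Note: fnmatch's "*" also matches "/", so "llama_stack/*" covers nested files.
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )
```

A docs-only change would skip the tests under this model, while any change touching a matching path would trigger them.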
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
A PTY is unnecessary for interactive mode since `subprocess.run()`
already inherits the calling terminal’s stdin, stdout, and stderr,
allowing natural interaction. Using a PTY can introduce unwanted side
effects like buffering issues and inconsistent signal handling. Standard
input/output is sufficient for most interactive programs.
This commit simplifies the command execution by:
1. Removing PTY-based execution in favor of direct subprocess handling
2. Consolidating command execution into a single `run_command` function
3. Improving error handling with specific subprocess error types
4. Adding proper type hints and documentation
5. Maintaining Ctrl+C handling for graceful interruption
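A minimal sketch of what such a function could look like, assuming a list-of-strings command and integer exit codes. The signature, messages, and the exit codes 130 and 127 are illustrative assumptions and not necessarily what this commit implements.

```python
import subprocess
import sys


def run_command(command: list[str]) -> int:
    """Run a command with inherited stdin/stdout/stderr and return its exit code."""
    try:
        # No PTY needed: the child inherits the caller's terminal streams by default.
        result = subprocess.run(command, check=False)
        return result.returncode
    except KeyboardInterrupt:
        # Ctrl+C: exit gracefully with the conventional "interrupted" code (assumed).
        print("\nInterrupted by user", file=sys.stderr)
        return 130
    except FileNotFoundError as exc:
        # The executable does not exist; 127 mirrors the shell convention (assumed).
        print(f"Command not found: {exc}", file=sys.stderr)
        return 127
    except subprocess.SubprocessError as exc:
        # Any other subprocess-specific failure.
        print(f"Error running command: {exc}", file=sys.stderr)
        return 1
```

Because no `stdin`, `stdout`, or `stderr` arguments are passed, the child process talks to the same terminal as the caller, which is what makes the PTY unnecessary.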
## Test Plan
```
llama stack run
```
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Useful for local development. Now you can just run the script without
worrying about which arguments to pass to run the unit tests.
## Test Plan
```
$ . ./venv/bin/activate
$ ./scripts/run_tests.sh
$ echo $?
0
```
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
Co-authored-by: Nathan Weinberg <31703736+nathan-weinberg@users.noreply.github.com>
# What does this PR do?
- Issues/PRs inactive for 60 days are marked as stale
- Stale items are closed after 30 additional days of inactivity
- Adds appropriate warning and closing messages
- Sets daily schedule for stale checks
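The policy above can be modeled roughly as follows. This is only an illustration of the thresholds; the `stale_action` helper and its return values are hypothetical, and the actual workflow configuration may count inactivity differently.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=60)        # inactivity before an item is marked stale
CLOSE_AFTER_STALE = timedelta(days=30)  # further inactivity before a stale item is closed


def stale_action(last_activity: datetime, is_stale: bool, now: datetime) -> str:
    """Return 'mark-stale', 'close', or 'none' for an issue or PR."""
    inactive = now - last_activity
    if is_stale and inactive >= CLOSE_AFTER_STALE:
        # Assumes marking an item stale counts as its last activity.
        return "close"
    if not is_stale and inactive >= STALE_AFTER:
        return "mark-stale"
    return "none"
```

With a daily schedule, each item would be evaluated once per day against these two thresholds.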
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Added a GitHub Action to run inference tests for the Ollama provider.
This ensures we have coverage for Ollama integration.
---------
Signed-off-by: Sébastien Han <seb@redhat.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
# What does this PR do?
Introduced a new CI job that dynamically generates a build matrix based
on available templates from `llama_stack/templates/*/build.yaml`.
This allows automated testing for all templates without manual
intervention.
The CI currently builds for venv and containers.
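For illustration, matrix generation along these lines could be sketched in Python as below. The `build_matrix` helper and the `template` / `image-type` key names are assumptions made for this example; the actual job may generate the matrix differently.

```python
import json
from pathlib import Path


def build_matrix(templates_dir: Path, image_types=("venv", "container")) -> dict:
    """Collect every template directory that contains a build.yaml."""
    templates = sorted(p.parent.name for p in templates_dir.glob("*/build.yaml"))
    # Key names are hypothetical; a CI job would consume this as a JSON matrix.
    return {"template": templates, "image-type": list(image_types)}


if __name__ == "__main__":
    print(json.dumps(build_matrix(Path("llama_stack/templates"))))
```

Adding a new template directory with a `build.yaml` would then extend the matrix without editing the workflow.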
Signed-off-by: Sébastien Han <seb@redhat.com>
~Will pass once https://github.com/meta-llama/llama-stack/pull/1228
merges.~
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Additional artifacts make test results more human-readable.
## Test Plan
Ran locally
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
This PR adds a simple unit test badge to the project README.
It also modifies the workflow to run on merges to main, so that the
status reflected in the README is that of main and not of pull request
branches.
---------
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
Python unit tests running via GitHub Actions were only running with
Python 3.10.
The project supports all Python versions greater than or equal to 3.10.
This commit adds 3.11, 3.12, and 3.13 to the test matrix for better
coverage and confidence for non-3.10 users.
## Test Plan
All tests pass locally with python 3.11, 3.12, and 3.13
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
As I brought up in #1515, it shouldn't be necessary to tie the unit test
runner to an exact z-stream of Python 3.10.
Updated so the unit test runner always uses the latest z-stream of Python 3.10.
## Test Plan
```shell
$ uv run -p 3.10 --with-editable . --with-editable ".[dev]" --with-editable ".[unit]" pytest --cov=llama_stack -s -v tests/unit/ --junitxml=pytest-report.xml
```
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
# What does this PR do?
The `test` section has been updated to include only the essential
dependencies needed for running integration tests, which are shared
across all providers. If a provider requires additional dependencies,
please add them to your environment separately. When using uv to
run your tests, you can specify extra dependencies with the
`--with` flag.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This PR allows the unit test code coverage percentage to be reported in PR
builds. Currently, the output only tells the end user which tests passed
and which tests failed:
<img width="744" alt="Screenshot 2025-03-10 at 9 44 28 AM"
src="https://github.com/user-attachments/assets/40b1a578-951f-4b74-8a37-a39c039b1d7e"
/>
If a contributor is creating a new module within Llama Stack and starts
writing unit tests for that module, it might be difficult for Llama
Stack maintainers to immediately determine the code coverage percentage
for that new module.
To allow for code coverage reporting in the CI, we simply need to
install `pytest-cov` so we can use the `--cov` flag with the existing
`pytest` command.
Ideally, it would be nicer to have a bot report code coverage, but this
PR can be a temporary solution.
## Test Plan
I ran these changes locally:
<img width="1455" alt="Screenshot 2025-03-10 at 10 01 53 AM"
src="https://github.com/user-attachments/assets/dfd765c6-5979-42a3-b899-7713a3f202e6"
/>
PR build to confirm the expected behavior:
<img width="1326" alt="Screenshot 2025-03-10 at 12 47 36 PM"
src="https://github.com/user-attachments/assets/fe94f1e6-fbb5-4e57-9902-197502c50621"
/>
Signed-off-by: Courtney Pacheco <6019922+courtneypacheco@users.noreply.github.com>
# What does this PR do?
Add a Dependabot configuration file (.github/dependabot.yml) to enable
automated dependency updates for GitHub Actions. This ensures workflows
stay up to date with the latest versions, improving security and
reliability.
Dependabot is configured to:
- Monitor GitHub Actions dependencies.
- Check for updates in the workflow directory.
- Run updates on a daily schedule.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Refine the existing update-readthedocs.yml workflow to enhance
automation and reliability. Updates include:
- Expanding path triggers to cover all documentation files (docs/**) and
build artifacts.
- Adding steps to set up Python (3.11), install uv, sync dependencies,
and build HTML using make html.
- Ensuring the ReadTheDocs build trigger only runs on
workflow_dispatch events.
These improvements help validate website builds in PRs, preventing
issues before merging.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
The CHANGELOG.md was removed in
e6c9f2a485
so this mention is not relevant anymore.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
Updated `uv.lock` to reflect the latest versions of `llama-models`,
`llama-stack`, and `llama-stack-client` (bumped to 0.1.2). This ensures
dependency consistency and avoids potential issues with outdated package
references.
Added `uv-sync` hook from `uv-pre-commit` repository to ensure
synchronization of dependencies.
Signed-off-by: Sébastien Han <seb@redhat.com>
# What does this PR do?
This PR splits the inference tests into text and vision tests to make
testing on the vLLM provider easier, as mentioned in
https://github.com/meta-llama/llama-stack/pull/951. Serving
multiple models (e.g. Llama-3.2-11B-Vision-Instruct and
Llama-3.1-8B-Instruct) on a single port using the OpenAI API is [not
supported yet](https://docs.vllm.ai/en/v0.5.5/serving/faq.html), so it's
a bit tricky to test both at the same time.
## Test Plan
All previously passing tests related to text still pass:
`LLAMA_STACK_BASE_URL=http://localhost:5002 pytest -v
tests/client-sdk/inference/test_text_inference.py`
All vision tests passed via `LLAMA_STACK_BASE_URL=http://localhost:5002
pytest -v tests/client-sdk/inference/test_vision_inference.py`.
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
This adds a new workflow that checks that PR titles follow the
[Conventional Commits spec](https://www.conventionalcommits.org/). This
will make it easier to browse commit history and enable automation in
the future.
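As a rough illustration of the kind of check involved, here is a minimal Python sketch. The regex, the allowed type list, and the `is_semantic_title` helper are assumptions made for this example; the workflow itself may accept a different set of types or rules.

```python
import re

# Assumed type list (common Angular-style types); the workflow may allow others.
ALLOWED_TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

# <type>[(scope)][!]: <subject>
TITLE_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<subject>\S.*)$"
)


def is_semantic_title(title: str) -> bool:
    """Return True if the title looks like a Conventional Commits header."""
    match = TITLE_RE.match(title)
    return bool(match) and match.group("type") in ALLOWED_TYPES
```

For example, `ci: add semantic PR title check` passes this check, while `Add semantic PR title check` does not.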
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Also, hide guidance to the author under comments to avoid polluting
the description with it.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
Using `Closes #` syntax in PR template, as per:
https://docs.github.com/en/get-started/writing-on-github/working-with-advanced-formatting/using-keywords-in-issues-and-pull-requests
```
In short, provide a summary of what this PR does and why. Usually, the relevant context should be present in a linked issue.
```
Hides this ^.
```
Please describe:
- tests you ran to verify your changes with result summaries.
- provide instructions so it can be reproduced.
```
And this ^.
```
Please link relevant resources if necessary.
```
And this ^.
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
Signed-off-by: Ihar Hrachyshka <ihar.hrachyshka@gmail.com>
# What does this PR do?
- Added a checklist item in the PR template to ensure significant
changes are documented in the changelog.
- Updated `CHANGELOG.md` with a placeholder for version `0.2.0`.
- This is an effort to resurrect the consistent usage of the changelog
file.
Signed-off-by: Sébastien Han <seb@redhat.com>
This is similar to what we are doing for other projects, e.g.
https://github.com/argoproj/argo-workflows/tree/main/.github/ISSUE_TEMPLATE
The benefit is that people get more options before submitting a bug
report or feature request on GitHub.
---------
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
## What does this PR do?
See issue: #747 -- `uv` is just plain better. This PR does the bare
minimum: it replaces `pip install` with `uv pip install` and ensures `uv`
exists in the environment.
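The swap can be sketched as follows. This is an illustration only: the `uv_pip_install_command` helper and its error message are hypothetical and do not reflect how the build scripts are actually written.

```python
import shutil
from typing import Optional, Sequence


def uv_pip_install_command(packages: Sequence[str], uv_path: Optional[str] = None) -> list:
    """Build a `uv pip install` command, failing early if uv is not available."""
    # Look uv up on PATH unless an explicit path is given (hypothetical parameter).
    uv = uv_path or shutil.which("uv")
    if uv is None:
        raise RuntimeError("uv not found on PATH; install it before building")
    return [uv, "pip", "install", *packages]
```

The resulting list can be passed to `subprocess.run` wherever `pip install` was invoked before.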
## Test Plan
First: create a new conda environment and run `uv pip install -e .` on
`llama-stack` -- all is good.
Next: run `llama stack build --template together` followed by `llama
stack run together` -- all good
Next: run `llama stack build --template together --image-name yoyo`
followed by `llama stack run together --image-name yoyo` -- all good
Next: fresh conda and `uv pip install -e .` and `llama stack build
--template together --image-type venv` -- all good.
Docker: `llama stack build --template together --image-type container`
works!