feat(tests): introduce a test "suite" concept to encompass dirs, options (#3339)
Some checks failed
Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 0s
Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped
Python Package Build Test / build (3.13) (push) Failing after 1s
SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 4s
Integration Tests (Replay) / Integration Tests (, , , client=, ) (push) Failing after 4s
Vector IO Integration Tests / test-matrix (push) Failing after 4s
Python Package Build Test / build (3.12) (push) Failing after 3s
Test External API and Providers / test-external (venv) (push) Failing after 4s
Unit Tests / unit-tests (3.12) (push) Failing after 4s
Unit Tests / unit-tests (3.13) (push) Failing after 3s
UI Tests / ui-tests (22) (push) Successful in 33s
Pre-commit / pre-commit (push) Successful in 1m15s

Our integration tests need to be 'grouped' because each group often
needs a specific set of models it works with. We separated vision tests
due to this, and we have a separate set of tests which test "Responses"
API.

This PR makes this system a bit more official so it is very easy to
target these groups and apply all testing infrastructure towards all the
groups (for example, record-replay) uniformly.

There are three suites declared:
- base
- vision
- responses

Note that our CI currently runs the "base" and "vision" suites.

You can use the `--suite` option when running pytest (or any of the
testing scripts or workflows.) For example:
```
OLLAMA_URL=http://localhost:11434 \
  pytest -s -v tests/integration/ --stack-config starter --suite vision
```
This commit is contained in:
Ashwin Bharambe 2025-09-05 13:58:49 -07:00 committed by GitHub
parent 0c2757a05b
commit 47b640370e
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
25 changed files with 255 additions and 161 deletions

View file

@ -2,13 +2,6 @@ name: 'Run and Record Tests'
description: 'Run integration tests and handle recording/artifact upload'
inputs:
test-subdirs:
description: 'Comma-separated list of test subdirectories to run'
required: true
test-pattern:
description: 'Regex pattern to pass to pytest -k'
required: false
default: ''
stack-config:
description: 'Stack configuration to use'
required: true
@ -18,10 +11,18 @@ inputs:
inference-mode:
description: 'Inference mode (record or replay)'
required: true
run-vision-tests:
description: 'Whether to run vision tests'
test-suite:
description: 'Test suite to use: base, responses, vision, etc.'
required: false
default: 'false'
default: ''
test-subdirs:
description: 'Comma-separated list of test subdirectories to run; overrides test-suite'
required: false
default: ''
test-pattern:
description: 'Regex pattern to pass to pytest -k'
required: false
default: ''
runs:
using: 'composite'
@ -42,7 +43,7 @@ runs:
--test-subdirs '${{ inputs.test-subdirs }}' \
--test-pattern '${{ inputs.test-pattern }}' \
--inference-mode '${{ inputs.inference-mode }}' \
${{ inputs.run-vision-tests == 'true' && '--run-vision-tests' || '' }} \
--test-suite '${{ inputs.test-suite }}' \
| tee pytest-${{ inputs.inference-mode }}.log
@ -57,12 +58,7 @@ runs:
echo "New recordings detected, committing and pushing"
git add tests/integration/recordings/
if [ "${{ inputs.run-vision-tests }}" == "true" ]; then
git commit -m "Recordings update from CI (vision)"
else
git commit -m "Recordings update from CI"
fi
git commit -m "Recordings update from CI (test-suite: ${{ inputs.test-suite }})"
git fetch origin ${{ github.ref_name }}
git rebase origin/${{ github.ref_name }}
echo "Rebased successfully"

View file

@ -1,17 +1,17 @@
name: Setup Ollama
description: Start Ollama
inputs:
run-vision-tests:
description: 'Run vision tests: "true" or "false"'
test-suite:
description: 'Test suite to use: base, responses, vision, etc.'
required: false
default: 'false'
default: ''
runs:
using: "composite"
steps:
- name: Start Ollama
shell: bash
run: |
if [ "${{ inputs.run-vision-tests }}" == "true" ]; then
if [ "${{ inputs.test-suite }}" == "vision" ]; then
image="ollama-with-vision-model"
else
image="ollama-with-models"

View file

@ -12,10 +12,10 @@ inputs:
description: 'Provider to setup (ollama or vllm)'
required: true
default: 'ollama'
run-vision-tests:
description: 'Whether to setup provider for vision tests'
test-suite:
description: 'Test suite to use: base, responses, vision, etc.'
required: false
default: 'false'
default: ''
inference-mode:
description: 'Inference mode (record or replay)'
required: true
@ -33,7 +33,7 @@ runs:
if: ${{ inputs.provider == 'ollama' && inputs.inference-mode == 'record' }}
uses: ./.github/actions/setup-ollama
with:
run-vision-tests: ${{ inputs.run-vision-tests }}
test-suite: ${{ inputs.test-suite }}
- name: Setup vllm
if: ${{ inputs.provider == 'vllm' && inputs.inference-mode == 'record' }}

View file

@ -8,7 +8,7 @@ Llama Stack uses GitHub Actions for Continuous Integration (CI). Below is a tabl
| Installer CI | [install-script-ci.yml](install-script-ci.yml) | Test the installation script |
| Integration Auth Tests | [integration-auth-tests.yml](integration-auth-tests.yml) | Run the integration test suite with Kubernetes authentication |
| SqlStore Integration Tests | [integration-sql-store-tests.yml](integration-sql-store-tests.yml) | Run the integration test suite with SqlStore |
| Integration Tests (Replay) | [integration-tests.yml](integration-tests.yml) | Run the integration test suite from tests/integration in replay mode |
| Integration Tests (Replay) | [integration-tests.yml](integration-tests.yml) | Run the integration test suites from tests/integration in replay mode |
| Vector IO Integration Tests | [integration-vector-io-tests.yml](integration-vector-io-tests.yml) | Run the integration test suite with various VectorIO providers |
| Pre-commit | [pre-commit.yml](pre-commit.yml) | Run pre-commit checks |
| Test Llama Stack Build | [providers-build.yml](providers-build.yml) | Test llama stack build |

View file

@ -1,6 +1,6 @@
name: Integration Tests (Replay)
run-name: Run the integration test suite from tests/integration in replay mode
run-name: Run the integration test suites from tests/integration in replay mode
on:
push:
@ -32,14 +32,6 @@ on:
description: 'Test against a specific provider'
type: string
default: 'ollama'
test-subdirs:
description: 'Comma-separated list of test subdirectories to run'
type: string
default: ''
test-pattern:
description: 'Regex pattern to pass to pytest -k'
type: string
default: ''
concurrency:
# Skip concurrency for pushes to main - each commit should be tested independently
@ -50,7 +42,7 @@ jobs:
run-replay-mode-tests:
runs-on: ubuntu-latest
name: ${{ format('Integration Tests ({0}, {1}, {2}, client={3}, vision={4})', matrix.client-type, matrix.provider, matrix.python-version, matrix.client-version, matrix.run-vision-tests) }}
name: ${{ format('Integration Tests ({0}, {1}, {2}, client={3}, {4})', matrix.client-type, matrix.provider, matrix.python-version, matrix.client-version, matrix.test-suite) }}
strategy:
fail-fast: false
@ -61,7 +53,7 @@ jobs:
# Use Python 3.13 only on nightly schedule (daily latest client test), otherwise use 3.12
python-version: ${{ github.event.schedule == '0 0 * * *' && fromJSON('["3.12", "3.13"]') || fromJSON('["3.12"]') }}
client-version: ${{ (github.event.schedule == '0 0 * * *' || github.event.inputs.test-all-client-versions == 'true') && fromJSON('["published", "latest"]') || fromJSON('["latest"]') }}
run-vision-tests: [true, false]
test-suite: [base, vision]
steps:
- name: Checkout repository
@ -73,15 +65,13 @@ jobs:
python-version: ${{ matrix.python-version }}
client-version: ${{ matrix.client-version }}
provider: ${{ matrix.provider }}
run-vision-tests: ${{ matrix.run-vision-tests }}
test-suite: ${{ matrix.test-suite }}
inference-mode: 'replay'
- name: Run tests
uses: ./.github/actions/run-and-record-tests
with:
test-subdirs: ${{ inputs.test-subdirs }}
test-pattern: ${{ inputs.test-pattern }}
stack-config: ${{ matrix.client-type == 'library' && 'ci-tests' || 'server:ci-tests' }}
provider: ${{ matrix.provider }}
inference-mode: 'replay'
run-vision-tests: ${{ matrix.run-vision-tests }}
test-suite: ${{ matrix.test-suite }}

View file

@ -10,18 +10,18 @@ run-name: Run the integration test suite from tests/integration
on:
workflow_dispatch:
inputs:
test-subdirs:
description: 'Comma-separated list of test subdirectories to run'
type: string
default: ''
test-provider:
description: 'Test against a specific provider'
type: string
default: 'ollama'
run-vision-tests:
description: 'Whether to run vision tests'
type: boolean
default: false
test-suite:
description: 'Test suite to use: base, responses, vision, etc.'
type: string
default: ''
test-subdirs:
description: 'Comma-separated list of test subdirectories to run; overrides test-suite'
type: string
default: ''
test-pattern:
description: 'Regex pattern to pass to pytest -k'
type: string
@ -38,11 +38,11 @@ jobs:
- name: Echo workflow inputs
run: |
echo "::group::Workflow Inputs"
echo "test-subdirs: ${{ inputs.test-subdirs }}"
echo "test-provider: ${{ inputs.test-provider }}"
echo "run-vision-tests: ${{ inputs.run-vision-tests }}"
echo "test-pattern: ${{ inputs.test-pattern }}"
echo "branch: ${{ github.ref_name }}"
echo "test-provider: ${{ inputs.test-provider }}"
echo "test-suite: ${{ inputs.test-suite }}"
echo "test-subdirs: ${{ inputs.test-subdirs }}"
echo "test-pattern: ${{ inputs.test-pattern }}"
echo "::endgroup::"
- name: Checkout repository
@ -56,15 +56,15 @@ jobs:
python-version: "3.12" # Use single Python version for recording
client-version: "latest"
provider: ${{ inputs.test-provider || 'ollama' }}
run-vision-tests: ${{ inputs.run-vision-tests }}
test-suite: ${{ inputs.test-suite }}
inference-mode: 'record'
- name: Run and record tests
uses: ./.github/actions/run-and-record-tests
with:
test-pattern: ${{ inputs.test-pattern }}
test-subdirs: ${{ inputs.test-subdirs }}
stack-config: 'server:ci-tests' # recording must be done with server since more tests are run
provider: ${{ inputs.test-provider || 'ollama' }}
inference-mode: 'record'
run-vision-tests: ${{ inputs.run-vision-tests }}
test-suite: ${{ inputs.test-suite }}
test-subdirs: ${{ inputs.test-subdirs }}
test-pattern: ${{ inputs.test-pattern }}