ci: integrate vLLM inference tests with GitHub Actions workflows

Add vLLM provider support to integration test CI workflows alongside
existing Ollama support. Configure provider-specific test execution so
that vLLM runs only inference-specific tests (excluding vision tests),
while Ollama continues to run the full test suite.

This enables comprehensive CI testing of both inference providers while
keeping the vLLM footprint small. Coverage can be expanded later if it
proves not to be too disruptive.

Also update test skips that were marked with "inline::vllm"; the correct
provider type is "remote::vllm". With the marker fixed, some failing
log probs tests are now skipped, and these should be revisited.
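
A minimal sketch of why the marker matters, assuming skips are decided by comparing the active provider type against a set of unsupported types. The names `LOGPROBS_UNSUPPORTED` and `should_skip_logprobs` are hypothetical and do not come from the repository:

```python
# Hypothetical skip list: provider types for which log probs tests are skipped.
# Before the fix this would have held "inline::vllm", which no provider reports.
LOGPROBS_UNSUPPORTED = {"remote::vllm"}


def should_skip_logprobs(provider_type: str) -> bool:
    """Return True when log probs tests should be skipped for this provider."""
    return provider_type in LOGPROBS_UNSUPPORTED


# The vLLM provider reports itself as "remote::vllm", so only that string matches.
print(should_skip_logprobs("remote::vllm"))
print(should_skip_logprobs("inline::vllm"))
```

Under this assumption, an entry spelled "inline::vllm" is silently inert: the tests run against vLLM and fail instead of being skipped.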

Signed-off-by: Derek Higgins <derekh@redhat.com>
Derek Higgins 2025-08-13 14:19:52 +01:00
parent 8075ca77ab
commit 0a6d122623
4 changed files with 11 additions and 3 deletions

@@ -72,7 +72,8 @@ runs:
 echo "New recordings detected, committing and pushing"
 git add tests/integration/
-git commit -m "Recordings update from CI (suite: ${{ inputs.suite }})"
+git commit -m "Recordings update from CI (setup: ${{ inputs.setup }}, suite: ${{ inputs.suite }})"
 git fetch origin ${{ github.ref_name }}
 git rebase origin/${{ github.ref_name }}
 echo "Rebased successfully"
@@ -88,6 +89,8 @@ runs:
 run: |
 # Ollama logs (if ollama container exists)
 sudo docker logs ollama > ollama-${{ inputs.inference-mode }}.log 2>&1 || true
+# vllm logs (if vllm container exists)
+sudo docker logs vllm > vllm-${{ inputs.inference-mode }}.log 2>&1 || true
 # Note: distro container logs are now dumped in integration-tests.sh before container is removed
 - name: Upload logs

@@ -27,7 +27,6 @@ on:
 schedule:
 # If changing the cron schedule, update the provider in the test-matrix job
 - cron: '0 0 * * *' # (test latest client) Daily at 12 AM UTC
-- cron: '1 0 * * 0' # (test vllm) Weekly on Sunday at 1 AM UTC
 workflow_dispatch:
 inputs:
 test-all-client-versions:

@@ -2,7 +2,8 @@
 "default": [
 {"suite": "base", "setup": "ollama"},
 {"suite": "vision", "setup": "ollama-vision"},
-{"suite": "responses", "setup": "gpt"}
+{"suite": "responses", "setup": "gpt"},
+{"suite": "base-vllm-subset", "setup": "vllm"}
 ],
 "schedules": {
 "1 0 * * 0": [

@@ -169,6 +169,11 @@ SUITE_DEFINITIONS: dict[str, Suite] = {
         roots=base_roots,
         default_setup="ollama",
     ),
+    "base-vllm-subset": Suite(
+        name="base-vllm-subset",
+        roots=["tests/integration/inference"],
+        default_setup="vllm",
+    ),
     "responses": Suite(
         name="responses",
         roots=["tests/integration/responses"],
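
To illustrate how the new matrix entry and the new suite definition might fit together, here is a hedged sketch. The `Suite` dataclass is a stand-in carrying only the three fields visible in this diff, and `resolve_matrix_entry` is a hypothetical helper, not the project's actual runner logic:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Suite:
    """Stand-in for the real Suite type; only fields seen in the diff."""
    name: str
    roots: list[str]
    default_setup: Optional[str] = None


# Mirrors the entry added by this commit.
SUITE_DEFINITIONS: dict[str, Suite] = {
    "base-vllm-subset": Suite(
        name="base-vllm-subset",
        roots=["tests/integration/inference"],
        default_setup="vllm",
    ),
}


def resolve_matrix_entry(entry: dict[str, str]) -> tuple[list[str], Optional[str]]:
    """Map a CI matrix entry to (test roots, setup name).

    Assumption: an explicit "setup" in the matrix entry takes precedence,
    and the suite's default_setup is used only when it is absent.
    """
    suite = SUITE_DEFINITIONS[entry["suite"]]
    setup = entry.get("setup") or suite.default_setup
    return suite.roots, setup


roots, setup = resolve_matrix_entry({"suite": "base-vllm-subset", "setup": "vllm"})
print(roots, setup)
```

Restricting `roots` to `tests/integration/inference` is what keeps the vLLM run limited to inference tests, while the Ollama-backed `base` suite keeps its broader `base_roots`.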