Add vLLM provider support to integration test CI workflows alongside
existing Ollama support. Configure provider-specific test execution
where vLLM runs only inference-specific tests (excluding vision tests) while
Ollama continues to run the full test suite.
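For illustration, a minimal sketch of the per-provider test selection, expressed here via pytest.main; the real selection lives in the CI workflow YAML, and the test paths and the "vision" marker expression are assumptions:

```
import pytest

def run_provider_tests(provider: str) -> int:
    # Hypothetical sketch: vLLM gets a narrow test selection, Ollama the
    # full suite. Paths and marker names are illustrative, not the
    # actual workflow values.
    if provider == "vllm":
        # vLLM: inference tests only, vision tests excluded.
        return pytest.main(["tests/integration/inference", "-m", "not vision"])
    # Ollama: the full integration suite.
    return pytest.main(["tests/integration"])
```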
This enables comprehensive CI testing of both inference providers while
keeping the vLLM footprint small; the vLLM coverage can be expanded later
if it proves not to be too disruptive.
Also updated test skips that were marked with "inline::vllm"; these
should be "remote::vllm". This causes some failing log probs tests
to be skipped and should be revisited.
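For context, a minimal sketch of the corrected skip pattern (the helper name, the set-based check, and the skip reason are illustrative, not the repo's actual helper):

```
import pytest

# "inline::vllm" never matched the provider type used in these runs, so
# the skip was silently inert; the correct provider type is "remote::vllm".
PROVIDERS_WITHOUT_LOGPROBS = {"remote::vllm"}

def maybe_skip_logprobs(provider_type: str) -> None:
    # Skip log probs tests for providers that don't support them.
    if provider_type in PROVIDERS_WITHOUT_LOGPROBS:
        pytest.skip(f"log probs not supported for {provider_type}")
```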
Signed-off-by: Derek Higgins <derekh@redhat.com>
Added a script to clean up recordings. While doing this, moved the CI
matrix generation into a separate script so there is a single source of
truth for the matrix.
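A rough sketch of what such a generator might look like (the matrix keys and values below are illustrative assumptions; in a GitHub Actions workflow the emitted JSON can be consumed via fromJSON):

```
#!/usr/bin/env python3
"""Hypothetical sketch of the matrix-generation script: emit the CI
matrix as JSON so the workflow does not hardcode it."""
import json

def generate_matrix() -> dict:
    # Single source of truth: both the CI workflow and the recording
    # cleanup script would call this instead of duplicating the matrix.
    return {
        "include": [
            {"provider": "ollama", "test-subset": "all"},
            {"provider": "vllm", "test-subset": "inference"},
        ]
    }

if __name__ == "__main__":
    print(json.dumps(generate_matrix()))
```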
Ran the cleanup script as:
```
PYTHONPATH=. python scripts/cleanup_recordings.py
```
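Roughly, the script's job is to delete recordings that no current (provider, test) combination would produce. Sketched below with an assumed recordings path and a stubbed source of expected recordings; the real script would derive them from the shared matrix:

```
#!/usr/bin/env python3
"""Hypothetical sketch of scripts/cleanup_recordings.py: remove recording
files that no current (provider, test) combination would produce."""
from pathlib import Path

RECORDINGS_DIR = Path("tests/integration/recordings")  # assumed location

def expected_recordings() -> set[Path]:
    # In the real script this set would come from the shared matrix
    # definition; stubbed empty here for illustration.
    return set()

def main() -> None:
    keep = expected_recordings()
    for path in sorted(RECORDINGS_DIR.rglob("*.json")):
        if path not in keep:
            print(f"removing stale recording: {path}")
            path.unlink()

if __name__ == "__main__":
    main()
```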
Also added this cleanup step to the pre-commit workflow to ensure that
the recordings are always up to date and that no stale recordings are
left in the repo.