Renames `inference_recorder.py` to `api_recorder.py` and extends it to
support recording/replaying tool invocations in addition to inference
calls.
This allows us to record web-search, etc. tool calls and thereafter
apply recordings for `tests/integration/responses`
## Test Plan
```
export OPENAI_API_KEY=...
export TAVILY_SEARCH_API_KEY=...
./scripts/integration-tests.sh --stack-config ci-tests \
--suite responses --inference-mode record-if-missing
```
Our integration tests need to be 'grouped' because each group often
needs a specific set of models it works with. We separated vision tests
due to this, and we have a separate set of tests which test "Responses"
API.
This PR makes this system a bit more official so it is very easy to
target these groups and apply all testing infrastructure towards all the
groups (for example, record-replay) uniformly.
There are three suites declared:
- base
- vision
- responses
Note that our CI currently runs the "base" and "vision" suites.
You can use the `--suite` option when running pytest (or any of the
testing scripts or workflows.) For example:
```
OLLAMA_URL=http://localhost:11434 \
pytest -s -v tests/integration/ --stack-config starter --suite vision
```
2025-09-05 13:58:49 -07:00
Renamed from tests/integration/non_ci/responses/fixtures/test_cases.py (Browse further)