mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-11 05:38:38 +00:00
feat(tests): implement test isolation for inference recordings (#3681)
Uses test_id in request hashes and test-scoped subdirectories to prevent cross-test contamination. Model list endpoints exclude test_id to enable merging recordings from different servers. Additionally, this PR adds a `record-if-missing` mode (which we will use instead of `record` which records everything) which is very useful. 🤖 Co-authored with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude <noreply@anthropic.com>
This commit is contained in:
parent
f176196fba
commit
045a0c1d57
428 changed files with 85345 additions and 104330 deletions
|
@ -31,18 +31,24 @@ These normalizations ensure that re-recording tests produces minimal git diffs,
|
|||
|
||||
## Usage
|
||||
|
||||
### Recording mode
|
||||
Set `LLAMA_STACK_TEST_INFERENCE_MODE=record` to capture new responses:
|
||||
```bash
|
||||
LLAMA_STACK_TEST_INFERENCE_MODE=record pytest tests/integration/
|
||||
```
|
||||
|
||||
### Replay mode (default)
|
||||
Responses are replayed from recordings:
|
||||
```bash
|
||||
LLAMA_STACK_TEST_INFERENCE_MODE=replay pytest tests/integration/
|
||||
```
|
||||
|
||||
### Record-if-missing mode (recommended for adding new tests)
|
||||
Records only when no recording exists, otherwise replays. Use this for iterative development:
|
||||
```bash
|
||||
LLAMA_STACK_TEST_INFERENCE_MODE=record-if-missing pytest tests/integration/
|
||||
```
|
||||
|
||||
### Recording mode
|
||||
**Force-records all API interactions**, overwriting existing recordings. Use with caution:
|
||||
```bash
|
||||
LLAMA_STACK_TEST_INFERENCE_MODE=record pytest tests/integration/
|
||||
```
|
||||
|
||||
### Live mode
|
||||
Skip recordings entirely and use live APIs:
|
||||
```bash
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue