mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 04:04:14 +00:00
Implements a comprehensive recording and replay system for inference API calls that eliminates dependency on online inference providers during testing. The system treats inference as deterministic by recording real API responses and replaying them in subsequent test runs. Applies to OpenAI clients (which should cover many inference requests) as well as Ollama AsyncClient. For storing, we use a hybrid system: Sqlite for fast lookups and JSON files for easy greppability / debuggability. As expected, tests become much much faster (more than 3x in just inference testing.) ```bash LLAMA_STACK_TEST_INFERENCE_MODE=record LLAMA_STACK_TEST_RECORDING_DIR=<...> \ uv run pytest -s -v tests/integration/inference \ --stack-config=starter \ -k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \ --text-model="ollama/llama3.2:3b-instruct-fp16" \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 ``` ```bash LLAMA_STACK_TEST_INFERENCE_MODE=replay LLAMA_STACK_TEST_RECORDING_DIR=<...> \ uv run pytest -s -v tests/integration/inference \ --stack-config=starter \ -k "not( builtin_tool or safety_with_image or code_interpreter or test_rag )" \ --text-model="ollama/llama3.2:3b-instruct-fp16" \ --embedding-model=sentence-transformers/all-MiniLM-L6-v2 ``` - `LLAMA_STACK_TEST_INFERENCE_MODE`: `live` (default), `record`, or `replay` - `LLAMA_STACK_TEST_RECORDING_DIR`: Storage location (must be specified for record or replay modes) |
||
---|---|---|
.. | ||
cli | ||
distribution | ||
files | ||
models | ||
providers | ||
rag | ||
registry | ||
server | ||
utils | ||
__init__.py | ||
conftest.py | ||
fixtures.py | ||
README.md |
Llama Stack Unit Tests
Unit Tests
Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.
Prerequisites
- Python Environment: Ensure you have Python 3.12+ installed
- uv Package Manager: Install
uv
if not already installed
You can run the unit tests by running:
./scripts/unit-tests.sh [PYTEST_ARGS]
Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, it defaults to "tests/unit", e.g:
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
If you'd like to run for a non-default version of Python (currently 3.12), pass PYTHON_VERSION
variable as follows:
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
Test Configuration
- Test Discovery: Tests are automatically discovered in the
tests/unit/
directory - Async Support: Tests use
--asyncio-mode=auto
for automatic async test handling - Coverage: Tests generate coverage reports in
htmlcov/
directory - Python Version: Defaults to Python 3.12, but can be overridden with
PYTHON_VERSION
environment variable
Coverage Reports
After running tests, you can view coverage reports:
# Open HTML coverage report in browser
open htmlcov/index.html # macOS
xdg-open htmlcov/index.html # Linux
start htmlcov/index.html # Windows