llama-stack-mirror/tests/unit
Sumanth Kamenani bd35aa4d78
feat: enable streaming usage metrics for OpenAI-compatible providers (#4326)
Inject `stream_options={"include_usage": True}` when streaming and
OpenTelemetry telemetry is active. Telemetry always overrides any caller
preference to ensure complete and consistent observability metrics.
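
As a rough sketch of this behavior (a hypothetical helper for illustration; the actual mixin code is structured differently), the injection boils down to:

```python
from opentelemetry import trace

def _inject_stream_options(params: dict) -> dict:
    # Hypothetical sketch: when the request is streaming and an
    # OpenTelemetry span is actively recording, force include_usage=True
    # so the final stream chunk carries token counts.
    if params.get("stream") and trace.get_current_span().is_recording():
        # Telemetry overrides any caller-supplied preference, including an
        # explicit include_usage=False, to avoid gaps in usage metrics.
        params["stream_options"] = {"include_usage": True}
    return params
```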

Changes:
- Add conditional stream_options injection to OpenAIMixin (benefits
OpenAI, Bedrock, Runpod, Together, Fireworks providers)
- Add conditional stream_options injection to LiteLLMOpenAIMixin
(benefits WatsonX and other litellm-based providers)
- Check telemetry status using trace.get_current_span().is_recording()
- Override include_usage=False when telemetry active to prevent metric
gaps
- Add unit tests covering this behavior

Fixes #3981

Note: this work originated in PR #4200, which I closed after rebasing on
the telemetry changes. This PR rebases those commits, incorporates the
Bedrock feedback, and carries forward the same scope described there.
## Test Plan
#### OpenAIMixin + telemetry injection tests

```bash
PYTHONPATH=src python -m pytest tests/unit/providers/utils/inference/test_openai_mixin.py
```

#### LiteLLM OpenAIMixin tests
```bash
PYTHONPATH=src python -m pytest tests/unit/providers/inference/test_litellm_openai_mixin.py -v
```

#### Broader inference provider tests

```bash
PYTHONPATH=src python -m pytest tests/unit/providers/inference/ --ignore=tests/unit/providers/inference/test_inference_client_caching.py -v
```
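
The unit tests assert this injection behavior; a minimal sketch of the idea, reusing the hypothetical `_inject_stream_options` helper from above (the real tests exercise the mixins directly):

```python
from unittest.mock import MagicMock, patch

def test_usage_injected_when_telemetry_active():
    # Simulate an active, recording OpenTelemetry span.
    span = MagicMock()
    span.is_recording.return_value = True
    with patch("opentelemetry.trace.get_current_span", return_value=span):
        params = _inject_stream_options(
            {"stream": True, "stream_options": {"include_usage": False}}
        )
    # Telemetry overrides the caller's explicit include_usage=False.
    assert params["stream_options"] == {"include_usage": True}
```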
2025-12-19 15:53:53 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| `cli` | feat: remove usage of build yaml (#4192) | 2025-12-10 10:12:12 +01:00 |
| `conversations` | feat: remove usage of build yaml (#4192) | 2025-12-10 10:12:12 +01:00 |
| `core` | feat: Enhance Vector Stores config with full configurations (#4397) | 2025-12-17 16:56:46 -05:00 |
| `distribution` | feat: convert Datasets API to use FastAPI router (#4359) | 2025-12-15 11:23:04 -08:00 |
| `files` | refactor(storage): make { kvstore, sqlstore } as llama stack "internal" APIs (#4181) | 2025-11-18 13:15:16 -08:00 |
| `models` | refactor: remove dead inference API code and clean up imports (#4093) | 2025-11-10 15:29:24 -08:00 |
| `prompts/prompts` | feat: remove usage of build yaml (#4192) | 2025-12-10 10:12:12 +01:00 |
| `providers` | feat: enable streaming usage metrics for OpenAI-compatible providers (#4326) | 2025-12-19 15:53:53 -08:00 |
| `rag` | feat: Making static prompt values in Rag/File Search configurable in Vector Store Config (#4368) | 2025-12-15 11:39:01 -05:00 |
| `registry` | refactor(storage): make { kvstore, sqlstore } as llama stack "internal" APIs (#4181) | 2025-11-18 13:15:16 -08:00 |
| `server` | fix(server): add middleware for provider data and test context (#4367) | 2025-12-16 15:00:48 -05:00 |
| `tools` | fix: rename llama_stack_api dir (#4155) | 2025-11-13 15:04:36 -08:00 |
| `utils` | fix(inference): respect table_name config in InferenceStore (#4371) | 2025-12-11 14:50:23 +01:00 |
| `__init__.py` | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| `conftest.py` | test: suppress expected error logs in SSE test (#3886) | 2025-10-22 14:34:32 -07:00 |
| `fixtures.py` | refactor(storage): make { kvstore, sqlstore } as llama stack "internal" APIs (#4181) | 2025-11-18 13:15:16 -08:00 |
| `README.md` | test: Measure and track code coverage (#2636) | 2025-07-18 18:08:36 +02:00 |

# Llama Stack Unit Tests

## Unit Tests

Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.

### Prerequisites

  1. Python Environment: Ensure you have Python 3.12+ installed
  2. uv Package Manager: Install uv if not already installed (one installation option is shown below)
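
If uv is missing, one common way to install it is via its standalone installer (check the uv documentation for current instructions; `pip install uv` also works):

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```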

You can run the unit tests with:

```bash
./scripts/unit-tests.sh [PYTEST_ARGS]
```

Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., `-vvv` for verbosity). If no test directory is specified, it defaults to `tests/unit`, e.g.:

```bash
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
```

If you'd like to run against a non-default version of Python (the default is 3.12), pass the `PYTHON_VERSION` environment variable as follows:

```bash
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
```

### Test Configuration

- **Test Discovery**: Tests are automatically discovered in the `tests/unit/` directory
- **Async Support**: Tests use `--asyncio-mode=auto` for automatic async test handling (see the example after this list)
- **Coverage**: Tests generate coverage reports in the `htmlcov/` directory
- **Python Version**: Defaults to Python 3.12, but can be overridden with the `PYTHON_VERSION` environment variable
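
For illustration, with `--asyncio-mode=auto` a plain `async def` test is collected and run without an explicit `@pytest.mark.asyncio` marker. A minimal, hypothetical example (not a test from this suite):

```python
import asyncio

# Hypothetical test: under --asyncio-mode=auto, pytest runs this coroutine
# directly, with no @pytest.mark.asyncio decorator required.
async def test_async_sleep_completes():
    await asyncio.sleep(0)
    assert True
```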

### Coverage Reports

After running tests, you can view coverage reports:

```bash
# Open the HTML coverage report in a browser
open htmlcov/index.html      # macOS
xdg-open htmlcov/index.html  # Linux
start htmlcov/index.html     # Windows
```