mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-21 07:38:41 +00:00


Llama Stack Unit Tests

Unit Tests

Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.
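As a minimal sketch of what "in isolation" means in practice, the example below replaces an external dependency with a mock so the test needs no network or running service. The helper `fetch_model_names` and the `list_models` client method are hypothetical names used only for illustration; they are not taken from the Llama Stack code base.

```python
from unittest.mock import Mock


def fetch_model_names(client):
    """Return sorted model identifiers from a client (hypothetical helper)."""
    return sorted(model["id"] for model in client.list_models())


def test_fetch_model_names_sorts_ids():
    # Stand in for the external service with a mock, so no network is needed.
    client = Mock()
    client.list_models.return_value = [{"id": "b"}, {"id": "a"}]

    assert fetch_model_names(client) == ["a", "b"]
    # Verify the dependency was called exactly once, with no arguments.
    client.list_models.assert_called_once_with()
```

Because the mock controls every input, the test is deterministic and runs in milliseconds.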

Prerequisites

Python Environment: Ensure you have Python 3.12+ installed
uv Package Manager: Install uv if not already installed

Run the unit tests with:

./scripts/unit-tests.sh [PYTEST_ARGS]

Any additional arguments are passed to pytest. You can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, it defaults to tests/unit. For example, to run a single test file with verbose output:

./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv

To run the tests with a non-default version of Python (the default is currently 3.12), set the PYTHON_VERSION environment variable as follows:

source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh

Test Configuration

Test Discovery: Tests are automatically discovered in the tests/unit/ directory
Async Support: Tests run with --asyncio-mode=auto, so async tests are handled automatically
Coverage: Test runs generate coverage reports in the htmlcov/ directory
Python Version: Defaults to Python 3.12, but can be overridden with the PYTHON_VERSION environment variable
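To illustrate the async support noted above, here is a small sketch of an async test. In pytest-asyncio's auto mode, plain `async def` test functions are collected and run without an explicit `@pytest.mark.asyncio` marker. The coroutine `fetch_value` is a hypothetical stand-in for code under test, not part of this repository.

```python
import asyncio


async def fetch_value(delay: float = 0.0) -> int:
    """Hypothetical coroutine standing in for the code under test."""
    await asyncio.sleep(delay)
    return 42


# Under --asyncio-mode=auto, pytest-asyncio runs this coroutine as a test
# without needing an explicit @pytest.mark.asyncio marker.
async def test_fetch_value_returns_answer():
    assert await fetch_value() == 42
```

Outside pytest, the same coroutine can be exercised with `asyncio.run(test_fetch_value_returns_answer())`.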

Coverage Reports

After running tests, you can view coverage reports:

# Open HTML coverage report in browser
open htmlcov/index.html  # macOS
xdg-open htmlcov/index.html  # Linux
start htmlcov/index.html  # Windows
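
As a platform-independent alternative to the commands above, the report can also be opened from Python's standard library. This is only a sketch: it assumes the report was written to htmlcov/ under the given root, and the function names are illustrative rather than part of the project's tooling.

```python
import webbrowser
from pathlib import Path


def coverage_report_uri(root: str = ".") -> str:
    """Build a file:// URI for the HTML coverage index under `root`."""
    # resolve() yields an absolute path, which as_uri() requires.
    return (Path(root) / "htmlcov" / "index.html").resolve().as_uri()


def open_coverage_report(root: str = ".") -> bool:
    """Open the coverage report in the default browser; True on success."""
    return webbrowser.open(coverage_report_uri(root))
```

Calling `open_coverage_report()` from the repository root after a test run opens the same page as the shell commands.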