llama-stack-mirror/tests/unit/providers/utils/inference
Sumanth Kamenani, commit bd35aa4d78, 2025-12-19 15:53:53 -08:00
feat: enable streaming usage metrics for OpenAI-compatible providers (#4326)
Inject `stream_options={"include_usage": True}` when streaming and
OpenTelemetry telemetry is active. Telemetry always overrides any caller
preference to ensure complete and consistent observability metrics.

Changes:
- Add conditional stream_options injection to OpenAIMixin (benefits
OpenAI, Bedrock, Runpod, Together, Fireworks providers)
- Add conditional stream_options injection to LiteLLMOpenAIMixin
(benefits WatsonX and other litellm-based providers)
- Check telemetry status using trace.get_current_span().is_recording()
- Override include_usage=False when telemetry is active to prevent metric
gaps (see the sketch below)
- Add unit tests for this functionality
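For concreteness, here is a minimal sketch of that injection logic. The helper name `_maybe_inject_usage_options` and the dict-shaped params are illustrative only, not the actual OpenAIMixin code:

```python
from opentelemetry import trace


def _maybe_inject_usage_options(stream: bool, params: dict) -> dict:
    """Force include_usage when streaming while an OTel span is recording."""
    if stream and trace.get_current_span().is_recording():
        # Telemetry overrides any caller preference (even an explicit
        # include_usage=False) so streaming usage metrics are never missing.
        stream_options = dict(params.get("stream_options") or {})
        stream_options["include_usage"] = True
        params = {**params, "stream_options": stream_options}
    return params
```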

Fixes #3981

Note: this work originated in PR #4200, which I closed after rebasing on
the telemetry changes. This PR rebases those commits, incorporates the
Bedrock feedback, and carries forward the same scope described there.
## Test Plan
#### OpenAIMixin + telemetry injection tests
PYTHONPATH=src python -m pytest tests/unit/providers/utils/inference/test_openai_mixin.py

#### LiteLLM OpenAIMixin tests
PYTHONPATH=src python -m pytest tests/unit/providers/inference/test_litellm_openai_mixin.py -v

#### Broader inference provider tests
PYTHONPATH=src python -m pytest tests/unit/providers/inference/ --ignore=tests/unit/providers/inference/test_inference_client_caching.py -v
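
As a hedged illustration of the kind of assertion those unit tests make, here is a sketch written against the illustrative helper above rather than the real test file:

```python
from unittest.mock import MagicMock, patch


def test_include_usage_forced_when_span_recording():
    # Simulate an active OpenTelemetry span so the telemetry check passes.
    recording_span = MagicMock()
    recording_span.is_recording.return_value = True
    with patch("opentelemetry.trace.get_current_span", return_value=recording_span):
        params = _maybe_inject_usage_options(
            stream=True,
            params={"stream_options": {"include_usage": False}},
        )
    # The caller's include_usage=False preference is overridden.
    assert params["stream_options"]["include_usage"] is True
```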
| File | Last commit | Date |
|---|---|---|
| test_openai_compat.py | feat: enable streaming usage metrics for OpenAI-compatible providers (#4326) | 2025-12-19 15:53:53 -08:00 |
| test_openai_mixin.py | feat: enable streaming usage metrics for OpenAI-compatible providers (#4326) | 2025-12-19 15:53:53 -08:00 |
| test_prompt_adapter.py | fix: rename llama_stack_api dir (#4155) | 2025-11-13 15:04:36 -08:00 |
| test_remote_inference_provider_config.py | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |