llama-stack-mirror/tests/unit
Akram Ben Aissi 4842145202
feat: Add dynamic authentication token forwarding support for vLLM (#3388)
# What does this PR do?


*Add dynamic authentication token forwarding support for vLLM provider*

This enables per-request authentication tokens for vLLM providers,
supporting use cases like RAG operations where different requests may
need different authentication tokens. The implementation follows the
same pattern as other providers like Together AI, Fireworks, and
Passthrough.

- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly

Usage:

- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token

All existing functionality is preserved while adding new dynamic
capabilities.

## Test Plan

```
curl -X POST "http://localhost:8000/v1/chat/completions" -H "Authorization: Bearer my-dynamic-token" \
  -H "X-LlamaStack-Provider-Data: {\"vllm_api_token\": \"Bearer my-dynamic-token\", \"vllm_url\": \"http://dynamic-server:8000\"}" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]}'
```

---------

Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
2025-09-18 11:13:55 +02:00
| Name | Last commit | Date |
|---|---|---|
| cli | chore(rename): move llama_stack.distribution to llama_stack.core (#2975) | 2025-07-30 23:30:53 -07:00 |
| distribution | fix: Fixing prompts import warning (#3455) | 2025-09-17 10:24:58 +02:00 |
| files | chore(files tests): update files integration tests and fix inline::localfs (#3195) | 2025-08-20 14:22:40 -04:00 |
| models | chore(test): migrate unit tests from unittest to pytest for system prompt (#2789) | 2025-07-18 11:54:02 +02:00 |
| prompts/prompts | feat: Adding OpenAI Prompts API (#3319) | 2025-09-08 11:05:13 -04:00 |
| providers | feat: Add dynamic authentication token forwarding support for vLLM (#3388) | 2025-09-18 11:13:55 +02:00 |
| rag | fix: pre-commit issues: non executable shebang file and removal of @pytest.mark.asyncio decorator (#3397) | 2025-09-10 15:27:35 +02:00 |
| registry | fix: Added a bug fix when registering new models (#3453) | 2025-09-16 19:09:06 -07:00 |
| server | feat: Add Kubernetes auth provider to use SelfSubjectReview and kubernetes api server (#2559) | 2025-09-08 11:25:10 +02:00 |
| utils | chore: introduce write queue for inference_store (#3383) | 2025-09-10 11:57:42 -07:00 |
| __init__.py | chore: Add fixtures to conftest.py (#2067) | 2025-05-06 13:57:48 +02:00 |
| conftest.py | chore: block network access from unit tests (#2732) | 2025-07-12 16:53:54 -07:00 |
| fixtures.py | chore(rename): move llama_stack.distribution to llama_stack.core (#2975) | 2025-07-30 23:30:53 -07:00 |
| README.md | test: Measure and track code coverage (#2636) | 2025-07-18 18:08:36 +02:00 |

Llama Stack Unit Tests

Unit Tests

Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.

Prerequisites

  1. Python Environment: Ensure you have Python 3.12+ installed
  2. uv Package Manager: Install uv if not already installed

Run the unit tests with:

./scripts/unit-tests.sh [PYTEST_ARGS]

Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, the script defaults to tests/unit. To run a single test file with verbose output:

./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv

To run the tests against a non-default Python version (the default is currently 3.12), set the PYTHON_VERSION environment variable:

source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh

Test Configuration

  • Test Discovery: Tests are automatically discovered in the tests/unit/ directory
  • Async Support: Tests use --asyncio-mode=auto for automatic async test handling
  • Coverage: Tests generate coverage reports in the htmlcov/ directory
  • Python Version: Defaults to Python 3.12, but can be overridden with the PYTHON_VERSION environment variable
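
As a minimal sketch of what the async setting means in practice: with pytest-asyncio in auto mode, a plain `async def` test is collected and awaited without a `@pytest.mark.asyncio` marker. The coroutine `fetch_value` below is a hypothetical stand-in for code under test and does not come from this repository.

```python
import asyncio


async def fetch_value(delay: float = 0.0) -> int:
    """Hypothetical coroutine standing in for the code under test."""
    await asyncio.sleep(delay)
    return 42


# No @pytest.mark.asyncio decorator: with --asyncio-mode=auto, pytest-asyncio
# treats every `async def test_*` function as an asyncio test.
async def test_fetch_value_returns_expected():
    result = await fetch_value()
    assert result == 42
```

Saved as a file under tests/unit/ with a name pytest discovers (for example, one starting with test_), this would run through ./scripts/unit-tests.sh like any other test.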

Coverage Reports

After running tests, you can view coverage reports:

# Open HTML coverage report in browser
open htmlcov/index.html  # macOS
xdg-open htmlcov/index.html  # Linux
start htmlcov/index.html  # Windows
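
If a single cross-platform command is preferable to the three platform-specific ones, the Python standard library can open the report as well. This is only a sketch: the function names are hypothetical, and it assumes the report was written to htmlcov/index.html under the repository root, as described above.

```python
import webbrowser
from pathlib import Path


def coverage_report_uri(repo_root: str = ".") -> str:
    """Return a file:// URI for the HTML coverage report under repo_root."""
    report = Path(repo_root) / "htmlcov" / "index.html"
    # resolve() makes the path absolute, which as_uri() requires.
    return report.resolve().as_uri()


def open_coverage_report(repo_root: str = ".") -> bool:
    """Open the report in the default browser; returns webbrowser.open's result."""
    return webbrowser.open(coverage_report_uri(repo_root))
```

Calling `open_coverage_report()` from the repository root after a test run should have the same effect as the commands above.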