mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 04:04:14 +00:00
Llama Stack Unit Tests
Unit Tests
Unit tests verify individual components and functions in isolation. They are fast, reliable, and don't require external services.
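As a minimal sketch of what "in isolation" can look like, the test below replaces an external client with a mock from the standard library, so no service has to be running. The function `fetch_model_names` and the client's `list_models` method are hypothetical names chosen for illustration; they are not taken from the Llama Stack codebase.

```python
from unittest.mock import Mock


def fetch_model_names(client) -> list[str]:
    """Hypothetical function under test: return sorted model ids from a client."""
    return sorted(model["id"] for model in client.list_models())


def test_fetch_model_names_sorts_ids():
    # Stand in for the external service with a mock, so the test needs no network.
    client = Mock()
    client.list_models.return_value = [{"id": "b-model"}, {"id": "a-model"}]

    assert fetch_model_names(client) == ["a-model", "b-model"]
    # Verify the collaborator was called exactly once, with no arguments.
    client.list_models.assert_called_once_with()
```

Because the only collaborator is mocked, a test like this stays fast and deterministic.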
Prerequisites
- Python Environment: Ensure you have Python 3.12+ installed
- uv Package Manager: Install uv if not already installed
You can run the unit tests by running:
./scripts/unit-tests.sh [PYTEST_ARGS]
Any additional arguments are passed to pytest. For example, you can specify a test directory, a specific test file, or any pytest flags (e.g., -vvv for verbosity). If no test directory is specified, it defaults to "tests/unit". For example, to run a single test file with verbose output:
./scripts/unit-tests.sh tests/unit/registry/test_registry.py -vvv
To run the tests with a Python version other than the default (currently 3.12), set the PYTHON_VERSION environment variable as follows:
source .venv/bin/activate
PYTHON_VERSION=3.13 ./scripts/unit-tests.sh
Test Configuration
- Test Discovery: Tests are automatically discovered in the tests/unit/ directory
- Async Support: Tests use --asyncio-mode=auto for automatic async test handling
- Coverage: Tests generate coverage reports in the htmlcov/ directory
- Python Version: Defaults to Python 3.12, but can be overridden with the PYTHON_VERSION environment variable
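To illustrate the async support item above: in pytest-asyncio's auto mode, a plain `async def` test function is collected and awaited without an explicit `@pytest.mark.asyncio` marker. The sketch below assumes that mode is active; `resolve_after_delay` is a hypothetical coroutine used only for illustration.

```python
import asyncio


async def resolve_after_delay(value: str, delay: float = 0.0) -> str:
    """Hypothetical coroutine under test: wait, then return the value uppercased."""
    await asyncio.sleep(delay)
    return value.upper()


async def test_resolve_after_delay_uppercases():
    # With --asyncio-mode=auto, no @pytest.mark.asyncio decorator is needed here.
    assert await resolve_after_delay("ok") == "OK"
```

Outside pytest, the same coroutine can be exercised with `asyncio.run(resolve_after_delay("ok"))`.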
Coverage Reports
After running tests, you can view coverage reports:
# Open HTML coverage report in browser
open htmlcov/index.html # macOS
xdg-open htmlcov/index.html # Linux
start htmlcov/index.html # Windows
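As a platform-independent alternative to the three commands above, the report can also be opened from Python's standard library. This is a sketch that assumes it is run from the repository root, where htmlcov/ is generated; the function names are illustrative.

```python
import webbrowser
from pathlib import Path


def coverage_report_uri(report_dir: str = "htmlcov") -> str:
    """Build a file:// URI for the HTML coverage report's index page."""
    # resolve() makes the path absolute, which as_uri() requires.
    return (Path(report_dir) / "index.html").resolve().as_uri()


def open_coverage_report(report_dir: str = "htmlcov") -> bool:
    """Open the report in the default browser; returns False if no browser was found."""
    return webbrowser.open(coverage_report_uri(report_dir))
```

Calling `open_coverage_report()` after a test run opens htmlcov/index.html in the default browser.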