llama-stack-mirror/tests/unit/providers
Akram Ben Aissi 5e74bc7fcf Add dynamic authentication token forwarding support for vLLM provider
This enables per-request authentication tokens for vLLM providers, supporting use cases like RAG operations where different requests may need different authentication tokens. The implementation follows the same pattern as other providers like Together AI, Fireworks, and Passthrough.

- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly

Usage:

- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token

All existing functionality is preserved while adding new dynamic capabilities.
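
As a rough illustration of the dynamic option above, the sketch below builds the per-request header on the client side. It assumes the X-LlamaStack-Provider-Data value is a JSON object keyed by field name; the helper name build_provider_data_headers and the token strings are hypothetical and do not come from this commit.

```python
import json

# Header name and field name as given in the usage notes above.
PROVIDER_DATA_HEADER = "X-LlamaStack-Provider-Data"


def build_provider_data_headers(vllm_api_token: str) -> dict[str, str]:
    """Return HTTP headers carrying a per-request vLLM token.

    Assumption: the header value is a JSON-encoded object whose keys are
    provider-data field names (here, only vllm_api_token).
    """
    if not vllm_api_token:
        raise ValueError("vllm_api_token must be a non-empty string")
    return {PROVIDER_DATA_HEADER: json.dumps({"vllm_api_token": vllm_api_token})}


# Two requests, each forwarding its own token (placeholder values).
headers_a = build_provider_data_headers("token-for-tenant-a")
headers_b = build_provider_data_headers("token-for-tenant-b")
```

These headers can be passed to whichever HTTP client issues the request. The static option needs no header, since the token then comes from VLLM_API_TOKEN or config.api_token.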

Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
2025-09-15 13:01:12 +01:00
agent fix: Fix list_sessions() (#3114) 2025-08-13 07:46:26 -07:00
agents fix: ensure assistant message is followed by tool call message as expected by openai (#3224) 2025-08-22 10:42:03 -07:00
batches feat(batches, completions): add /v1/completions support to /v1/batches (#3309) 2025-09-05 11:59:57 -07:00
files feat(files, s3, expiration): add expires_after support to S3 files provider (#3283) 2025-08-29 16:17:24 -07:00
inference Add dynamic authentication token forwarding support for vLLM provider 2025-09-15 13:01:12 +01:00
nvidia chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
utils chore: Updating documentation, adding exception handling for Vector Stores in RAG Tool, more tests on migration, and migrate off of inference_api for context_retriever for RAG (#3367) 2025-09-11 14:20:11 +02:00
vector_io feat: migrate to FIPS-validated cryptographic algorithms (#3423) 2025-09-12 11:18:19 +02:00
test_bedrock.py fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386) 2025-09-11 11:41:53 +02:00
test_configs.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00