llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-10-04 04:04:14 +00:00

History

Akram Ben Aissi 4842145202 feat: Add dynamic authentication token forwarding support for vLLM (#3388)

# What does this PR do?

Add dynamic authentication token forwarding support for the vLLM provider.

This enables per-request authentication tokens for vLLM providers, supporting use cases like RAG operations where different requests may need different authentication tokens. The implementation follows the same pattern as other providers like Together AI, Fireworks, and Passthrough.

- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly

Usage:
- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token

All existing functionality is preserved while adding new dynamic capabilities.

## Test Plan

```
curl -X POST "http://localhost:8000/v1/chat/completions" -H "Authorization: Bearer my-dynamic-token" \
  -H "X-LlamaStack-Provider-Data: {\"vllm_api_token\": \"Bearer my-dynamic-token\", \"vllm_url\": \"http://dynamic-server:8000\"}" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama-3.1-8b", "messages": [{"role": "user", "content": "Hello!"}]}'
```

---------

Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
2025-09-18 11:13:55 +02:00
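
The static and dynamic usage modes described in the commit message above can be sketched in Python as follows. This is a minimal illustration, not the code from the PR: the helper names `build_provider_data_header` and `resolve_vllm_token` are hypothetical, and the precedence order (per-request header first, then `config.api_token`, then the `VLLM_API_TOKEN` environment variable) is an assumption that the commit message does not spell out.

```python
import json
import os

# Header name taken from the commit message.
PROVIDER_DATA_HEADER = "X-LlamaStack-Provider-Data"


def build_provider_data_header(token, url=None):
    """Client side (hypothetical helper): encode per-request provider data as JSON."""
    payload = {"vllm_api_token": token}
    if url is not None:
        payload["vllm_url"] = url
    return {PROVIDER_DATA_HEADER: json.dumps(payload)}


def resolve_vllm_token(headers, config_token=None):
    """Server side (hypothetical helper): pick the token to forward to vLLM.

    Assumed precedence: dynamic header, then static config value,
    then the VLLM_API_TOKEN environment variable.
    """
    raw = headers.get(PROVIDER_DATA_HEADER)
    if raw:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            data = None  # malformed header: fall back to the static token
        if isinstance(data, dict) and data.get("vllm_api_token"):
            return data["vllm_api_token"]
    return config_token or os.environ.get("VLLM_API_TOKEN")


# Usage: a request carrying a dynamic token, and one relying on static config.
dynamic_headers = build_provider_data_header("my-dynamic-token")
token_for_dynamic_request = resolve_vllm_token(dynamic_headers, config_token="static-token")
token_for_plain_request = resolve_vllm_token({}, config_token="static-token")
```

Falling back to the static token when the header is absent or malformed is one way to keep existing deployments working unchanged, which matches the commit's statement that all existing functionality is preserved.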
..
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
agents.py	fix: only load mcp when enabled in tool_group (#2621)	2025-07-04 20:27:05 +05:30
batches.py	chore: remove openai dependency from providers (#3398)	2025-09-11 10:19:59 +02:00
datasetio.py	fix(deps): bump datasets versions for all providers (#3382)	2025-09-08 15:13:42 -07:00
eval.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
files.py	feat: Add S3 Files Provider (#3202)	2025-08-22 10:38:59 -04:00
inference.py	feat: Add dynamic authentication token forwarding support for vLLM (#3388)	2025-09-18 11:13:55 +02:00
post_training.py	fix(deps): bump datasets versions for all providers (#3382)	2025-09-08 15:13:42 -07:00
safety.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
scoring.py	chore: remove openai dependency from providers (#3398)	2025-09-11 10:19:59 +02:00
telemetry.py	docs: auto generated documentation for providers (#2543)	2025-06-30 15:13:20 +02:00
tool_runtime.py	feat: Updating Rag Tool to use Files API and Vector Stores API (#3344)	2025-09-06 07:26:34 -06:00
vector_io.py	feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider (#3064)	2025-08-29 16:30:12 +02:00