llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-20 06:48:42 +00:00

Akram Ben Aissi 5e74bc7fcf Add dynamic authentication token forwarding support for vLLM provider		2025-09-15 13:01:12 +01:00

This enables per-request authentication tokens for vLLM providers, supporting use cases like RAG operations where different requests may need different authentication tokens. The implementation follows the same pattern as other providers like Together AI, Fireworks, and Passthrough.

- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly

Usage:
- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token

All existing functionality is preserved while adding new dynamic capabilities.

Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
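
The static and dynamic token sources named in the commit message above can be illustrated with a small sketch. This is not the code from the commit: the function name `resolve_vllm_api_token` is hypothetical, and the sketch assumes that the `X-LlamaStack-Provider-Data` header carries a JSON object, that a per-request `vllm_api_token` overrides the static value, and that `config.api_token` is consulted before the `VLLM_API_TOKEN` environment variable.

```python
import json
import os
from typing import Mapping, Optional

PROVIDER_DATA_HEADER = "X-LlamaStack-Provider-Data"


def resolve_vllm_api_token(
    headers: Mapping[str, str],
    config_api_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: pick the token to forward to vLLM for one request.

    Assumed precedence: per-request header, then config, then environment.
    """
    env = os.environ if env is None else env

    # HTTP header names are case-insensitive, so compare in lower case.
    raw = next(
        (v for k, v in headers.items() if k.lower() == PROVIDER_DATA_HEADER.lower()),
        None,
    )
    if raw:
        try:
            data = json.loads(raw)  # assumption: the header value is JSON
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict):
            token = data.get("vllm_api_token")
            if isinstance(token, str) and token:
                return token  # dynamic, per-request token

    # Static fallback: config value first, then the environment variable.
    return config_api_token or env.get("VLLM_API_TOKEN")
```

A caller would pass the incoming request headers together with the provider's configured token; a malformed or absent header simply falls through to the static value in this sketch.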
agents	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
datasetio	chore(misc): make tests and starter faster (#3042)	2025-08-05 14:55:05 -07:00
eval	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
files/s3	feat(files, s3, expiration): add expires_after support to S3 files provider (#3283)	2025-08-29 16:17:24 -07:00
inference	Add dynamic authentication token forwarding support for vLLM provider	2025-09-15 13:01:12 +01:00
post_training	refactor(logging): rename llama_stack logger categories (#3065)	2025-08-21 17:31:04 -07:00
safety	refactor(logging): rename llama_stack logger categories (#3065)	2025-08-21 17:31:04 -07:00
tool_runtime	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
vector_io	feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider (#3064)	2025-08-29 16:30:12 +02:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00