llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-20 04:38:42 +00:00


Akram Ben Aissi	5e74bc7fcf	2025-09-15 13:01:12 +01:00

Add dynamic authentication token forwarding support for vLLM provider

This enables per-request authentication tokens for vLLM providers, supporting use cases like RAG operations where different requests may need different authentication tokens. The implementation follows the same pattern as other providers like Together AI, Fireworks, and Passthrough.

- Add LiteLLMOpenAIMixin that manages the vllm_api_token properly

Usage:
- Static: VLLM_API_TOKEN env var or config.api_token
- Dynamic: X-LlamaStack-Provider-Data header with vllm_api_token

All existing functionality is preserved while adding new dynamic capabilities.

Signed-off-by: Akram Ben Aissi <akram.benaissi@gmail.com>
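The two usage modes named in the commit message above can be sketched from the client side as follows. This is a minimal illustration, not the provider's implementation: it assumes the X-LlamaStack-Provider-Data header carries a JSON object, and the helper names (`build_provider_data_header`, `static_token_from_env`) are hypothetical. Only the header name, the `vllm_api_token` key, and the `VLLM_API_TOKEN` variable come from the commit message.

```python
import json
import os
from typing import Dict, Optional

# Header name and key are taken from the commit message.
PROVIDER_DATA_HEADER = "X-LlamaStack-Provider-Data"


def build_provider_data_header(vllm_api_token: str) -> Dict[str, str]:
    """Dynamic mode: return request headers carrying a per-request token.

    Assumption: the header value is a JSON-encoded object.
    """
    payload = {"vllm_api_token": vllm_api_token}
    return {PROVIDER_DATA_HEADER: json.dumps(payload)}


def static_token_from_env() -> Optional[str]:
    """Static mode: read the token from VLLM_API_TOKEN, or None if unset."""
    return os.environ.get("VLLM_API_TOKEN")


if __name__ == "__main__":
    # "example-token" is a placeholder, not a real credential.
    headers = build_provider_data_header("example-token")
    print(headers[PROVIDER_DATA_HEADER])
```

The resulting dictionary can be merged into the headers of whatever HTTP client sends the request. How the server chooses between a static and a dynamic token when both are present is not stated in the commit message, so the sketch does not model it.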
anthropic	chore: update the anthropic inference impl to use openai-python for openai-compat functions (#3366)	2025-09-07 14:00:42 -07:00
azure	feat: add Azure OpenAI inference provider support (#3396)	2025-09-11 13:48:38 +02:00
bedrock	fix: AWS Bedrock inference profile ID conversion for region-specific endpoints (#3386)	2025-09-11 11:41:53 +02:00
cerebras	feat(starter)!: simplify starter distro; litellm model registry changes (#2916)	2025-07-25 15:02:04 -07:00
databricks	feat(starter)!: simplify starter distro; litellm model registry changes (#2916)	2025-07-25 15:02:04 -07:00
fireworks	refactor(logging): rename llama_stack logger categories (#3065)	2025-08-21 17:31:04 -07:00
gemini	chore: update the gemini inference impl to use openai-python for openai-compat functions (#3351)	2025-09-06 12:22:20 -07:00
groq	chore: update the groq inference impl to use openai-python for openai-compat functions (#3348)	2025-09-06 15:36:27 -07:00
llama_openai_compat	chore: indicate to mypy that InferenceProvider.rerank is concrete (#3238)	2025-08-22 12:02:13 -07:00
nvidia	docs: add VLM NIM example (#3277)	2025-08-29 16:23:52 -07:00
ollama	feat(tests): auto-merge all model list responses and unify recordings (#3320)	2025-09-03 11:33:03 -07:00
openai	refactor(logging): rename llama_stack logger categories (#3065)	2025-08-21 17:31:04 -07:00
passthrough	chore(rename): move llama_stack.distribution to llama_stack.core (#2975)	2025-07-30 23:30:53 -07:00
runpod	ci: test safety with starter (#2628)	2025-07-09 16:53:50 +02:00
sambanova	chore: update the sambanova inference impl to use openai-python for openai-compat functions (#3345)	2025-09-06 12:25:13 -07:00
tgi	refactor(logging): rename llama_stack logger categories (#3065)	2025-08-21 17:31:04 -07:00
together	refactor(logging): rename llama_stack logger categories (#3065)	2025-08-21 17:31:04 -07:00
vertexai	ci: Re-enable pre-commit to fail (#3399)	2025-09-10 10:00:46 -04:00
vllm	Add dynamic authentication token forwarding support for vLLM provider	2025-09-15 13:01:12 +01:00
watsonx	chore(python-deps): replace ibm_watson_machine_learning with ibm_watsonx_ai (#3302)	2025-09-03 11:33:35 +02:00
__init__.py	`impls` -> `inline`, `adapters` -> `remote` (#381)	2024-11-06 14:54:05 -08:00