llama-stack-mirror/llama_stack/providers/remote/inference
2025-07-21 05:17:29 -04:00
..
anthropic ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
bedrock ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
cerebras ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
cerebras_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
databricks ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
fireworks ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
fireworks_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
gemini ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
groq fix: Don't cache clients for passthrough auth providers (#2728) 2025-07-11 13:38:27 -07:00
groq_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
llama_openai_compat feat: create dynamic model registration for OpenAI and Llama compat remote inference providers (#2745) 2025-07-16 12:49:38 -04:00
nvidia feat: allow dynamic model registration for nvidia inference provider (#2726) 2025-07-17 12:11:30 -07:00
ollama Merge branch 'main' into allow-dynamic-models-ollama 2025-07-21 05:17:29 -04:00
openai feat: create dynamic model registration for OpenAI and Llama compat remote inference providers (#2745) 2025-07-16 12:49:38 -04:00
passthrough feat: consolidate most distros into "starter" (#2516) 2025-07-04 15:58:03 +02:00
runpod ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
sambanova fix: sambanova shields and model validation (#2693) 2025-07-11 16:29:15 -04:00
sambanova_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
tgi feat: consolidate most distros into "starter" (#2516) 2025-07-04 15:58:03 +02:00
together fix: Don't cache clients for passthrough auth providers (#2728) 2025-07-11 13:38:27 -07:00
together_openai_compat feat: introduce APIs for retrieving chat completion requests (#2145) 2025-05-18 21:43:19 -07:00
vllm feat(vllm): periodically refresh models (#2823) 2025-07-18 15:53:09 -07:00
watsonx fix: allow default empty vars for conditionals (#2570) 2025-07-01 14:42:05 +02:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00