Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-07-29 07:14:20 +00:00
Add a local-vllm template
This is just like `local`, which uses `meta-reference` for everything, except it uses `vllm` for inference. Docker works, but so far `conda` is a bit easier to use with the vllm provider. The default container base image does not include all the libraries needed for every vllm feature; more CUDA dependencies are required. I started changing the base image used in this template, but that also required changes to the Dockerfile, so it was getting too involved to include in this first PR.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
This commit is contained in:
parent 31a0c51dea
commit 08da5d003a
1 changed file with 10 additions and 0 deletions:

llama_stack/distribution/templates/local-vllm-build.yaml (new file, +10)
@@ -0,0 +1,10 @@
+name: local-vllm
+distribution_spec:
+  description: Like local, but use vLLM for running LLM inference
+  providers:
+    inference: vllm
+    memory: meta-reference
+    safety: meta-reference
+    agents: meta-reference
+    telemetry: meta-reference
+image_type: conda
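With the template in place, a distribution can be built from it. A minimal sketch of the invocation, assuming the `--config` option that `llama stack build` accepted at this point in the CLI's history (the exact flags may differ by version):

llama stack build --config llama_stack/distribution/templates/local-vllm-build.yaml

Per the commit message, `image_type: conda` makes conda the default environment here, since the stock container base image is missing CUDA dependencies that some vllm features need.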