From 08da5d003a6b43c06bbb8a8e9d178e6f726c7a03 Mon Sep 17 00:00:00 2001
From: Russell Bryant
Date: Sat, 28 Sep 2024 19:10:04 +0000
Subject: [PATCH] Add a local-vllm template

This is just like `local`, which uses `meta-reference` for everything,
except that it uses `vllm` for inference.

Docker works, but so far `conda` is a bit easier to use with the vllm
provider. The default container base image does not include all the
libraries needed for all vllm features; more CUDA dependencies are
necessary. I started changing the base image used in this template, but
that also required changes to the Dockerfile, so it was getting too
involved to include in this first PR.

Signed-off-by: Russell Bryant
---
 .../distribution/templates/local-vllm-build.yaml | 10 ++++++++++
 1 file changed, 10 insertions(+)
 create mode 100644 llama_stack/distribution/templates/local-vllm-build.yaml

diff --git a/llama_stack/distribution/templates/local-vllm-build.yaml b/llama_stack/distribution/templates/local-vllm-build.yaml
new file mode 100644
index 000000000..e907cb7c9
--- /dev/null
+++ b/llama_stack/distribution/templates/local-vllm-build.yaml
@@ -0,0 +1,10 @@
+name: local-vllm
+distribution_spec:
+  description: Like local, but use vLLM for running LLM inference
+  providers:
+    inference: vllm
+    memory: meta-reference
+    safety: meta-reference
+    agents: meta-reference
+    telemetry: meta-reference
+image_type: conda
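
For context, a build template like this is consumed through the `llama stack` CLI.
The invocation below is a sketch, not part of this patch; it assumes the
`--config` option accepted by `llama stack build` around the time of this change,
and exact flags and names may differ between versions:

    # Build a conda-based distribution from the new template (flags assumed)
    llama stack build --config llama_stack/distribution/templates/local-vllm-build.yaml

    # Then run the resulting local-vllm stack (name assumed from the template)
    llama stack run local-vllm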