Distributions updates (slight updates to ollama, add inline-vllm and remote-vllm) (#408)

* remote vllm distro

* add inline-vllm details, fix things

* Write some docs
This commit is contained in:
Ashwin Bharambe 2024-11-08 18:09:39 -08:00 committed by GitHub
parent ba82021d4b
commit 4986e46188
19 changed files with 365 additions and 46 deletions


@@ -0,0 +1,13 @@
name: meta-reference-gpu
distribution_spec:
  docker_image: pytorch/pytorch:2.5.0-cuda12.4-cudnn9-runtime
  description: Use code from `llama_stack` itself to serve all llama stack APIs
  providers:
    inference: meta-reference
    memory:
    - meta-reference
    - remote::chromadb
    - remote::pgvector
    safety: meta-reference
    agents: meta-reference
    telemetry: meta-reference
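
Once loaded by a YAML parser, this build config becomes a nested dict: each key under `providers` maps an API to either a single provider id or a list of candidate providers (as `memory` does here). The sketch below illustrates that shape with a hypothetical `provider_names` helper for collecting the referenced providers; it is not part of `llama_stack` itself.

```python
# The parsed form of the build config above. The dict literal mirrors the
# YAML; in practice you would obtain it via a YAML loader.
spec = {
    "name": "meta-reference-gpu",
    "distribution_spec": {
        "docker_image": "pytorch/pytorch:2.5.0-cuda12.4-cudnn9-runtime",
        "description": "Use code from `llama_stack` itself to serve all llama stack APIs",
        "providers": {
            "inference": "meta-reference",
            "memory": ["meta-reference", "remote::chromadb", "remote::pgvector"],
            "safety": "meta-reference",
            "agents": "meta-reference",
            "telemetry": "meta-reference",
        },
    },
}

def provider_names(spec: dict) -> set[str]:
    """Collect every provider id referenced by the spec. A provider entry
    may be a single string or a list of strings (hypothetical helper)."""
    names: set[str] = set()
    for value in spec["distribution_spec"]["providers"].values():
        if isinstance(value, list):
            names.update(value)
        else:
            names.add(value)
    return names

# Providers prefixed with "remote::" point at external services rather
# than in-process implementations.
remote = {n for n in provider_names(spec) if n.startswith("remote::")}
print(sorted(remote))  # ['remote::chromadb', 'remote::pgvector']
```

This makes the `remote::` naming convention easy to see: the memory API can be served in-process (`meta-reference`) or by an external Chroma or pgvector instance.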