Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-07-29 07:14:20 +00:00
Add a local-vllm template
This is just like `local`, which uses `meta-reference` for everything, except it uses `vllm` for inference. Docker works, but so far `conda` is a bit easier to use with the vllm provider. The default container base image does not include all the libraries needed for every vllm feature; more CUDA dependencies are required. I started changing the base image used in this template, but that also required changes to the Dockerfile, so it was getting too involved to include in this first PR.

Signed-off-by: Russell Bryant <rbryant@redhat.com>
This commit is contained in:
parent 31a0c51dea
commit 08da5d003a
1 changed file with 10 additions and 0 deletions:

llama_stack/distribution/templates/local-vllm-build.yaml (new file, +10)
@@ -0,0 +1,10 @@
+name: local-vllm
+distribution_spec:
+  description: Like local, but use vLLM for running LLM inference
+  providers:
+    inference: vllm
+    memory: meta-reference
+    safety: meta-reference
+    agents: meta-reference
+    telemetry: meta-reference
+image_type: conda
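With the template in place, a distribution can be built from it. A minimal sketch of the invocation, assuming the `--config` option that `llama stack build` accepted at this point in the CLI's history (the exact flags may differ by version):

llama stack build --config llama_stack/distribution/templates/local-vllm-build.yaml

Per the commit message, `image_type: conda` makes conda the default environment here, since the stock container base image is missing CUDA dependencies that some vllm features need.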