llama-stack/llama_stack/templates/remote-vllm/build.yaml at f4426f6a4374449e7c2baa74d23c56f1e2bc8f11 - phoenix-oss/llama-stack

forked from phoenix-oss/llama-stack-mirror

Ashwin Bharambe c1f7ba3aed

Split safety into (llama-guard, prompt-guard, code-scanner) (#400)

Splits the meta-reference safety implementation into three distinct providers:

- inline::llama-guard
- inline::prompt-guard
- inline::code-scanner

Note that this PR is a backward-incompatible change to the llama stack server. I have added a deprecation_error field to ProviderSpec -- the server reads it and immediately barfs. This is used to direct the user with a specific message on what action to perform. An automagical "config upgrade" is a bit too much work to implement right now :/

(Note that we will be gradually prefixing all inline providers with inline:: -- I am only doing this for this set of new providers because otherwise existing configuration files will break even more badly.)
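
The deprecation_error mechanism described above could look roughly like the sketch below. It is an illustration only, not the code from this PR: the simplified ProviderSpec dataclass, the REGISTRY keyed by (api, provider_type), the DeprecatedProviderError class, the resolve_provider helper, and the error text are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ProviderSpec:
    # Hypothetical, simplified stand-in for the real ProviderSpec.
    provider_type: str
    # When set, the provider is only registered to carry this message.
    deprecation_error: Optional[str] = None


class DeprecatedProviderError(RuntimeError):
    """Raised when a config names a provider that has been retired."""


# Hypothetical registry keyed by (api, provider_type). The old safety
# provider stays registered so that the server can explain what to do.
REGISTRY: Dict[Tuple[str, str], ProviderSpec] = {
    ("safety", "meta-reference"): ProviderSpec(
        provider_type="meta-reference",
        deprecation_error=(
            "The meta-reference safety provider has been split. Use "
            "inline::llama-guard, inline::prompt-guard, or "
            "inline::code-scanner instead."
        ),
    ),
    ("safety", "inline::llama-guard"): ProviderSpec("inline::llama-guard"),
    ("safety", "inline::prompt-guard"): ProviderSpec("inline::prompt-guard"),
    ("safety", "inline::code-scanner"): ProviderSpec("inline::code-scanner"),
}


def resolve_provider(api: str, provider_type: str) -> ProviderSpec:
    """Look up a provider and fail fast if it carries a deprecation_error."""
    spec = REGISTRY[(api, provider_type)]
    if spec.deprecation_error:
        # Fail at startup with an actionable message instead of
        # attempting an automatic config upgrade.
        raise DeprecatedProviderError(spec.deprecation_error)
    return spec
```

Under these assumptions, resolve_provider("safety", "inline::llama-guard") returns its spec, while resolve_provider("safety", "meta-reference") raises with a message naming the three replacement providers.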

2024-11-11 09:29:18 -08:00

12 lines

317 B

YAML


 name: remote-vllm
 distribution_spec:
   description: Use (an external) vLLM server for running LLM inference
   providers:
     inference: remote::vllm
     memory:
     - meta-reference
     - remote::chromadb
     - remote::pgvector
     safety: inline::llama-guard
     agents: meta-reference
     telemetry: meta-reference
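
As a minimal sketch of how the provider entries above might be read, the Python below normalizes each API's value (a single string or a list) into a list and splits identifiers on the `::` prefix. The PROVIDERS literal mirrors the `providers` mapping in this file; the helper names split_provider_id and normalize_providers are hypothetical, and representing an unprefixed ID such as meta-reference with a kind of None is an assumption of this example, not documented llama-stack behavior.

```python
from typing import Dict, List, Optional, Tuple, Union

# Python literal mirroring the `providers` mapping of this build.yaml.
PROVIDERS: Dict[str, Union[str, List[str]]] = {
    "inference": "remote::vllm",
    "memory": ["meta-reference", "remote::chromadb", "remote::pgvector"],
    "safety": "inline::llama-guard",
    "agents": "meta-reference",
    "telemetry": "meta-reference",
}


def split_provider_id(provider_id: str) -> Tuple[Optional[str], str]:
    """Split 'remote::vllm' into ('remote', 'vllm').

    IDs without a '::' prefix get a kind of None (an assumption of
    this sketch).
    """
    kind, sep, name = provider_id.partition("::")
    if not sep:
        return None, provider_id
    return kind, name


def normalize_providers(
    providers: Dict[str, Union[str, List[str]]]
) -> Dict[str, List[Tuple[Optional[str], str]]]:
    """Turn every API entry into a list of (kind, name) pairs."""
    normalized = {}
    for api, value in providers.items():
        ids = [value] if isinstance(value, str) else list(value)
        normalized[api] = [split_provider_id(p) for p in ids]
    return normalized
```

With this shape, the single-valued inference entry and the multi-valued memory entry can be handled by the same loop.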