Use inference APIs for executing Llama Guard (#121)

We should use the Inference APIs to execute Llama Guard instead of depending directly on HuggingFace modeling code. The actual inference concerns are then handled by Inference.
2024-09-28 15:40:06 -07:00
commit 0a3999a9a4
parent 6236634d84
9 changed files with 167 additions and 204 deletions
--- a/llama_stack/providers/impls/meta_reference/safety/safety.py
+++ b/llama_stack/providers/impls/meta_reference/safety/safety.py
@@ -88,10 +88,10 @@ class MetaReferenceSafetyImpl(Safety):
            assert (
                cfg is not None
            ), "Cannot use LlamaGuardShield since not present in config"
-            model_dir = resolve_and_get_path(cfg.model)

            return LlamaGuardShield(
-                model_dir=model_dir,
+                model=cfg.model,
+                inference_api=self.inference_api,
                excluded_categories=cfg.excluded_categories,
                disable_input_check=cfg.disable_input_check,
                disable_output_check=cfg.disable_output_check,
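
The hunk above swaps a local model directory for a model identifier plus an injected inference API. A minimal sketch of that delegation pattern follows. Every name in it (`InferenceAPI`, `FakeInference`, `ShieldSketch`, `check`, the `chat_completion` signature, and the placeholder model string) is hypothetical and chosen for illustration; none of it is the actual llama_stack interface or the real `LlamaGuardShield` implementation.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


class InferenceAPI(Protocol):
    """Hypothetical stand-in for the Inference API (assumed signature)."""

    async def chat_completion(self, model: str, prompt: str) -> str: ...


class FakeInference:
    """Toy backend for illustration: flags any prompt containing 'bomb'."""

    async def chat_completion(self, model: str, prompt: str) -> str:
        return "unsafe\nS1" if "bomb" in prompt.lower() else "safe"


@dataclass
class ShieldSketch:
    """Holds a model name and an inference client instead of a model_dir."""

    model: str
    inference_api: InferenceAPI

    async def is_safe(self, message: str) -> bool:
        # Delegate generation to the injected API; no local weights are loaded.
        reply = await self.inference_api.chat_completion(self.model, message)
        # Assumed reply format: first token is "safe" or "unsafe".
        return reply.strip().lower().startswith("safe")


def check(message: str) -> bool:
    # The model string is a placeholder, not a value taken from the commit.
    shield = ShieldSketch(model="Llama-Guard-3-8B", inference_api=FakeInference())
    return asyncio.run(shield.is_safe(message))
```

Under these assumptions the shield no longer needs to know where weights live or how they are loaded, so any backend that satisfies the client interface can serve it.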