feat(distro): no huggingface provider for starter (#3258)

The `trl` dependency pulls in `accelerate`, which in turn pulls in the NVIDIA dependencies for torch. We cannot have those in the starter distro. As a result, there is no CPU-only post-training through the huggingface provider.
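
One way to check a transitive dependency chain like the one described above is to walk the metadata of the installed packages. The following is a minimal sketch, not part of this change: the helper names `direct_requirements` and `find_chain` are hypothetical, and it assumes the packages are installed in the current environment and that skipping `extra`-gated requirements is acceptable.

```python
import re
from collections import deque
from importlib import metadata


def _normalize(name: str) -> str:
    # Normalize a distribution name (lowercase, runs of -_. become "-").
    return re.sub(r"[-_.]+", "-", name).lower()


def direct_requirements(package: str) -> list[str]:
    """Return the names of the non-extra direct requirements of an installed package."""
    try:
        specs = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        # Package is not installed here, so nothing can be inspected.
        return []
    names = []
    for spec in specs:
        requirement, _, marker = spec.partition(";")
        # Crude filter (assumption): skip requirements gated on an optional extra.
        if "extra" in marker:
            continue
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
        if match:
            names.append(_normalize(match.group(1)))
    return names


def find_chain(root, target, requires=direct_requirements):
    """Breadth-first search for a dependency path from root to target, or None."""
    root, target = _normalize(root), _normalize(target)
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in requires(path[-1]):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None
```

In an environment where `trl` is installed, `find_chain("trl", "torch")` would return the shortest path found, if any. The result depends on the installed versions, so no output is shown here.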
2025-12-03 09:53:45 +00:00 · 2025-08-26 14:06:36 -07:00 · 9fa69b0337
commit 9fa69b0337
parent 00bd9a61ed
12 changed files with 35 additions and 55 deletions
--- a/llama_stack/providers/registry/inference.py
+++ b/llama_stack/providers/registry/inference.py
@@ -40,8 +40,9 @@ def available_providers() -> list[ProviderSpec]:
        InlineProviderSpec(
            api=Api.inference,
            provider_type="inline::sentence-transformers",
+            # CrossEncoder depends on torchao.quantization
            pip_packages=[
-                "torch torchvision --index-url https://download.pytorch.org/whl/cpu",
+                "torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu",
                "sentence-transformers --no-deps",
            ],
            module="llama_stack.providers.inline.inference.sentence_transformers",