[Evals API][7/n] braintrust scoring provider (#333)

* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* braintrust skeleton
* datasetio test fix
* braintrust provider
* remove prints
* dependencies
* change json -> class
* json -> class
* remove initialize
* address nits
* check identifier prefix
* braintrust scoring identifier check, rebase
* update MANIFEST
* manifest
* remove braintrust scoring_fn
* remove comments
* tests
* imports fix
2025-12-03 09:53:45 +00:00 · 2024-10-28 18:59:35 -07:00
commit ed833bb758
parent ae671eaf7a
11 changed files with 274 additions and 15 deletions
--- a/llama_stack/providers/registry/scoring.py
+++ b/llama_stack/providers/registry/scoring.py
@@ -23,4 +23,15 @@ def available_providers() -> List[ProviderSpec]:
                Api.inference,
            ],
        ),
+        InlineProviderSpec(
+            api=Api.scoring,
+            provider_type="braintrust",
+            pip_packages=["autoevals", "openai"],
+            module="llama_stack.providers.impls.braintrust.scoring",
+            config_class="llama_stack.providers.impls.braintrust.scoring.BraintrustScoringConfig",
+            api_dependencies=[
+                Api.datasetio,
+                Api.datasets,
+            ],
+        ),
    ]
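
Note on the new registry entry: `module` and `config_class` are given as dotted-path strings, so the braintrust code (and its `autoevals` / `openai` dependencies) presumably only needs to be importable when the provider is actually selected. The loader itself is not part of this diff, so the following is only a minimal sketch of how such a string could be resolved. `SketchProviderSpec` and `resolve_dotted` are hypothetical names invented for illustration, not llama_stack APIs, and the spec points at standard-library names so the snippet runs without llama_stack installed.

```python
import importlib
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class SketchProviderSpec:
    """Hypothetical stand-in for InlineProviderSpec (NOT the llama_stack class)."""

    api: str
    provider_type: str
    module: str
    config_class: str
    pip_packages: List[str] = field(default_factory=list)
    api_dependencies: List[str] = field(default_factory=list)


def resolve_dotted(path: str) -> Any:
    """Import 'pkg.mod.Name' and return the attribute Name (illustrative helper)."""
    module_path, _, attr = path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected a dotted path, got {path!r}")
    return getattr(importlib.import_module(module_path), attr)


# Same shape as the entry added in this diff, but `module` and `config_class`
# are swapped for stdlib names (an assumption made so the sketch is runnable).
spec = SketchProviderSpec(
    api="scoring",
    provider_type="braintrust",
    module="json",
    config_class="collections.OrderedDict",
    pip_packages=["autoevals", "openai"],
    api_dependencies=["datasetio", "datasets"],
)

# The class is looked up only at this point, not when the spec is declared.
config_cls = resolve_dotted(spec.config_class)
config = config_cls()
```

With the real entry, the equivalent lookup would target `llama_stack.providers.impls.braintrust.scoring.BraintrustScoringConfig`; whether llama_stack resolves it this way is an assumption.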