llama-stack/llama_stack/providers/tests/datasetio
Latest commit: ed833bb758 by Xi Yan, 2024-10-28 18:59:35 -07:00
[Evals API][7/n] braintrust scoring provider (#333)
* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* braintrust skeleton
* datasetio test fix
* braintrust provider
* remove prints
* dependencies
* change json -> class
* json -> class
* remove initialize
* address nits
* check identifier prefix
* braintrust scoring identifier check, rebase
* update MANIFEST
* manifest
* remove braintrust scoring_fn
* remove comments
* tests
* imports fix
File                           Last commit                                                                   Date
__init__.py                    [Evals API][2/n] datasets / datasetio meta-reference implementation (#288)    2024-10-22 16:12:16 -07:00
provider_config_example.yaml   [Evals API][2/n] datasets / datasetio meta-reference implementation (#288)    2024-10-22 16:12:16 -07:00
test_dataset.csv               [Evals API][4/n] evals with generation meta-reference impl (#303)             2024-10-25 13:12:39 -07:00
test_datasetio.py              [Evals API][7/n] braintrust scoring provider (#333)                           2024-10-28 18:59:35 -07:00
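For orientation, a minimal self-contained sketch of the kind of sanity check a datasetio test might perform against the test_dataset.csv fixture listed above. This is purely illustrative and does not reproduce the actual test_datasetio.py, which exercises the DatasetIO provider API itself; the test name and assertions below are assumptions, not taken from this listing.

```python
# Illustrative sketch only -- not the contents of test_datasetio.py above.
# It merely checks that the CSV fixture shipped alongside the tests parses
# into rows that all match the header width.
import csv
import os


def test_dataset_csv_is_well_formed():
    csv_path = os.path.join(os.path.dirname(__file__), "test_dataset.csv")
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    assert header, "expected a header row"
    assert rows, "expected at least one data row"
    assert all(len(row) == len(header) for row in rows)
```

Run with pytest from the repository root, pointing it at this directory, to exercise the fixture alongside the real provider tests.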