Commit graph

8 commits

Author  SHA1        Message                                           Date
Xi Yan  ae75eb0f95  remove prints                                     2024-10-27 17:29:41 -07:00
Xi Yan  38186f7903  braintrust provider                               2024-10-27 17:24:10 -07:00
Xi Yan  68346fac39  datasetio test fix                                2024-10-27 16:43:02 -07:00
Xi Yan  d3d2243dfb  braintrust skeleton                               2024-10-27 12:32:07 -07:00
Xi Yan  91e5ad18b0  remove prints, cleanup braintrust in this branch  2024-10-25 17:39:05 -07:00
Xi Yan  16620a8185  llm as judge, move folders                        2024-10-25 16:41:36 -07:00
Xi Yan  bf8bc7a781  wip scoring refactor                              2024-10-25 15:03:03 -07:00
Xi Yan  cb84034567  [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296)  2024-10-24 14:52:30 -07:00
                    * wip
                    * dataset validation
                    * test_scoring
                    * cleanup
                    * clean up test
                    * comments
                    * error checking
                    * dataset client
                    * test client:
                    * datasetio client
                    * clean up
                    * basic scoring function works
                    * scorer wip
                    * equality scorer
                    * score batch impl
                    * score batch
                    * update scoring test
                    * refactor
                    * validate scorer input
                    * address comments
                    * add all rows scores to ScoringResult
                    * bugfix
                    * scoring function def rename