Xi Yan
|
ed833bb758
|
[Evals API][7/n] braintrust scoring provider (#333)
* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* braintrust skeleton
* datasetio test fix
* braintrust provider
* remove prints
* dependencies
* change json -> class
* json -> class
* remove initialize
* address nits
* check identifier prefix
* braintrust scoring identifier check, rebase
* update MANIFEST
* manifest
* remove braintrust scoring_fn
* remove comments
* tests
* imports fix
|
2024-10-28 18:59:35 -07:00 |
|
Xi Yan
|
7b8748c53e
|
[Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs (#330)
* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* change json -> class
* remove initialize
* address nits
* check identifier prefix
* update MANIFEST
|
2024-10-28 14:08:42 -07:00 |
|