llama-stack-mirror/llama_stack
Xi Yan cb84034567
[Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296)
* wip

* dataset validation

* test_scoring

* cleanup

* clean up test

* comments

* error checking

* dataset client

* test client:

* datasetio client

* clean up

* basic scoring function works

* scorer wip

* equality scorer

* score batch impl

* score batch

* update scoring test

* refactor

* validate scorer input

* address comments

* add all rows scores to ScoringResult

* bugfix

* scoring function def rename
2024-10-24 14:52:30 -07:00
..
apis [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296) 2024-10-24 14:52:30 -07:00
cli llama stack distributions / templates / docker refactor (#266) 2024-10-21 11:17:53 -07:00
distribution [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296) 2024-10-24 14:52:30 -07:00
providers [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296) 2024-10-24 14:52:30 -07:00
scripts Add a test for CLI, but not fully done so disabled 2024-09-19 13:27:07 -07:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00