llama-stack-mirror/llama_stack/providers/tests/datasetio/test_dataset.csv
Xi Yan cb84034567
[Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296)
* wip

* dataset validation

* test_scoring

* cleanup

* clean up test

* comments

* error checking

* dataset client

* test client:

* datasetio client

* clean up

* basic scoring function works

* scorer wip

* equality scorer

* score batch impl

* score batch

* update scoring test

* refactor

* validate scorer input

* address comments

* add all rows scores to ScoringResult

* bugfix

* scoring function def rename
2024-10-24 14:52:30 -07:00

310 B

input_query,generated_answer,expected_answer
What is the capital of France?,London,Paris
Who is the CEO of Meta?,Mark Zuckerberg,Mark Zuckerberg
What is the largest planet in our solar system?,Jupiter,Jupiter
What is the smallest country in the world?,China,Vatican City
What is the currency of Japan?,Yen,Yen
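
Each row in this fixture pairs a generated_answer with an expected_answer, which is the shape an equality-style scorer (as named in the commit message) would consume. The following is a minimal sketch of that idea, assuming exact string match per row. The helper name `equality_score` and the inlined CSV text are illustrative assumptions, not the llama-stack scoring API.

```python
import csv
import io

# The fixture contents, inlined so the sketch is self-contained.
CSV_TEXT = """\
input_query,generated_answer,expected_answer
What is the capital of France?,London,Paris
Who is the CEO of Meta?,Mark Zuckerberg,Mark Zuckerberg
What is the largest planet in our solar system?,Jupiter,Jupiter
What is the smallest country in the world?,China,Vatican City
What is the currency of Japan?,Yen,Yen
"""


def equality_score(rows):
    """Hypothetical helper: score each row 1.0 on exact match, else 0.0.

    Returns the per-row scores and their mean (accuracy).
    """
    scores = [
        1.0 if row["generated_answer"] == row["expected_answer"] else 0.0
        for row in rows
    ]
    accuracy = sum(scores) / len(scores) if scores else 0.0
    return scores, accuracy


# Parse the CSV into dicts keyed by the header row.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
scores, accuracy = equality_score(rows)
print(scores, accuracy)
```

With this fixture, the France and smallest-country rows are the two mismatches, so three of the five rows score 1.0.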