llama-stack/tests
Xi Yan cb84034567
[Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296)
* wip

* dataset validation

* test_scoring

* cleanup

* clean up test

* comments

* error checking

* dataset client

* test client

* datasetio client

* clean up

* basic scoring function works

* scorer wip

* equality scorer

* score batch impl

* score batch

* update scoring test

* refactor

* validate scorer input

* address comments

* add all rows scores to ScoringResult

* bugfix

* scoring function def rename
2024-10-24 14:52:30 -07:00
examples [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296) 2024-10-24 14:52:30 -07:00
example_custom_tool.py API Updates (#73) 2024-09-17 19:51:35 -07:00
test_bedrock_inference.py Bump version to 0.0.24 (#94) 2024-09-25 09:31:12 -07:00
test_e2e.py Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00
test_inference.py minor typo and HuggingFace -> Hugging Face (#113) 2024-09-26 09:48:23 -07:00
test_ollama_inference.py Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00