llama-stack-mirror/tests
Xi Yan 7b8748c53e
[Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs (#330)
* wip scoring refactor

* llm as judge, move folders

* test full generation + eval

* extract score regex to llm context

* remove prints, cleanup braintrust in this branch

* change json -> class

* remove initialize

* address nits

* check identifier prefix

* update MANIFEST
2024-10-28 14:08:42 -07:00
examples [Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs (#330) 2024-10-28 14:08:42 -07:00
example_custom_tool.py API Updates (#73) 2024-09-17 19:51:35 -07:00
test_bedrock_inference.py Bump version to 0.0.24 (#94) 2024-09-25 09:31:12 -07:00
test_e2e.py Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00
test_inference.py minor typo and HuggingFace -> Hugging Face (#113) 2024-09-26 09:48:23 -07:00
test_ollama_inference.py Support for Llama3.2 models and Swift SDK (#98) 2024-09-25 10:29:58 -07:00