Commit graph

10 commits

Author  SHA1        Message                                           Date
Xi Yan  cccd5be090  move eval_task_config to client                   2024-10-15 10:14:35 -07:00
Xi Yan  9cc0a54f0b  rag correctness scorer w/ custom dataset          2024-10-15 00:42:03 -07:00
Xi Yan  ec6c63ba57  dataset accept file uploads                       2024-10-14 23:36:15 -07:00
Xi Yan  3c29108b6e  input query optional input for braintrust scorer  2024-10-14 21:17:16 -07:00
Xi Yan  c8f6849291  full accuracy                                     2024-10-14 20:42:22 -07:00
Xi Yan  fcb8dea1ef  scorer only api                                   2024-10-14 17:46:29 -07:00
Xi Yan  9c501d042b  cleanup hardcoded dataset registry                2024-10-14 14:19:15 -07:00
Xi Yan  a25aff290e  generator + scorer Api for MMLU                   2024-10-13 23:27:02 -07:00
Xi Yan  fb565dfb06  eleuther eval fix                                 2024-10-11 09:30:10 -07:00
Xi Yan  31c046dcdf  evals new rebase                                  2024-10-10 11:35:26 -07:00