Xi Yan | 490f7e9a75 | refactor | 2024-11-08 14:19:55 -08:00
Xi Yan | 9d04f11543 | remove todo | 2024-11-08 11:43:15 -08:00
Xi Yan | 58c6138df1 | move dataset to hf llamastack repo | 2024-11-08 11:42:16 -08:00
Xi Yan | d42774c41b | msg | 2024-11-07 21:36:49 -08:00
Xi Yan | 989f070bc0 | move benchmark task def to file | 2024-11-07 21:35:02 -08:00
Xi Yan | 6525b43906 | refactor | 2024-11-07 18:41:33 -08:00
Xi Yan | edeb6dcf04 | mmlu loose | 2024-11-07 18:36:41 -08:00
Xi Yan | 6ee02ca23b | fix | 2024-11-07 18:25:39 -08:00
Xi Yan | 33b6d9b7b7 | merge | 2024-11-07 18:15:13 -08:00
Xi Yan | 3c17853d79 | register task required | 2024-11-07 16:42:44 -08:00
Xi Yan | 94a56cc3f3 | register task required | 2024-11-07 16:41:23 -08:00
Xi Yan | 7ca479f400 | fix optional | 2024-11-07 16:22:33 -08:00
Xi Yan | fd581c3d88 | only keep 1 run_eval | 2024-11-07 16:17:49 -08:00
Xi Yan | 37d87c585a | wip huggingface register | 2024-11-07 15:59:55 -08:00
Xi Yan | d1633dc412 | huggingface provider | 2024-11-07 15:20:22 -08:00
Xi Yan | cc6edf6287 | Merge branch 'eval_task_register' into mmlu_benchmark | 2024-11-07 14:41:50 -08:00
Xi Yan | f05db9a25c | add eval_id for jobs | 2024-11-07 14:30:46 -08:00
Xi Yan | ea80f623fb | add default task_eval_id for routing | 2024-11-07 14:19:33 -08:00
Xi Yan | 51c20f9c29 | api refactor | 2024-11-07 13:54:26 -08:00
Xi Yan | 97dcd5704c | Merge branch 'main' into eval_task_register | 2024-11-07 13:08:58 -08:00
Ashwin Bharambe | 694c142b89 | Add provider deprecation support; change directory structure (#397) | 2024-11-07 13:04:53 -08:00
    * Add provider deprecation support; change directory structure
    * fix a couple dangling imports
    * move the meta_reference safety dir also
Xi Yan | 93995ecc4c | test wip | 2024-11-07 11:11:27 -08:00
Xi Yan | 3322aa9ee4 | mmlu scoring fn | 2024-11-07 10:54:00 -08:00
Xi Yan | b946afddc0 | datasetdef files | 2024-11-07 10:28:51 -08:00
Xi Yan | d75095033d | huggingface provider | 2024-11-07 10:21:25 -08:00
Xi Yan | 3f1ac29d57 | test eval works | 2024-11-06 21:40:38 -08:00
Xi Yan | 413a1b6d8b | fix eval | 2024-11-06 21:10:54 -08:00
Xi Yan | 56239fce90 | scoring fix | 2024-11-06 18:07:16 -08:00
Xi Yan | 0bce74402f | scoring test pass | 2024-11-06 17:27:55 -08:00
Xi Yan | def6d5d8ad | scoring resolve | 2024-11-06 17:04:25 -08:00
Xi Yan | 8fc2d212a2 | fix safety signature mismatch (#388) | 2024-11-06 16:30:47 -08:00
    * fix safety sig
    * shield_type->identifier
Ashwin Bharambe | 994732e2e0 | impls -> inline , adapters -> remote (#381) | 2024-11-06 14:54:05 -08:00