Commit graph

485 commits

Author SHA1 Message Date
Xi Yan
4ae1d37c2f msg 2024-11-07 18:43:21 -08:00
Xi Yan
6525b43906 refactor 2024-11-07 18:41:33 -08:00
Xi Yan
edeb6dcf04 mmlu loose 2024-11-07 18:36:41 -08:00
Xi Yan
6ee02ca23b fix 2024-11-07 18:25:39 -08:00
Xi Yan
33b6d9b7b7 merge 2024-11-07 18:15:13 -08:00
Xi Yan
027ee2335c delete old tests 2024-11-07 18:06:21 -08:00
Xi Yan
3c17853d79 register task required 2024-11-07 16:42:44 -08:00
Xi Yan
94a56cc3f3 register task required 2024-11-07 16:41:23 -08:00
Xi Yan
7ca479f400 fix optional 2024-11-07 16:22:33 -08:00
Xi Yan
fd581c3d88 only keep 1 run_eval 2024-11-07 16:17:49 -08:00
Xi Yan
37d87c585a wip huggingface register 2024-11-07 15:59:55 -08:00
Xi Yan
d1633dc412 huggingface provider 2024-11-07 15:20:22 -08:00
Xi Yan
cc6edf6287 Merge branch 'eval_task_register' into mmlu_benchmark 2024-11-07 14:41:50 -08:00
Xi Yan
6b889651d6 Merge branch 'main' into eval_task_register 2024-11-07 14:41:29 -08:00
Xi Yan
6da74262ef remove type ignore 2024-11-07 14:37:50 -08:00
Xi Yan
f05db9a25c add eval_id for jobs 2024-11-07 14:30:46 -08:00
Xi Yan
ea80f623fb add default task_eval_id for routing 2024-11-07 14:19:33 -08:00
Xi Yan
51c20f9c29 api refactor 2024-11-07 13:54:26 -08:00
Dalton Flanagan
345ae07317 Factor out create_dist_registry (#398) 2024-11-07 16:13:19 -05:00
Xi Yan
97dcd5704c Merge branch 'main' into eval_task_register 2024-11-07 13:08:58 -08:00
Ashwin Bharambe
694c142b89 Add provider deprecation support; change directory structure (#397) 2024-11-07 13:04:53 -08:00
* Add provider deprecation support; change directory structure
* fix a couple dangling imports
* move the meta_reference safety dir also
Xi Yan
36e2538eb0 fix together inference validator (#393) 2024-11-07 11:31:53 -08:00
Xi Yan
eaa6a29cef Merge branch 'main' into eval_task_register 2024-11-07 11:18:37 -08:00
Xi Yan
3acd37bcb3 remove type ignore 2024-11-07 11:15:36 -08:00
Xi Yan
93995ecc4c test wip 2024-11-07 11:11:27 -08:00
Xi Yan
3322aa9ee4 mmlu scoring fn 2024-11-07 10:54:00 -08:00
Xi Yan
b946afddc0 datasetdef files 2024-11-07 10:28:51 -08:00
Xi Yan
d75095033d huggingface provider 2024-11-07 10:21:25 -08:00
Yufei (Benny) Chen
31c5fbda5e [LlamaStack][Fireworks] Update client and add unittest (#390) 2024-11-07 10:11:28 -08:00
Ashwin Bharambe
cfcc0a871c Slightly update PR template 2024-11-06 22:49:01 -08:00
Xi Yan
283b5c1def Merge branch 'main' into eval_task_register 2024-11-06 21:50:09 -08:00
Xi Yan
3f1ac29d57 test eval works 2024-11-06 21:40:38 -08:00
Xi Yan
413a1b6d8b fix eval 2024-11-06 21:10:54 -08:00
Ashwin Bharambe
489f74a70b Allow simpler initialization of RemoteProviderConfig; fix issue in httpx client 2024-11-06 19:19:26 -08:00
Xi Yan
56239fce90 scoring fix 2024-11-06 18:07:16 -08:00
Ashwin Bharambe
064d2a5287 Remove the safety adapter for Together; we can just use "meta-reference" (#387) 2024-11-06 17:36:57 -08:00
Xi Yan
c5cf9c30be score batch 2024-11-06 17:30:46 -08:00
Xi Yan
0bce74402f scoring test pass 2024-11-06 17:27:55 -08:00
Xi Yan
0351072531 fix scoring register 2024-11-06 17:18:16 -08:00
Xi Yan
def6d5d8ad scoring resolve 2024-11-06 17:04:25 -08:00
Xi Yan
c53733d1a3 fixture 2024-11-06 16:41:17 -08:00
Xi Yan
00869799a1 Merge branch 'main' into eval_task_register 2024-11-06 16:34:22 -08:00
Xi Yan
8fc2d212a2 fix safety signature mismatch (#388) 2024-11-06 16:30:47 -08:00
* fix safety sig
* shield_type->identifier
Ashwin Bharambe
7c340f0236 rename test_inference -> test_text_inference 2024-11-06 16:12:50 -08:00
Xi Yan
10eda0af59 delete unused 2024-11-06 16:08:04 -08:00
Ashwin Bharambe
3b54ce3499 remote::vllm now works with vision models 2024-11-06 16:07:17 -08:00
Xi Yan
1fe4099bd0 datasetio test 2024-11-06 16:00:38 -08:00
Xi Yan
1b7e19d5d0 Merge branch 'main' into eval_task_register 2024-11-06 15:05:46 -08:00
Ashwin Bharambe
994732e2e0 impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00
Ashwin Bharambe
b10e9f46bb Enable remote::vllm (#384) 2024-11-06 14:42:44 -08:00
* Enable remote::vllm
* Kill the giant list of hard coded models