Commit graph

284 commits

Author SHA1 Message Date
Xi Yan
75ccc05296 rename 2024-11-11 10:48:47 -05:00
Xi Yan
1031f1404b add register model to unit test 2024-11-11 10:35:59 -05:00
Xi Yan
e690eb7ad3 Merge branch 'main' into mmlu_benchmark 2024-11-11 10:22:32 -05:00
Ashwin Bharambe
4986e46188
Distributions updates (slight updates to ollama, add inline-vllm and remote-vllm) (#408)
* remote vllm distro

* add inline-vllm details, fix things

* Write some docs
2024-11-08 18:09:39 -08:00
Xi Yan
ba82021d4b precommit 2024-11-08 17:58:58 -08:00
Xi Yan
1ebf6447c5 add missing inits 2024-11-08 17:54:24 -08:00
Xi Yan
89c3129f0b add missing inits 2024-11-08 17:49:29 -08:00
Dinesh Yeduguru
ec644d3418
migrate model to Resource and new registration signature (#410)
* resource oriented object design for models

* add back llama_model field

* working tests

* register singature fix

* address feedback

---------

Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 16:12:57 -08:00
Xi Yan
490f7e9a75 refactor 2024-11-08 14:19:55 -08:00
Dalton Flanagan
5625aef48a
Add pip install helper for test and direct scenarios (#404)
* initial branch commit

* pip install helptext

* remove print

* pre-commit
2024-11-08 15:18:21 -05:00
Dinesh Yeduguru
d800a16acd
Resource oriented design for shields (#399)
* init

* working bedrock tests

* bedrock test for inference fixes

* use env vars for bedrock guardrail vars

* add register in meta reference

* use correct shield impl in meta ref

* dont add together fixture

* right naming

* minor updates

* improved registration flow

* address feedback

---------

Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 12:16:11 -08:00
Xi Yan
9d04f11543 remove todo 2024-11-08 11:43:15 -08:00
Xi Yan
58c6138df1 move dataset to hf llamastack repo 2024-11-08 11:42:16 -08:00
Xi Yan
d42774c41b msg 2024-11-07 21:36:49 -08:00
Xi Yan
989f070bc0 move benchmark task def to file 2024-11-07 21:35:02 -08:00
Xi Yan
f429e75b3e fix tests 2024-11-07 21:31:05 -08:00
Xi Yan
0443b36cc1 merge 2024-11-07 21:27:08 -08:00
Xi Yan
6192bf43a4
[Evals API][10/n] API updates for EvalTaskDef + new test migration (#379)
* wip

* scoring fn api

* eval api

* eval task

* evaluate api update

* pre commit

* unwrap context -> config

* config field doc

* typo

* naming fix

* separate benchmark / app eval

* api name

* rename

* wip tests

* wip

* datasetio test

* delete unused

* fixture

* scoring resolve

* fix scoring register

* scoring test pass

* score batch

* scoring fix

* fix eval

* test eval works

* remove type ignore

* api refactor

* add default task_eval_id for routing

* add eval_id for jobs

* remove type ignore

* only keep 1 run_eval

* fix optional

* register task required

* register task required

* delete old tests

* delete old tests

* fixture return impl
2024-11-07 21:24:12 -08:00
Xi Yan
4ae1d37c2f msg 2024-11-07 18:43:21 -08:00
Xi Yan
6525b43906 refactor 2024-11-07 18:41:33 -08:00
Xi Yan
edeb6dcf04 mmlu loose 2024-11-07 18:36:41 -08:00
Xi Yan
6ee02ca23b fix 2024-11-07 18:25:39 -08:00
Xi Yan
33b6d9b7b7 merge 2024-11-07 18:15:13 -08:00
Xi Yan
027ee2335c delete old tests 2024-11-07 18:06:21 -08:00
Xi Yan
3c17853d79 register task required 2024-11-07 16:42:44 -08:00
Xi Yan
94a56cc3f3 register task required 2024-11-07 16:41:23 -08:00
Xi Yan
7ca479f400 fix optional 2024-11-07 16:22:33 -08:00
Xi Yan
fd581c3d88 only keep 1 run_eval 2024-11-07 16:17:49 -08:00
Xi Yan
37d87c585a wip huggingface register 2024-11-07 15:59:55 -08:00
Xi Yan
d1633dc412 huggingface provider 2024-11-07 15:20:22 -08:00
Xi Yan
cc6edf6287 Merge branch 'eval_task_register' into mmlu_benchmark 2024-11-07 14:41:50 -08:00
Xi Yan
6b889651d6 Merge branch 'main' into eval_task_register 2024-11-07 14:41:29 -08:00
Xi Yan
6da74262ef remove type ignore 2024-11-07 14:37:50 -08:00
Xi Yan
f05db9a25c add eval_id for jobs 2024-11-07 14:30:46 -08:00
Xi Yan
ea80f623fb add default task_eval_id for routing 2024-11-07 14:19:33 -08:00
Xi Yan
51c20f9c29 api refactor 2024-11-07 13:54:26 -08:00
Dalton Flanagan
345ae07317 Factor out create_dist_registry (#398) 2024-11-07 16:13:19 -05:00
Xi Yan
97dcd5704c Merge branch 'main' into eval_task_register 2024-11-07 13:08:58 -08:00
Ashwin Bharambe
694c142b89
Add provider deprecation support; change directory structure (#397)
* Add provider deprecation support; change directory structure

* fix a couple dangling imports

* move the meta_reference safety dir also
2024-11-07 13:04:53 -08:00
Xi Yan
36e2538eb0 fix together inference validator (#393) 2024-11-07 11:31:53 -08:00
Xi Yan
eaa6a29cef Merge branch 'main' into eval_task_register 2024-11-07 11:18:37 -08:00
Xi Yan
3acd37bcb3 remove type ignore 2024-11-07 11:15:36 -08:00
Xi Yan
93995ecc4c test wip 2024-11-07 11:11:27 -08:00
Xi Yan
3322aa9ee4 mmlu scoring fn 2024-11-07 10:54:00 -08:00
Xi Yan
b946afddc0 datasetdef files 2024-11-07 10:28:51 -08:00
Xi Yan
d75095033d huggingface provider 2024-11-07 10:21:25 -08:00
Yufei (Benny) Chen
31c5fbda5e [LlamaStack][Fireworks] Update client and add unittest (#390) 2024-11-07 10:11:28 -08:00
Xi Yan
283b5c1def Merge branch 'main' into eval_task_register 2024-11-06 21:50:09 -08:00
Xi Yan
3f1ac29d57 test eval works 2024-11-06 21:40:38 -08:00
Xi Yan
413a1b6d8b fix eval 2024-11-06 21:10:54 -08:00