Xi Yan | 2b7d70ba86 | [Evals API][11/n] huggingface dataset provider + mmlu scoring fn (#392)
* wip
* scoring fn api
* eval api
* eval task
* evaluate api update
* pre commit
* unwrap context -> config
* config field doc
* typo
* naming fix
* separate benchmark / app eval
* api name
* rename
* wip tests
* wip
* datasetio test
* delete unused
* fixture
* scoring resolve
* fix scoring register
* scoring test pass
* score batch
* scoring fix
* fix eval
* test eval works
* huggingface provider
* datasetdef files
* mmlu scoring fn
* test wip
* remove type ignore
* api refactor
* add default task_eval_id for routing
* add eval_id for jobs
* remove type ignore
* huggingface provider
* wip huggingface register
* only keep 1 run_eval
* fix optional
* register task required
* register task required
* delete old tests
* fix
* mmlu loose
* refactor
* msg
* fix tests
* move benchmark task def to file
* msg
* gen openapi
* openapi gen
* move dataset to hf llamastack repo
* remove todo
* refactor
* add register model to unit test
* rename
* register to client
* delete preregistered dataset/eval task
* comments
* huggingface -> remote adapter
* openapi gen
2024-11-11 14:49:50 -05:00