Commit graph

14 commits

Author SHA1 Message Date
Xi Yan 3acd37bcb3 remove type ignore 2024-11-07 11:15:36 -08:00
Xi Yan f778b907e4 wip 2024-11-06 12:52:02 -08:00
Xi Yan 683a370d23 wip tests 2024-11-06 10:03:49 -08:00
Xi Yan be7b76ceac rename 2024-11-05 17:08:32 -08:00
Xi Yan e5b4e4d569 api name 2024-11-05 17:01:05 -08:00
Xi Yan 4a64f98c82 separate benchmark / app eval 2024-11-05 16:54:31 -08:00
Xi Yan 979cd4cd44 naming fix 2024-11-05 16:20:16 -08:00
Xi Yan 04eebd8a36 pre commit 2024-11-05 15:06:44 -08:00
Xi Yan 60fc191308 evaluate api update 2024-11-05 15:03:12 -08:00
Xi Yan 1b62188c30 eval task 2024-11-05 14:59:50 -08:00
Xi Yan bca96b5b35 eval api 2024-11-05 14:55:59 -08:00
Xi Yan 4fc92e52d7 wip 2024-11-05 11:23:24 -08:00
Xi Yan abdf7cddf3 [Evals API][4/n] evals with generation meta-reference impl (#303)
* wip

* dataset validation

* test_scoring

* cleanup

* clean up test

* comments

* error checking

* dataset client

* test client:

* datasetio client

* clean up

* basic scoring function works

* scorer wip

* equality scorer

* score batch impl

* score batch

* update scoring test

* refactor

* validate scorer input

* address comments

* evals with generation

* add all rows scores to ScoringResult

* minor typing

* bugfix

* scoring function def rename

* rebase name

* refactor

* address comments

* Update iOS inference instructions for new quantization

* Small updates to quantization config

* Fix score threshold in faiss

* Bump version to 0.0.45

* Handle both ipv6 and ipv4 interfaces together

* update manifest for build templates

* Update getting_started.md

* chatcompletion & completion input type validation

* inclusion->subsetof

* error checking

* scoring_function -> scoring_fn rename, scorer -> scoring_fn rename

* address comments

* [Evals API][5/n] fixes to generate openapi spec (#323)

* generate openapi

* typing comment, dataset -> dataset_id

* remove custom type

* sample eval run.yaml

---------

Co-authored-by: Dalton Flanagan <6599399+dltn@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-10-25 13:12:39 -07:00
Xi Yan e45f121c77 [Evals API] [1/n] Initial API (#287)
* type system api

* datasets api

* fix

* datasetio api

* kill reward scoring

* scoring functions + evals

* move jobs, fix errors
2024-10-22 09:31:19 -07:00