Xi Yan
|
94a56cc3f3
|
register task required
|
2024-11-07 16:41:23 -08:00 |
|
Xi Yan
|
fd581c3d88
|
only keep 1 run_eval
|
2024-11-07 16:17:49 -08:00 |
|
Xi Yan
|
f05db9a25c
|
add eval_id for jobs
|
2024-11-07 14:30:46 -08:00 |
|
Xi Yan
|
ea80f623fb
|
add default task_eval_id for routing
|
2024-11-07 14:19:33 -08:00 |
|
Xi Yan
|
51c20f9c29
|
api refactor
|
2024-11-07 13:54:26 -08:00 |
|
Xi Yan
|
3acd37bcb3
|
remove type ignore
|
2024-11-07 11:15:36 -08:00 |
|
Xi Yan
|
f778b907e4
|
wip
|
2024-11-06 12:52:02 -08:00 |
|
Xi Yan
|
683a370d23
|
wip tests
|
2024-11-06 10:03:49 -08:00 |
|
Xi Yan
|
be7b76ceac
|
rename
|
2024-11-05 17:08:32 -08:00 |
|
Xi Yan
|
e5b4e4d569
|
api name
|
2024-11-05 17:01:05 -08:00 |
|
Xi Yan
|
4a64f98c82
|
separate benchmark / app eval
|
2024-11-05 16:54:31 -08:00 |
|
Xi Yan
|
979cd4cd44
|
naming fix
|
2024-11-05 16:20:16 -08:00 |
|
Xi Yan
|
04eebd8a36
|
pre commit
|
2024-11-05 15:06:44 -08:00 |
|
Xi Yan
|
60fc191308
|
evaluate api update
|
2024-11-05 15:03:12 -08:00 |
|
Xi Yan
|
1b62188c30
|
eval task
|
2024-11-05 14:59:50 -08:00 |
|
Xi Yan
|
bca96b5b35
|
eval api
|
2024-11-05 14:55:59 -08:00 |
|
Xi Yan
|
4fc92e52d7
|
wip
|
2024-11-05 11:23:24 -08:00 |
|
Xi Yan
|
abdf7cddf3
|
[Evals API][4/n] evals with generation meta-reference impl (#303)
* wip
* dataset validation
* test_scoring
* cleanup
* clean up test
* comments
* error checking
* dataset client
* test client
* datasetio client
* clean up
* basic scoring function works
* scorer wip
* equality scorer
* score batch impl
* score batch
* update scoring test
* refactor
* validate scorer input
* address comments
* evals with generation
* add all rows scores to ScoringResult
* minor typing
* bugfix
* scoring function def rename
* rebase name
* refactor
* address comments
* Update iOS inference instructions for new quantization
* Small updates to quantization config
* Fix score threshold in faiss
* Bump version to 0.0.45
* Handle both ipv6 and ipv4 interfaces together
* update manifest for build templates
* Update getting_started.md
* chatcompletion & completion input type validation
* inclusion->subsetof
* error checking
* scoring_function -> scoring_fn rename, scorer -> scoring_fn rename
* address comments
* [Evals API][5/n] fixes to generate openapi spec (#323)
* generate openapi
* typing comment, dataset -> dataset_id
* remove custom type
* sample eval run.yaml
---------
Co-authored-by: Dalton Flanagan <6599399+dltn@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
|
2024-10-25 13:12:39 -07:00 |
|
Xi Yan
|
e45f121c77
|
[Evals API] [1/n] Initial API (#287)
* type system api
* datasets api
* fix
* datasetio api
* kill reward scoring
* scoring functions + evals
* move jobs, fix errors
|
2024-10-22 09:31:19 -07:00 |
|