Xi Yan
e690eb7ad3
Merge branch 'main' into mmlu_benchmark
2024-11-11 10:22:32 -05:00
Dinesh Yeduguru
ec644d3418
migrate model to Resource and new registration signature ( #410 )
...
* resource oriented object design for models
* add back llama_model field
* working tests
* register signature fix
* address feedback
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 16:12:57 -08:00
Dinesh Yeduguru
d800a16acd
Resource oriented design for shields ( #399 )
...
* init
* working bedrock tests
* bedrock test for inference fixes
* use env vars for bedrock guardrail vars
* add register in meta reference
* use correct shield impl in meta ref
* dont add together fixture
* right naming
* minor updates
* improved registration flow
* address feedback
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-08 12:16:11 -08:00
Xi Yan
6192bf43a4
[Evals API][10/n] API updates for EvalTaskDef + new test migration ( #379 )
...
* wip
* scoring fn api
* eval api
* eval task
* evaluate api update
* pre commit
* unwrap context -> config
* config field doc
* typo
* naming fix
* separate benchmark / app eval
* api name
* rename
* wip tests
* wip
* datasetio test
* delete unused
* fixture
* scoring resolve
* fix scoring register
* scoring test pass
* score batch
* scoring fix
* fix eval
* test eval works
* remove type ignore
* api refactor
* add default task_eval_id for routing
* add eval_id for jobs
* remove type ignore
* only keep 1 run_eval
* fix optional
* register task required
* register task required
* delete old tests
* delete old tests
* fixture return impl
2024-11-07 21:24:12 -08:00
Xi Yan
4ae1d37c2f
msg
2024-11-07 18:43:21 -08:00
Xi Yan
33b6d9b7b7
merge
2024-11-07 18:15:13 -08:00
Xi Yan
94a56cc3f3
register task required
2024-11-07 16:41:23 -08:00
Xi Yan
7ca479f400
fix optional
2024-11-07 16:22:33 -08:00
Xi Yan
fd581c3d88
only keep 1 run_eval
2024-11-07 16:17:49 -08:00
Xi Yan
d1633dc412
huggingface provider
2024-11-07 15:20:22 -08:00
Xi Yan
6da74262ef
remove type ignore
2024-11-07 14:37:50 -08:00
Xi Yan
f05db9a25c
add eval_id for jobs
2024-11-07 14:30:46 -08:00
Xi Yan
ea80f623fb
add default task_eval_id for routing
2024-11-07 14:19:33 -08:00
Xi Yan
51c20f9c29
api refactor
2024-11-07 13:54:26 -08:00
Xi Yan
3acd37bcb3
remove type ignore
2024-11-07 11:15:36 -08:00
Xi Yan
413a1b6d8b
fix eval
2024-11-06 21:10:54 -08:00
Xi Yan
56239fce90
scoring fix
2024-11-06 18:07:16 -08:00
Xi Yan
1b7e19d5d0
Merge branch 'main' into eval_task_register
2024-11-06 15:05:46 -08:00
Dinesh Yeduguru
093c9f1987
add bedrock distribution code ( #358 )
...
* add bedrock distribution code
* fix linter error
* add bedrock shields support
* linter fixes
* working bedrock safety
* change to return only one violation
* remove env var reading
* refreshable boto credentials
* remove env vars
* address raghu's feedback
* fix session_ttl passing
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-06 14:39:11 -08:00
Xi Yan
f778b907e4
wip
2024-11-06 12:52:02 -08:00
Xi Yan
683a370d23
wip tests
2024-11-06 10:03:49 -08:00
Xi Yan
be7b76ceac
rename
2024-11-05 17:08:32 -08:00
Xi Yan
e5b4e4d569
api name
2024-11-05 17:01:05 -08:00
Xi Yan
4a64f98c82
separate benchmark / app eval
2024-11-05 16:54:31 -08:00
Xi Yan
979cd4cd44
naming fix
2024-11-05 16:20:16 -08:00
Xi Yan
9759911e47
typo
2024-11-05 16:06:40 -08:00
Xi Yan
f3955d04d7
config field doc
2024-11-05 16:06:02 -08:00
Xi Yan
be0649d79d
unwrap context -> config
2024-11-05 16:02:47 -08:00
Xi Yan
04eebd8a36
pre commit
2024-11-05 15:06:44 -08:00
Xi Yan
60fc191308
evaluate api update
2024-11-05 15:03:12 -08:00
Xi Yan
1b62188c30
eval task
2024-11-05 14:59:50 -08:00
Xi Yan
bca96b5b35
eval api
2024-11-05 14:55:59 -08:00
Xi Yan
fe91608321
scoring fn api
2024-11-05 14:34:56 -08:00
Xi Yan
4fc92e52d7
wip
2024-11-05 11:23:24 -08:00
Ashwin Bharambe
fb2678b134
Fix shield_type and routing table breakage
2024-11-04 19:57:15 -08:00
Dinesh Yeduguru
663883cc29
persist registered objects with distribution ( #354 )
...
* persist registered objects with distribution
* linter fixes
* comment
* use annotate and field discriminator
* working tests
* do not use global state
* precommit failures fixed
* add back Any
* fix imports
* remove unnecessary changes in ollama
* precommit failures fixed
* make kvstore configurable for dist and rename registry
* add comment about registry list return
* fix linter errors
* use registry to hydrate
* remove debug print
* linter fixes
* remove kvstore.db
* rename distribution_registry_store
---------
Co-authored-by: Dinesh Yeduguru <dineshyv@fb.com>
2024-11-04 17:25:06 -08:00
Ashwin Bharambe
37b330b4ef
add dynamic clients for all APIs ( #348 )
...
* add dynamic clients for all APIs
* fix openapi generator
* inference + memory + agents tests now pass with "remote" providers
* Add docstring which fixes openapi generator :/
2024-10-31 14:46:25 -07:00
Ashwin Bharambe
26d1668f7d
Revert "remove Field for return_type"
...
This reverts commit ffb3965ade.
2024-10-28 21:39:48 -07:00
Ashwin Bharambe
eccd7dc4a9
Avoid warnings from pydantic for overriding schema
...
Also fix structured output in completions
2024-10-28 21:39:48 -07:00
Xi Yan
7b8748c53e
[Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs ( #330 )
...
* wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* change json -> class
* remove initialize
* address nits
* check identifier prefix
* update MANIFEST
2024-10-28 14:08:42 -07:00
Xi Yan
ffb3965ade
remove Field for return_type
2024-10-28 13:04:41 -07:00
Xi Yan
abdf7cddf3
[Evals API][4/n] evals with generation meta-reference impl ( #303 )
...
* wip
* dataset validation
* test_scoring
* cleanup
* clean up test
* comments
* error checking
* dataset client
* test client:
* datasetio client
* clean up
* basic scoring function works
* scorer wip
* equality scorer
* score batch impl
* score batch
* update scoring test
* refactor
* validate scorer input
* address comments
* evals with generation
* add all rows scores to ScoringResult
* minor typing
* bugfix
* scoring function def rename
* rebase name
* refactor
* address comments
* Update iOS inference instructions for new quantization
* Small updates to quantization config
* Fix score threshold in faiss
* Bump version to 0.0.45
* Handle both ipv6 and ipv4 interfaces together
* update manifest for build templates
* Update getting_started.md
* chatcompletion & completion input type validation
* inclusion->subsetof
* error checking
* scoring_function -> scoring_fn rename, scorer -> scoring_fn rename
* address comments
* [Evals API][5/n] fixes to generate openapi spec (#323 )
* generate openapi
* typing comment, dataset -> dataset_id
* remove custom type
* sample eval run.yaml
---------
Co-authored-by: Dalton Flanagan <6599399+dltn@users.noreply.github.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-10-25 13:12:39 -07:00
Xi Yan
cb84034567
[Evals API][3/n] scoring_functions / scoring meta-reference implementations ( #296 )
...
* wip
* dataset validation
* test_scoring
* cleanup
* clean up test
* comments
* error checking
* dataset client
* test client:
* datasetio client
* clean up
* basic scoring function works
* scorer wip
* equality scorer
* score batch impl
* score batch
* update scoring test
* refactor
* validate scorer input
* address comments
* add all rows scores to ScoringResult
* bugfix
* scoring function def rename
2024-10-24 14:52:30 -07:00
Ashwin Bharambe
161aef0aae
Small updates to quantization config
2024-10-24 12:08:56 -07:00
Ashwin Bharambe
7afe51c84d
New quantized models ( #301 )
2024-10-24 08:38:56 -07:00
Xi Yan
821810657f
[Evals API][2/n] datasets / datasetio meta-reference implementation ( #288 )
...
* skeleton dataset / datasetio
* dataset datasetio
* config
* address comments
* delete dataset_utils
* address comments
* naming fix
2024-10-22 16:12:16 -07:00
Sarthak Deshpande
8a01b9e40c
Added implementations for get_agents_session, delete_agents_session and delete_agents ( #267 )
2024-10-22 13:50:43 -07:00
Ashwin Bharambe
c06718fbd5
Add support for Structured Output / Guided decoding ( #281 )
...
Added support for structured output in the API and added a reference implementation for meta-reference.
A few notes:
* Two formats are specified in the API: JSON schema and EBNF-based grammar
* Implementation only supports JSON for now
We use lm-format-enforcer to provide the implementation right now but may change this, especially because BNF grammars aren't supported by that library.
Fireworks has support for structured output and Together has limited support for it too. Subsequent PRs will add these changes. We would like all our inference providers to provide structured output for Llama models since it is an extremely important and highly sought-after need among developers.
2024-10-22 12:53:34 -07:00
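To make the structured-output note above more concrete, here is a minimal sketch of the two response formats it mentions. It is an illustration, not the project's API: the names `JsonSchemaFormat`, `GrammarFormat`, and `conforms` are hypothetical. The sketch also swaps in a simpler technique: it only checks a finished output against a small subset of JSON schema (top-level object, required keys, primitive property types) after the fact, whereas guided decoding constrains the model while it generates.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Union


# Hypothetical types for illustration only; not the project's actual classes.
@dataclass
class JsonSchemaFormat:
    schema: Dict[str, Any]


@dataclass
class GrammarFormat:
    bnf: str


# Mapping from JSON schema primitive type names to Python types.
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def conforms(output_text: str, fmt: Union[JsonSchemaFormat, GrammarFormat]) -> bool:
    """Post-hoc check of model output against a tiny subset of JSON schema."""
    if isinstance(fmt, GrammarFormat):
        # Mirrors the note above: only the JSON format is handled in this sketch.
        raise NotImplementedError("grammar-based formats are not handled here")

    try:
        data = json.loads(output_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False

    # Every required key must be present.
    for key in fmt.schema.get("required", []):
        if key not in data:
            return False

    # Each declared property that is present must have the declared type.
    for key, spec in fmt.schema.get("properties", {}).items():
        if key not in data:
            continue
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name)
        if expected is None:
            continue  # unknown or missing type: not checked in this sketch
        value = data[key]
        # bool is a subclass of int in Python, so reject it for numeric types.
        if type_name in ("integer", "number") and isinstance(value, bool):
            return False
        if not isinstance(value, expected):
            return False
    return True


person_format = JsonSchemaFormat(
    schema={
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name", "age"],
    }
)
```

With `person_format`, `conforms('{"name": "Ada", "age": 36}', person_format)` accepts the output, while an output that is missing `age` or that gives it as a string is rejected.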
Xi Yan
e45f121c77
[Evals API] [1/n] Initial API ( #287 )
...
* type system api
* datasets api
* fix
* datasetio api
* kill reward scoring
* scoring functions + evals
* move jobs, fix errors
2024-10-22 09:31:19 -07:00
nehal-a2z
c995219731
Update event_logger.py ( #275 )
...
spelling error
2024-10-21 10:46:53 -07:00