Commit graph

375 commits

Author SHA1 Message Date
Xi Yan
689990b48b Merge branch 'evals_5' into evals_6 2024-10-24 13:06:11 -07:00
Xi Yan
42bac85e1f bugfix 2024-10-24 12:16:28 -07:00
Xi Yan
24dce9cb7a minor typing 2024-10-24 12:08:57 -07:00
Xi Yan
32a496ab0f Merge branch 'evals_5' into evals_6 2024-10-24 12:01:41 -07:00
Xi Yan
a3a8f32541 add all rows scores to ScoringResult 2024-10-24 11:53:15 -07:00
Xi Yan
737fcb795f evals with generation 2024-10-24 11:30:13 -07:00
Xi Yan
071dba8871 Merge branch 'main' into evals_5 2024-10-24 09:18:15 -07:00
Ashwin Bharambe
8aa8847b4a Bump version to 0.0.44 2024-10-24 08:41:39 -07:00
Ashwin Bharambe
7afe51c84d
New quantized models (#301) 2024-10-24 08:38:56 -07:00
Xi Yan
afa0c2b146 address comments 2024-10-23 22:17:38 -07:00
Ashwin Bharambe
05a8d47b98 Add a meta-reference-quantized-gpu distribution 2024-10-23 21:45:50 -07:00
Xi Yan
f5dcc03742 use pytorch/pytorch as base 2024-10-23 20:22:00 -07:00
Xi Yan
59c93548bc validate scorer input 2024-10-23 17:43:41 -07:00
Xi Yan
0ee82571a8 refactor 2024-10-23 17:30:10 -07:00
Xi Yan
7c803cef86 update scoring test 2024-10-23 17:22:48 -07:00
Xi Yan
3c6555c408 score batch 2024-10-23 16:38:00 -07:00
Xi Yan
eb572faf6f score batch impl 2024-10-23 16:19:25 -07:00
Xi Yan
4b1d7da030 equality scorer 2024-10-23 16:07:17 -07:00
Xi Yan
cad8c8710b Merge branch 'main' into evals_5 2024-10-23 15:33:36 -07:00
Xi Yan
caf253e08f Merge branch 'main' into evals_5 2024-10-23 15:33:00 -07:00
Xi Yan
0cec86453b
Fix issue w/ routing_table api getting added when router api is not specified (#298)
* fix issue w/ enforcing api

* cleanup

* inference only yaml
2024-10-23 15:27:22 -07:00
Xi Yan
35981a1a3b scorer wip 2024-10-23 15:02:54 -07:00
Xi Yan
70c08e694d basic scoring function works 2024-10-23 14:42:28 -07:00
Xi Yan
38e31ab525 clean up 2024-10-23 14:08:21 -07:00
Xi Yan
5930a92dc7 datasetio client 2024-10-23 14:04:51 -07:00
Xi Yan
51d5ad67c4 test client: 2024-10-23 13:55:55 -07:00
Xi Yan
bb43369521 dataset client 2024-10-23 13:53:58 -07:00
Xi Yan
c5db025320 error checking 2024-10-23 13:17:47 -07:00
Xi Yan
d8bbce6f7c comments 2024-10-23 13:16:08 -07:00
Xi Yan
5e1323b5bf clean up test 2024-10-23 13:08:42 -07:00
Xi Yan
555f6e1531 cleanup 2024-10-23 13:07:15 -07:00
Xi Yan
92e32f80ad test_scoring 2024-10-23 13:01:49 -07:00
Xi Yan
7c280e18fb dataset validation 2024-10-23 12:08:39 -07:00
Dinesh Yeduguru
21f2e9adf5
don't set num_predict for all providers (#294) 2024-10-23 11:44:04 -07:00
Ashwin Bharambe
ffb561070d
Support structured output for Together (#289) 2024-10-22 22:36:38 -07:00
Xi Yan
aefa84e70a wip 2024-10-22 20:00:43 -07:00
Sarthak Deshpande
2e5e46d896
Added tests for persistence (#274) 2024-10-22 19:41:46 -07:00
Xi Yan
821810657f
[Evals API][2/n] datasets / datasetio meta-reference implementation (#288)
* skeleton dataset / datasetio

* dataset datasetio

* config

* address comments

* delete dataset_utils

* address comments

* naming fix
2024-10-22 16:12:16 -07:00
Sarthak Deshpande
8a01b9e40c
Added implementations for get_agents_session, delete_agents_session and delete_agents (#267) 2024-10-22 13:50:43 -07:00
Suraj Subramanian
b81a3bd46a
Fix import conflict for SamplingParams (#285)
A conflict between llama_models.llama3.api.datatypes.SamplingParams and vllm.sampling_params.SamplingParams results in errors while processing vLLM engine requests.
2024-10-22 12:56:00 -07:00
Ashwin Bharambe
c06718fbd5
Add support for Structured Output / Guided decoding (#281)
Added support for structured output in the API and added a reference implementation for meta-reference.

A few notes:

* Two formats are specified in the API: JSON schema and EBNF-based grammar
* Implementation only supports JSON for now
We use lm-format-enforcer to provide the implementation right now but may change this, especially because BNF grammars aren't supported by that library.
Fireworks has support for structured output and Together has limited support for it too. Subsequent PRs will add these changes. We would like all our inference providers to provide structured output for llama models since it is an extremely important and highly sought-after need among developers.
2024-10-22 12:53:34 -07:00
Anush
4c3d33e6f4
feat: Qdrant Vector index support (#221)
This PR adds support for Qdrant - https://qdrant.tech/ to be used as a vector memory.

I've unit-tested the methods to confirm that they work as intended.

To run Qdrant

```
docker run -p 6333:6333 qdrant/qdrant
```
2024-10-22 12:50:19 -07:00
Suraj Subramanian
668a495aba
Add REST api example for chat_completion (#286) 2024-10-22 10:35:20 -07:00
Xi Yan
e45f121c77
[Evals API] [1/n] Initial API (#287)
* type system api

* datasets api

* fix

* datasetio api

* kill reward scoring

* scoring functions + evals

* move jobs, fix errors
2024-10-22 09:31:19 -07:00
Xi Yan
b279d3bc58
Update README.md 2024-10-22 08:01:33 -07:00
Dinesh Yeduguru
1d241bf3fe
add completion() for ollama (#280) 2024-10-21 22:26:33 -07:00
raghotham
e2a5a2e10d
first version of readthedocs (#278) 2024-10-22 10:15:58 +05:30
Xi Yan
dbb5ce43fc Bump version to 0.0.43 2024-10-21 19:10:01 -07:00
Xi Yan
a2ff74a686 telemetry WARNING->WARN fix 2024-10-21 18:52:48 -07:00
Xi Yan
b1451afbc8
Update README.md 2024-10-21 18:21:30 -07:00