Xi Yan
32a496ab0f
Merge branch 'evals_5' into evals_6
2024-10-24 12:01:41 -07:00
Xi Yan
a3a8f32541
add all rows scores to ScoringResult
2024-10-24 11:53:15 -07:00
Xi Yan
737fcb795f
evals with generation
2024-10-24 11:30:13 -07:00
Xi Yan
071dba8871
Merge branch 'main' into evals_5
2024-10-24 09:18:15 -07:00
Ashwin Bharambe
8aa8847b4a
Bump version to 0.0.44
2024-10-24 08:41:39 -07:00
Ashwin Bharambe
7afe51c84d
New quantized models ( #301 )
2024-10-24 08:38:56 -07:00
Xi Yan
afa0c2b146
address comments
2024-10-23 22:17:38 -07:00
Ashwin Bharambe
05a8d47b98
Add a meta-reference-quantized-gpu distribution
2024-10-23 21:45:50 -07:00
Xi Yan
f5dcc03742
use pytorch/pytorch as base
2024-10-23 20:22:00 -07:00
Xi Yan
59c93548bc
validate scorer input
2024-10-23 17:43:41 -07:00
Xi Yan
0ee82571a8
refactor
2024-10-23 17:30:10 -07:00
Xi Yan
7c803cef86
update scoring test
2024-10-23 17:22:48 -07:00
Xi Yan
3c6555c408
score batch
2024-10-23 16:38:00 -07:00
Xi Yan
eb572faf6f
score batch impl
2024-10-23 16:19:25 -07:00
Xi Yan
4b1d7da030
equality scorer
2024-10-23 16:07:17 -07:00
Xi Yan
cad8c8710b
Merge branch 'main' into evals_5
2024-10-23 15:33:36 -07:00
Xi Yan
caf253e08f
Merge branch 'main' into evals_5
2024-10-23 15:33:00 -07:00
Xi Yan
0cec86453b
Fix issue w/ routing_table api getting added when router api is not specified ( #298 )
...
* fix issue w/ enforcing api
* cleanup
* inference only yaml
2024-10-23 15:27:22 -07:00
Xi Yan
35981a1a3b
scorer wip
2024-10-23 15:02:54 -07:00
Xi Yan
70c08e694d
basic scoring function works
2024-10-23 14:42:28 -07:00
Xi Yan
38e31ab525
clean up
2024-10-23 14:08:21 -07:00
Xi Yan
5930a92dc7
datasetio client
2024-10-23 14:04:51 -07:00
Xi Yan
51d5ad67c4
test client:
2024-10-23 13:55:55 -07:00
Xi Yan
bb43369521
dataset client
2024-10-23 13:53:58 -07:00
Xi Yan
c5db025320
error checking
2024-10-23 13:17:47 -07:00
Xi Yan
d8bbce6f7c
comments
2024-10-23 13:16:08 -07:00
Xi Yan
5e1323b5bf
clean up test
2024-10-23 13:08:42 -07:00
Xi Yan
555f6e1531
cleanup
2024-10-23 13:07:15 -07:00
Xi Yan
92e32f80ad
test_scoring
2024-10-23 13:01:49 -07:00
Xi Yan
7c280e18fb
dataset validation
2024-10-23 12:08:39 -07:00
Dinesh Yeduguru
21f2e9adf5
don't set num_predict for all providers ( #294 )
2024-10-23 11:44:04 -07:00
Ashwin Bharambe
ffb561070d
Support structured output for Together ( #289 )
2024-10-22 22:36:38 -07:00
Xi Yan
aefa84e70a
wip
2024-10-22 20:00:43 -07:00
Sarthak Deshpande
2e5e46d896
Added tests for persistence ( #274 )
2024-10-22 19:41:46 -07:00
Xi Yan
821810657f
[Evals API][2/n] datasets / datasetio meta-reference implementation ( #288 )
...
* skeleton dataset / datasetio
* dataset datasetio
* config
* address comments
* delete dataset_utils
* address comments
* naming fix
2024-10-22 16:12:16 -07:00
Sarthak Deshpande
8a01b9e40c
Added implementations for get_agents_session, delete_agents_session and delete_agents ( #267 )
2024-10-22 13:50:43 -07:00
Suraj Subramanian
b81a3bd46a
Fix import conflict for SamplingParams ( #285 )
...
A conflict between llama_models.llama3.api.datatypes.SamplingParams and vllm.sampling_params.SamplingParams results in errors while processing vLLM engine requests.
2024-10-22 12:56:00 -07:00
Ashwin Bharambe
c06718fbd5
Add support for Structured Output / Guided decoding ( #281 )
...
Added support for structured output in the API and added a reference implementation for meta-reference.
A few notes:
* Two formats are specified in the API: JSON schema and EBNF-based grammar
* The implementation only supports JSON for now
We use lm-format-enforcer to provide the implementation right now but may change this, especially because BNF grammars aren't supported by that library.
Fireworks has support for structured output and Together has limited support for it too. Subsequent PRs will add these changes. We would like all our inference providers to provide structured output for llama models, since it is an extremely important capability that developers frequently ask for.
2024-10-22 12:53:34 -07:00
Anush
4c3d33e6f4
feat: Qdrant Vector index support ( #221 )
...
This PR adds support for Qdrant (https://qdrant.tech/) as a vector memory store.
I've unit-tested the methods to confirm that they work as intended.
To run Qdrant:
```
docker run -p 6333:6333 qdrant/qdrant
```
2024-10-22 12:50:19 -07:00
Suraj Subramanian
668a495aba
Add REST api example for chat_completion ( #286 )
2024-10-22 10:35:20 -07:00
Xi Yan
e45f121c77
[Evals API] [1/n] Initial API ( #287 )
...
* type system api
* datasets api
* fix
* datasetio api
* kill reward scoring
* scoring functions + evals
* move jobs, fix errors
2024-10-22 09:31:19 -07:00
Xi Yan
b279d3bc58
Update README.md
2024-10-22 08:01:33 -07:00
Dinesh Yeduguru
1d241bf3fe
add completion() for ollama ( #280 )
2024-10-21 22:26:33 -07:00
raghotham
e2a5a2e10d
first version of readthedocs ( #278 )
2024-10-22 10:15:58 +05:30
Xi Yan
dbb5ce43fc
Bump version to 0.0.43
2024-10-21 19:10:01 -07:00
Xi Yan
a2ff74a686
telemetry WARNING->WARN fix
2024-10-21 18:52:48 -07:00
Xi Yan
b1451afbc8
Update README.md
2024-10-21 18:21:30 -07:00
Xi Yan
4d2bd2d39e
add more distro templates ( #279 )
...
* verify dockers
* together distro verified
* readme
* fireworks distro
* fireworks compose up
* fireworks verified
2024-10-21 18:15:08 -07:00
Xi Yan
cf27d19dd5
fix sse_generator async
2024-10-21 14:03:42 -07:00
Ashwin Bharambe
1944405dca
Update new_api_provider.md
2024-10-21 14:02:51 -07:00