Commit graph

3854 commits

Author SHA1 Message Date
Krrish Dholakia
583571a43a bump: version 1.7.21 → 1.8.0 2023-11-29 20:14:54 -08:00
Krrish Dholakia
0d200cd8dc feat(main.py): allow updating model cost via completion() 2023-11-29 20:14:39 -08:00
ishaan-jaff
4c1ef4e270 (chore) fix testing 2023-11-29 20:05:13 -08:00
Krrish Dholakia
50cc4a8595 fix(proxy_server.py): have /health and /routes be router endpoints 2023-11-29 19:59:56 -08:00
ishaan-jaff
4ed5b3b46d (chore) linting fix 2023-11-29 19:58:12 -08:00
Krrish Dholakia
a9fdae0d23 test(test_streaming.py): refactor testing 2023-11-29 19:58:04 -08:00
Krrish Dholakia
7b53bf7d9a bump: version 1.7.20 → 1.7.21 2023-11-29 19:58:04 -08:00
Krrish Dholakia
c312ac4ca8 fix(main.py): don't pass stream to petals 2023-11-29 19:58:04 -08:00
ishaan-jaff
9780efca4b (feat) router: async client Azure, OpenAI 2023-11-29 19:45:08 -08:00
Krrish Dholakia
2760cdcce5 bump: version 1.7.19 → 1.7.20 2023-11-29 19:44:16 -08:00
Krrish Dholakia
1f5a1122fc fix(replicate.py): fix custom prompt formatting 2023-11-29 19:44:09 -08:00
ishaan-jaff
c05da0797b (feat) Embedding: Async Azure 2023-11-29 19:43:47 -08:00
ishaan-jaff
53554bae85 (test) aembedding 2023-11-29 19:36:42 -08:00
ishaan-jaff
10e21ae978 (test) aembedding 2023-11-29 19:35:32 -08:00
ishaan-jaff
09caab549a (feat) async embeddings: OpenAI 2023-11-29 19:35:08 -08:00
ishaan-jaff
3891462b29 (fix) router: azure/embedding support 2023-11-29 19:06:36 -08:00
ishaan-jaff
e58b3d5df0 (feat) add azure/gpt-4-1106-preview 2023-11-29 18:21:31 -08:00
ishaan-jaff
7bcc23e8e9 (fix) router: set default rpm/tpm when not set 2023-11-29 18:13:27 -08:00
ishaan-jaff
c1914a01bc (docs) routing 2023-11-29 18:09:39 -08:00
ishaan-jaff
f299120394 (docs) router 2023-11-29 18:08:00 -08:00
ishaan-jaff
305faab542 (test) router:get_available_deployment 2023-11-29 17:54:41 -08:00
ishaan-jaff
23af756531 (feat) router: random pick based on tpm/rpm 2023-11-29 17:54:06 -08:00
ishaan-jaff
2c74dbed17 (chore) util: remove_model_id 2023-11-29 17:30:33 -08:00
ishaan-jaff
7a38a45d62 (test) test weighted selection router 2023-11-29 17:30:18 -08:00
ishaan-jaff
48416f8018 (test) add rpm to load test profiling 2023-11-29 17:14:34 -08:00
ishaan-jaff
088d2bc081 (fix) use weighted shuffle when rpm set 2023-11-29 17:13:11 -08:00
Krrish Dholakia
38efc21f81 bump: version 1.7.18 → 1.7.19 2023-11-29 16:50:11 -08:00
Krrish Dholakia
61185aa12c fix(main.py): fix null finish reason issue for ollama 2023-11-29 16:50:11 -08:00
ishaan-jaff
69eca78000 (docs) simple proxy 2023-11-29 16:44:40 -08:00
Krrish Dholakia
c2f642dbec bump: version 1.7.17 → 1.7.18 2023-11-29 16:43:11 -08:00
Krrish Dholakia
5411d5a6fd fix(utils.py): raise stop iteration exception on bedrock stream close 2023-11-29 16:43:11 -08:00
Ishaan Jaff
286ce586be Update README.md 2023-11-29 16:40:16 -08:00
ishaan-jaff
4b78481fbd (docs) simple proxy 2023-11-29 16:38:36 -08:00
ishaan-jaff
2d0432c5b7 (docs) simple proxy 2023-11-29 16:36:07 -08:00
Krrish Dholakia
52c9159a54 bump: version 1.7.16 → 1.7.17 2023-11-29 16:35:06 -08:00
Krrish Dholakia
ab76daa90b fix(bedrock.py): support ai21 / bedrock streaming 2023-11-29 16:35:06 -08:00
ishaan-jaff
3b89cff65e (docs) simple proxy 2023-11-29 16:33:00 -08:00
ishaan-jaff
032cd0121b (docs) simple proxy 2023-11-29 16:31:08 -08:00
ishaan-jaff
3cc8305ec6 (fix) proxy: /health 2023-11-29 16:23:37 -08:00
ishaan-jaff
d3672452ce (test) 1k requests 2023-11-29 16:22:18 -08:00
ishaan-jaff
3c6764efef (feat) proxy+ router: support 1k request/second 2023-11-29 16:22:04 -08:00
ishaan-jaff
da75b15176 (feat) completion: add rpm, tpm as litellm params 2023-11-29 16:19:05 -08:00
ishaan-jaff
9bf603889f (fix) azure: remove max retries before completion 2023-11-29 16:09:31 -08:00
ishaan-jaff
66bc0fc343 (fix) proxy: /health works with router updates 2023-11-29 16:09:31 -08:00
ishaan-jaff
8a398a1777 (feat) proxy: add weighted shuffle + set cooldown to 1s 2023-11-29 16:09:31 -08:00
ishaan-jaff
2bbd9c063d (fix) OpenAI embedding 2023-11-29 16:09:31 -08:00
Krrish Dholakia
3c254cc555 bump: version 1.7.15 → 1.7.16 2023-11-29 16:04:22 -08:00
Krrish Dholakia
96a27ce954 fix(utils.py): stop sequence filtering for amazon titan models 2023-11-29 16:04:14 -08:00
Krrish Dholakia
dccfe1cc3e bump: version 1.7.14 → 1.7.15 2023-11-29 15:37:21 -08:00
Krrish Dholakia
04a1c20bc5 fix(router.py): skip api key when generating model id for router deployments 2023-11-29 15:37:08 -08:00