Commit graph

691 commits

Author  SHA1  Message  Date
ishaan-jaff  9780efca4b  (feat) router: async client Azure, OpenAI  2023-11-29 19:45:08 -08:00
Krrish Dholakia  1f5a1122fc  fix(replicate.py): fix custom prompt formatting  2023-11-29 19:44:09 -08:00
ishaan-jaff  3891462b29  (fix) router: azure/embedding support  2023-11-29 19:06:36 -08:00
ishaan-jaff  7bcc23e8e9  (fix) router: set default rpm/tpm when not set  2023-11-29 18:13:27 -08:00
ishaan-jaff  23af756531  (feat) router: random pick based on tpm/rpm  2023-11-29 17:54:06 -08:00
ishaan-jaff  088d2bc081  (fix) use weighted shuffle when rpm set  2023-11-29 17:13:11 -08:00
ishaan-jaff  3c6764efef  (feat) proxy+ router: support 1k request/second  2023-11-29 16:22:04 -08:00
ishaan-jaff  8a398a1777  (feat) proxy: add weighted shuffle + set cooldown to 1s  2023-11-29 16:09:31 -08:00
Krrish Dholakia  04a1c20bc5  fix(router.py): skip api key when generating model id for router deployments  2023-11-29 15:37:08 -08:00
Krrish Dholakia  383dd53e86  fix(main.py): passing client as a litellm-specific kwarg  2023-11-28 21:20:05 -08:00
ishaan-jaff  afd20098be  (feat) router: init client for OpenAI compatible providers  2023-11-28 17:49:53 -08:00
Krrish Dholakia  bb1267eb07  fix(router.py): fix exponential backoff to use retry-after if present in headers  2023-11-28 17:25:03 -08:00
ishaan-jaff  f5b558dde0  (fix) router red api_key, api_base, api_version  2023-11-28 17:10:20 -08:00
ishaan-jaff  282b9a37e5  (fix) router: passing client  2023-11-28 16:34:16 -08:00
ishaan-jaff  4d06c296e3  (router) re use client across requests  2023-11-28 16:21:16 -08:00
ishaan-jaff  94d35f1ec5  (feat) router: re-use the same client for high trafic  2023-11-28 15:44:56 -08:00
ishaan-jaff  2a69cab550  (feat) router track total, success, failed calls per model  2023-11-28 15:44:56 -08:00
Krrish Dholakia  094144de58  fix(router.py): removing model id before making call  2023-11-28 10:09:45 -08:00
Krrish Dholakia  6149642295  refactor(router.py): fix linting errors  2023-11-27 22:11:53 -08:00
Krrish Dholakia  82d79638d4  refactor(router.py): fix linting errors  2023-11-27 22:08:48 -08:00
Krrish Dholakia  c4aea7432f  build: adding debug logs to gitignore  2023-11-27 22:05:07 -08:00
Krrish Dholakia  be9fa06da6  fix(main.py): fix linting errors  2023-11-27 19:11:38 -08:00
ishaan-jaff  50733363ee  (feat) use api_base, api_key as model  2023-11-27 18:08:47 -08:00
Krrish Dholakia  04f745e314  fix(router.py): speed improvements to the router  2023-11-27 17:35:26 -08:00
ishaan-jaff  4265f9b2ef  (fix) router: allow same model/name  2023-11-27 16:26:09 -08:00
Krrish Dholakia  59ba1560e5  fix(router.py): fix fallbacks  2023-11-25 19:34:20 -08:00
Krrish Dholakia  fa713abfc3  fix(router.py): check for fallbacks in completion params for router  2023-11-25 18:46:45 -08:00
Krrish Dholakia  e4f302a8e2  fix(proxy_server.py): expose a /health endpoint  2023-11-25 18:28:47 -08:00
Krrish Dholakia  ab0bc87427  fix(router.py): check if fallbacks is none  2023-11-25 14:58:07 -08:00
Krrish Dholakia  95579fda7d  fix(utils.py): fix bedrock + cohere calls  2023-11-25 14:45:42 -08:00
Krrish Dholakia  d62da29cbe  fix: fix linting issues  2023-11-24 15:46:25 -08:00
Krrish Dholakia  2686894823  fix(router.py): fix retry logic  2023-11-24 13:27:44 -08:00
Krrish Dholakia  16e1070dbe  test: refactor testing order  2023-11-24 12:47:28 -08:00
Krrish Dholakia  12dbdc4c15  docs(simple_proxy.md): add tutorial for doing fallbacks + retries + timeouts on the proxy  2023-11-24 12:20:38 -08:00
Krrish Dholakia  5c18771f9d  fix(router.py): fixing embedding call  2023-11-23 21:07:02 -08:00
Krrish Dholakia  02464f6661  fix(router.py): use an older version of async for compatibility  2023-11-23 21:00:53 -08:00
Krrish Dholakia  187403c5cc  fix(router.py): add modelgroup to call metadata  2023-11-23 20:55:49 -08:00
Krrish Dholakia  7d221fe863  fix(utils.py): make failure logging sync  2023-11-23 20:19:27 -08:00
Krrish Dholakia  dc17f63d0b  fix(router.py): fix linting errors  2023-11-23 16:50:19 -08:00
Krrish Dholakia  c273d6f0d6  fix(router.py): add support for context window fallbacks on router  2023-11-23 16:43:02 -08:00
ishaan-jaff  f01865e960  (fix) router  2023-11-23 16:28:19 -08:00
Krrish Dholakia  afac42e93a  fix(router.py): enable async completions with model fallbacks  2023-11-23 16:15:57 -08:00
Krrish Dholakia  8ac03e492f  fix(router.py): enable fallbacks for sync completions  2023-11-23 16:06:46 -08:00
Krrish Dholakia  8c4e8d6c62  feat(proxy_server.py): add in-memory caching for user api keys  2023-11-23 13:21:45 -08:00
Krrish Dholakia  61fc76a8c4  fix(router.py): fix caching for tracking cooldowns + usage  2023-11-23 11:13:32 -08:00
Krrish Dholakia  2f93c0155a  fix: fix linting errors  2023-11-22 19:59:25 -08:00
Krrish Dholakia  5d5ca9f7ef  fix(router.py): add support for cooldowns with redis  2023-11-22 19:54:22 -08:00
Krrish Dholakia  3e76d4b422  feat(router.py): add server cooldown logic  2023-11-22 15:59:48 -08:00
Krrish Dholakia  76f46902ed  feat(router.py): adding latency-based routing strategy  2023-11-21 21:19:27 -08:00
ishaan-jaff  80884e9cb3  (fix) using callbacks with router  2023-11-20 19:08:53 -08:00