ishaan-jaff | 97ff0caf70 | (feat) proxy: config - azure allow users to pass in base_url | 2023-11-30 10:56:55 -08:00
Krrish Dholakia | c312ac4ca8 | fix(main.py): don't pass stream to petals | 2023-11-29 19:58:04 -08:00
ishaan-jaff | 9780efca4b | (feat) router: async client Azure, OpenAI | 2023-11-29 19:45:08 -08:00
Krrish Dholakia | 1f5a1122fc | fix(replicate.py): fix custom prompt formatting | 2023-11-29 19:44:09 -08:00
ishaan-jaff | 3891462b29 | (fix) router: azure/embedding support | 2023-11-29 19:06:36 -08:00
ishaan-jaff | 7bcc23e8e9 | (fix) router: set default rpm/tpm when not set | 2023-11-29 18:13:27 -08:00
ishaan-jaff | 23af756531 | (feat) router: random pick based on tpm/rpm | 2023-11-29 17:54:06 -08:00
ishaan-jaff | 088d2bc081 | (fix) use weighted shuffle when rpm set | 2023-11-29 17:13:11 -08:00
ishaan-jaff | 3c6764efef | (feat) proxy+ router: support 1k request/second | 2023-11-29 16:22:04 -08:00
ishaan-jaff | 8a398a1777 | (feat) proxy: add weighted shuffle + set cooldown to 1s | 2023-11-29 16:09:31 -08:00
Krrish Dholakia | 04a1c20bc5 | fix(router.py): skip api key when generating model id for router deployments | 2023-11-29 15:37:08 -08:00
Krrish Dholakia | 383dd53e86 | fix(main.py): passing client as a litellm-specific kwarg | 2023-11-28 21:20:05 -08:00
ishaan-jaff | afd20098be | (feat) router: init client for OpenAI compatible providers | 2023-11-28 17:49:53 -08:00
Krrish Dholakia | bb1267eb07 | fix(router.py): fix exponential backoff to use retry-after if present in headers | 2023-11-28 17:25:03 -08:00
ishaan-jaff | f5b558dde0 | (fix) router red api_key, api_base, api_version | 2023-11-28 17:10:20 -08:00
ishaan-jaff | 282b9a37e5 | (fix) router: passing client | 2023-11-28 16:34:16 -08:00
ishaan-jaff | 4d06c296e3 | (router) re use client across requests | 2023-11-28 16:21:16 -08:00
ishaan-jaff | 94d35f1ec5 | (feat) router: re-use the same client for high trafic | 2023-11-28 15:44:56 -08:00
ishaan-jaff | 2a69cab550 | (feat) router track total, success, failed calls per model | 2023-11-28 15:44:56 -08:00
Krrish Dholakia | 094144de58 | fix(router.py): removing model id before making call | 2023-11-28 10:09:45 -08:00
Krrish Dholakia | 6149642295 | refactor(router.py): fix linting errors | 2023-11-27 22:11:53 -08:00
Krrish Dholakia | 82d79638d4 | refactor(router.py): fix linting errors | 2023-11-27 22:08:48 -08:00
Krrish Dholakia | c4aea7432f | build: adding debug logs to gitignore | 2023-11-27 22:05:07 -08:00
Krrish Dholakia | be9fa06da6 | fix(main.py): fix linting errors | 2023-11-27 19:11:38 -08:00
ishaan-jaff | 50733363ee | (feat) use api_base, api_key as model | 2023-11-27 18:08:47 -08:00
Krrish Dholakia | 04f745e314 | fix(router.py): speed improvements to the router | 2023-11-27 17:35:26 -08:00
ishaan-jaff | 4265f9b2ef | (fix) router: allow same model/name | 2023-11-27 16:26:09 -08:00
Krrish Dholakia | 59ba1560e5 | fix(router.py): fix fallbacks | 2023-11-25 19:34:20 -08:00
Krrish Dholakia | fa713abfc3 | fix(router.py): check for fallbacks in completion params for router | 2023-11-25 18:46:45 -08:00
Krrish Dholakia | e4f302a8e2 | fix(proxy_server.py): expose a /health endpoint | 2023-11-25 18:28:47 -08:00
Krrish Dholakia | ab0bc87427 | fix(router.py): check if fallbacks is none | 2023-11-25 14:58:07 -08:00
Krrish Dholakia | 95579fda7d | fix(utils.py): fix bedrock + cohere calls | 2023-11-25 14:45:42 -08:00
Krrish Dholakia | d62da29cbe | fix: fix linting issues | 2023-11-24 15:46:25 -08:00
Krrish Dholakia | 2686894823 | fix(router.py): fix retry logic | 2023-11-24 13:27:44 -08:00
Krrish Dholakia | 16e1070dbe | test: refactor testing order | 2023-11-24 12:47:28 -08:00
Krrish Dholakia | 12dbdc4c15 | docs(simple_proxy.md): add tutorial for doing fallbacks + retries + timeouts on the proxy | 2023-11-24 12:20:38 -08:00
Krrish Dholakia | 5c18771f9d | fix(router.py): fixing embedding call | 2023-11-23 21:07:02 -08:00
Krrish Dholakia | 02464f6661 | fix(router.py): use an older version of async for compatibility | 2023-11-23 21:00:53 -08:00
Krrish Dholakia | 187403c5cc | fix(router.py): add modelgroup to call metadata | 2023-11-23 20:55:49 -08:00
Krrish Dholakia | 7d221fe863 | fix(utils.py): make failure logging sync | 2023-11-23 20:19:27 -08:00
Krrish Dholakia | dc17f63d0b | fix(router.py): fix linting errors | 2023-11-23 16:50:19 -08:00
Krrish Dholakia | c273d6f0d6 | fix(router.py): add support for context window fallbacks on router | 2023-11-23 16:43:02 -08:00
ishaan-jaff | f01865e960 | (fix) router | 2023-11-23 16:28:19 -08:00
Krrish Dholakia | afac42e93a | fix(router.py): enable async completions with model fallbacks | 2023-11-23 16:15:57 -08:00
Krrish Dholakia | 8ac03e492f | fix(router.py): enable fallbacks for sync completions | 2023-11-23 16:06:46 -08:00
Krrish Dholakia | 8c4e8d6c62 | feat(proxy_server.py): add in-memory caching for user api keys | 2023-11-23 13:21:45 -08:00
Krrish Dholakia | 61fc76a8c4 | fix(router.py): fix caching for tracking cooldowns + usage | 2023-11-23 11:13:32 -08:00
Krrish Dholakia | 2f93c0155a | fix: fix linting errors | 2023-11-22 19:59:25 -08:00
Krrish Dholakia | 5d5ca9f7ef | fix(router.py): add support for cooldowns with redis | 2023-11-22 19:54:22 -08:00
Krrish Dholakia | 3e76d4b422 | feat(router.py): add server cooldown logic | 2023-11-22 15:59:48 -08:00