Commit graph

3232 commits

Author SHA1 Message Date
Krrish Dholakia
f2d0d5584a fix(router.py): fix latency based routing 2023-12-30 17:25:40 +05:30
Krrish Dholakia
c41b1418d4 test(test_router_init.py): fix test router init 2023-12-30 16:51:39 +05:30
Krrish Dholakia
3cb7acceaa test(test_least_busy_routing.py): fix test 2023-12-30 16:12:52 +05:30
Krrish Dholakia
3935f99083 test(test_router.py): add retries 2023-12-30 15:54:46 +05:30
Krrish Dholakia
69935db239 fix(router.py): periodically re-initialize azure/openai clients to solve max conn issue 2023-12-30 15:48:34 +05:30
Krrish Dholakia
b66cf0aa43 fix(lowest_tpm_rpm_routing.py): broaden scope of get deployment logic 2023-12-30 13:27:50 +05:30
Krrish Dholakia
a6719caebd fix(aimage_generation): fix response type 2023-12-30 12:53:24 +05:30
Krrish Dholakia
750432457b fix(openai.py): fix async image gen call 2023-12-30 12:44:54 +05:30
Krrish Dholakia
2acd086596 test(test_least_busy_routing.py): fix test init 2023-12-30 12:39:13 +05:30
ishaan-jaff
535a547b66 (fix) use cloudflare optional params 2023-12-30 12:22:31 +05:30
Krrish Dholakia
c33c1d85bb fix: support dynamic timeouts for openai and azure 2023-12-30 12:14:02 +05:30
Krrish Dholakia
77be3e3114 fix(main.py): don't set timeout as an optional api param 2023-12-30 11:47:07 +05:30
ishaan-jaff
aee38d9329 (fix) batch_completions - set default timeout 2023-12-30 11:35:55 +05:30
Krrish Dholakia
38f55249e1 fix(router.py): support retry and fallbacks for atext_completion 2023-12-30 11:19:32 +05:30
ishaan-jaff
5d6954895f (fix) timeout optional param 2023-12-30 11:07:52 +05:30
ishaan-jaff
523415cb0c (test) dynamic timeout on router 2023-12-30 10:56:07 +05:30
ishaan-jaff
2f4cd3b569 (feat) proxy - support dynamic timeout per request 2023-12-30 10:55:42 +05:30
ishaan-jaff
459ba5b45e (feat) router, add ModelResponse type hints 2023-12-30 10:44:13 +05:30
Krrish Dholakia
a34de56289 fix(router.py): handle initial scenario for tpm/rpm routing 2023-12-30 07:28:45 +05:30
Marmik Pandya
1426594d3f add support for mistral json mode via anyscale 2023-12-29 22:26:22 +05:30
Krrish Dholakia
2fc264ca04 fix(router.py): fix int logic 2023-12-29 20:41:56 +05:30
Krrish Dholakia
cf91e49c87 refactor(lowest_tpm_rpm.py): move tpm/rpm based routing to a separate file for better testing 2023-12-29 18:33:43 +05:30
Krrish Dholakia
54d7bc2cc3 test(test_least_busy_router.py): add better testing for least busy routing 2023-12-29 17:16:00 +05:30
Krrish Dholakia
678bbfa9be fix(least_busy.py): support consistent use of model id instead of deployment name 2023-12-29 17:05:26 +05:30
ishaan-jaff
06e4b301b4 (test) gemini-pro-vision cost tracking 2023-12-29 16:31:28 +05:30
ishaan-jaff
739d9e7a78 (fix) vertex ai - use usage from response 2023-12-29 16:30:25 +05:30
ishaan-jaff
e6a7212d10 (fix) counting streaming prompt tokens - azure 2023-12-29 16:13:52 +05:30
ishaan-jaff
8c03be59a8 (fix) token_counter for tool calling 2023-12-29 15:54:03 +05:30
ishaan-jaff
73f60b7315 (test) stream chunk builder - azure prompt tokens 2023-12-29 15:45:41 +05:30
ishaan-jaff
b1077ebc38 (test) test_token_counter_azure 2023-12-29 15:37:46 +05:30
ishaan-jaff
037dcbbe10 (fix) use openai token counter for azure llms 2023-12-29 15:37:46 +05:30
Krrish Dholakia
cbdfae1267 fix(router.py): support wait_for for async completion calls 2023-12-29 15:27:20 +05:30
Krrish Dholakia
4882325c35 feat(router.py): support 'retry_after' param, to set min timeout before retrying a failed request (default 0) 2023-12-29 15:18:28 +05:30
ishaan-jaff
4a028d012a (test) token_counter - prompt tokens == tokens from API 2023-12-29 15:15:39 +05:30
ishaan-jaff
a300ab9152 (feat) azure stream - count correct prompt tokens 2023-12-29 15:15:39 +05:30
Krrish Dholakia
1e07f0fce8 fix(caching.py): hash the cache key to prevent key too long errors 2023-12-29 15:03:33 +05:30
Krrish Dholakia
6e68cd1125 docs(load_test.md): add litellm load test script to docs 2023-12-29 13:41:44 +05:30
ishaan-jaff
3973b9c8e4 (feat) cloudflare - add exception mapping 2023-12-29 12:31:10 +05:30
ishaan-jaff
243ad31e90 (test) async + stream clooudflare 2023-12-29 12:03:29 +05:30
ishaan-jaff
ee682be093 (feat) add cloudflare streaming 2023-12-29 12:01:26 +05:30
ishaan-jaff
a999e80b46 (test) async cloudflare 2023-12-29 11:50:09 +05:30
ishaan-jaff
dde6bc4fb6 (feat) cloudflare - add optional params 2023-12-29 11:50:09 +05:30
Krrish Dholakia
e06840b571 refactor: move async text completion testing to test_text_completion.py 2023-12-29 11:46:40 +05:30
ishaan-jaff
5fc9524a46 (test) test cloudflare completion 2023-12-29 11:34:58 +05:30
ishaan-jaff
8fcfb7df22 (feat) cloudflare ai workers - add completion support 2023-12-29 11:34:58 +05:30
Krrish Dholakia
6f2734100f fix(main.py): fix async text completion streaming + add new tests 2023-12-29 11:33:42 +05:30
ishaan-jaff
2b8e2bd937 (ci/cd) set num retries for HF test 2023-12-29 10:52:45 +05:30
ishaan-jaff
367e9913dc (feat) v0 adding cloudflare 2023-12-29 09:32:29 +05:30
ishaan-jaff
daf32f3bd4 (fix) tg AI cost tracking - zero-one-ai/Yi-34B-Chat 2023-12-29 09:14:07 +05:30
ishaan-jaff
d79df3a1e9 (fix) together_ai cost tracking 2023-12-28 22:11:08 +05:30