Commit graph

2876 commits

Author SHA1 Message Date
Krrish Dholakia
402d59d0ff fix(lowest_tpm_rpm_routing.py): broaden scope of get deployment logic 2023-12-30 13:27:50 +05:30
Krrish Dholakia
7be5f74b70 fix(aimage_generation): fix response type 2023-12-30 12:53:24 +05:30
Krrish Dholakia
4d239f1e65 fix(openai.py): fix async image gen call 2023-12-30 12:44:54 +05:30
Krrish Dholakia
11b039193a test(test_least_busy_routing.py): fix test init 2023-12-30 12:39:13 +05:30
ishaan-jaff
31bdcb48af (fix) use cloudflare optional params 2023-12-30 12:22:31 +05:30
Krrish Dholakia
b69ffb3738 fix: support dynamic timeouts for openai and azure 2023-12-30 12:14:02 +05:30
Krrish Dholakia
7d55a563ee fix(main.py): don't set timeout as an optional api param 2023-12-30 11:47:07 +05:30
ishaan-jaff
040c127104 (fix) batch_completions - set default timeout 2023-12-30 11:35:55 +05:30
Krrish Dholakia
e1925d0e29 fix(router.py): support retry and fallbacks for atext_completion 2023-12-30 11:19:32 +05:30
ishaan-jaff
fa4a533e91 (fix) timeout optional param 2023-12-30 11:07:52 +05:30
ishaan-jaff
3b99a2cffa (test) dynamic timeout on router 2023-12-30 10:56:07 +05:30
ishaan-jaff
d5cbef4e36 (feat) proxy - support dynamic timeout per request 2023-12-30 10:55:42 +05:30
ishaan-jaff
b5e819300b (feat) router, add ModelResponse type hints 2023-12-30 10:44:13 +05:30
Krrish Dholakia
a11940f4eb fix(router.py): handle initial scenario for tpm/rpm routing 2023-12-30 07:28:45 +05:30
Marmik Pandya
1faad4b0c1 add support for mistral json mode via anyscale 2023-12-29 22:26:22 +05:30
Krrish Dholakia
1933d44cbd fix(router.py): fix int logic 2023-12-29 20:41:56 +05:30
Krrish Dholakia
a30f00276b refactor(lowest_tpm_rpm.py): move tpm/rpm based routing to a separate file for better testing 2023-12-29 18:33:43 +05:30
Krrish Dholakia
3fa1bb9f08 test(test_least_busy_router.py): add better testing for least busy routing 2023-12-29 17:16:00 +05:30
Krrish Dholakia
ffe2350428 fix(least_busy.py): support consistent use of model id instead of deployment name 2023-12-29 17:05:26 +05:30
ishaan-jaff
6fac130e30 (test) gemini-pro-vision cost tracking 2023-12-29 16:31:28 +05:30
ishaan-jaff
224d38ba48 (fix) vertex ai - use usage from response 2023-12-29 16:30:25 +05:30
ishaan-jaff
7afc022ad3 (fix) counting streaming prompt tokens - azure 2023-12-29 16:13:52 +05:30
ishaan-jaff
4f832bce52 (fix) token_counter for tool calling 2023-12-29 15:54:03 +05:30
ishaan-jaff
ac19302d0d (test) stream chunk builder - azure prompt tokens 2023-12-29 15:45:41 +05:30
ishaan-jaff
95a98f5463 (test) test_token_counter_azure 2023-12-29 15:37:46 +05:30
ishaan-jaff
806551ff99 (fix) use openai token counter for azure llms 2023-12-29 15:37:46 +05:30
Krrish Dholakia
90ed04d992 fix(router.py): support wait_for for async completion calls 2023-12-29 15:27:20 +05:30
Krrish Dholakia
b9dd46de6c feat(router.py): support 'retry_after' param, to set min timeout before retrying a failed request (default 0) 2023-12-29 15:18:28 +05:30
ishaan-jaff
a20331e47a (test) token_counter - prompt tokens == tokens from API 2023-12-29 15:15:39 +05:30
ishaan-jaff
70376d3a4f (feat) azure stream - count correct prompt tokens 2023-12-29 15:15:39 +05:30
Krrish Dholakia
3c50177314 fix(caching.py): hash the cache key to prevent key too long errors 2023-12-29 15:03:33 +05:30
Krrish Dholakia
f0aec09c8a docs(load_test.md): add litellm load test script to docs 2023-12-29 13:41:44 +05:30
ishaan-jaff
8475fddc78 (feat) cloudflare - add exception mapping 2023-12-29 12:31:10 +05:30
ishaan-jaff
a21f135fff (test) async + stream cloudflare 2023-12-29 12:03:29 +05:30
ishaan-jaff
27f8598867 (feat) add cloudflare streaming 2023-12-29 12:01:26 +05:30
ishaan-jaff
b7539df9b3 (test) async cloudflare 2023-12-29 11:50:09 +05:30
ishaan-jaff
c69f4f17a5 (feat) cloudflare - add optional params 2023-12-29 11:50:09 +05:30
Krrish Dholakia
9cf43cd5dc refactor: move async text completion testing to test_text_completion.py 2023-12-29 11:46:40 +05:30
ishaan-jaff
4425368065 (test) test cloudflare completion 2023-12-29 11:34:58 +05:30
ishaan-jaff
b990fc8324 (feat) cloudflare ai workers - add completion support 2023-12-29 11:34:58 +05:30
Krrish Dholakia
a88f07dc60 fix(main.py): fix async text completion streaming + add new tests 2023-12-29 11:33:42 +05:30
ishaan-jaff
e212afc46c (ci/cd) set num retries for HF test 2023-12-29 10:52:45 +05:30
ishaan-jaff
796e735881 (feat) v0 adding cloudflare 2023-12-29 09:32:29 +05:30
ishaan-jaff
5d31bea9e0 (fix) tg AI cost tracking - zero-one-ai/Yi-34B-Chat 2023-12-29 09:14:07 +05:30
ishaan-jaff
362bed6ca3 (fix) together_ai cost tracking 2023-12-28 22:11:08 +05:30
Krrish Dholakia
5a48dac83f fix(vertex_ai.py): support function calling for gemini 2023-12-28 19:07:04 +05:30
ishaan-jaff
2a147579ec (feat) add voyage ai embeddings 2023-12-28 17:10:15 +05:30
Krrish Dholakia
8188475c16 feat(admin_ui.py): support creating keys on admin ui 2023-12-28 16:59:11 +05:30
ishaan-jaff
86afc399c4 (test) mistral-embed 2023-12-28 16:42:36 +05:30
ishaan-jaff
12c6a00938 (feat) add mistral api embeddings 2023-12-28 16:41:55 +05:30