Commit graph

1376 commits

Author SHA1 Message Date
ishaan-jaff
7458957d8e (test) xinference on litellm router 2024-01-02 16:51:08 +05:30
ishaan-jaff
1f4a328131 (test) xinference embeddings 2024-01-02 15:41:51 +05:30
Krrish Dholakia
d4da63800e fix(utils.py): support token counting for gpt-4-vision models 2024-01-02 14:41:42 +05:30
ishaan-jaff
4db20ef478 (test) proxy - pass user_config 2024-01-02 14:15:03 +05:30
Krrish Dholakia
de1d51e3de fix(lowest_tpm_rpm.py): handle null case for text/message input 2024-01-02 12:24:29 +05:30
ishaan-jaff
7fca998026 (test) proxy - use, user provided model_list 2024-01-02 12:10:34 +05:30
Krrish Dholakia
01c042fdc6 feat(router.py): add support for retry/fallbacks for async embedding calls 2024-01-02 11:54:28 +05:30
Krrish Dholakia
4f988058a1 refactor(test_router_caching.py): move tpm/rpm routing tests to separate file 2024-01-02 11:10:11 +05:30
ishaan-jaff
163b78fef1 (test) bedrock-test passing boto3 client 2024-01-02 10:23:28 +05:30
Ishaan Jaff
0e870c2746 (test) fix test_get_model_cost_map.py 2024-01-01 21:58:48 +05:30
Krrish Dholakia
4eae0c9a0d fix(router.py): correctly raise no model available error (https://github.com/BerriAI/litellm/issues/1289) 2024-01-01 21:22:42 +05:30
ishaan-jaff
988ce6ae6d (test) ci/cd 2024-01-01 13:51:27 +05:30
ishaan-jaff
d908ec7e44 (test) langfuse - set custom trace_id 2023-12-30 20:19:22 +05:30
ishaan-jaff
8d6a805312 (test) caching - context managers 2023-12-30 19:33:47 +05:30
Krrish Dholakia
d3dee9b20c test(test_lowest_latency_routing.py): add more tests 2023-12-30 17:41:42 +05:30
Krrish Dholakia
25ee96271e fix(router.py): fix latency based routing 2023-12-30 17:25:40 +05:30
Krrish Dholakia
2b56daae0d test(test_router_init.py): fix test router init 2023-12-30 16:51:39 +05:30
Krrish Dholakia
30c9c91520 test(test_least_busy_routing.py): fix test 2023-12-30 16:12:52 +05:30
Krrish Dholakia
1ed96b1fc8 test(test_router.py): add retries 2023-12-30 15:54:46 +05:30
Krrish Dholakia
2cea8b0e83 fix(router.py): periodically re-initialize azure/openai clients to solve max conn issue 2023-12-30 15:48:34 +05:30
Krrish Dholakia
402d59d0ff fix(lowest_tpm_rpm_routing.py): broaden scope of get deployment logic 2023-12-30 13:27:50 +05:30
Krrish Dholakia
11b039193a test(test_least_busy_routing.py): fix test init 2023-12-30 12:39:13 +05:30
Krrish Dholakia
b69ffb3738 fix: support dynamic timeouts for openai and azure 2023-12-30 12:14:02 +05:30
Krrish Dholakia
7d55a563ee fix(main.py): don't set timeout as an optional api param 2023-12-30 11:47:07 +05:30
Krrish Dholakia
e1925d0e29 fix(router.py): support retry and fallbacks for atext_completion 2023-12-30 11:19:32 +05:30
ishaan-jaff
3b99a2cffa (test) dynamic timeout on router 2023-12-30 10:56:07 +05:30
Krrish Dholakia
a11940f4eb fix(router.py): handle initial scenario for tpm/rpm routing 2023-12-30 07:28:45 +05:30
Krrish Dholakia
1933d44cbd fix(router.py): fix int logic 2023-12-29 20:41:56 +05:30
Krrish Dholakia
a30f00276b refactor(lowest_tpm_rpm.py): move tpm/rpm based routing to a separate file for better testing 2023-12-29 18:33:43 +05:30
Krrish Dholakia
3fa1bb9f08 test(test_least_busy_router.py): add better testing for least busy routing 2023-12-29 17:16:00 +05:30
Krrish Dholakia
ffe2350428 fix(least_busy.py): support consistent use of model id instead of deployment name 2023-12-29 17:05:26 +05:30
ishaan-jaff
6fac130e30 (test) gemini-pro-vision cost tracking 2023-12-29 16:31:28 +05:30
ishaan-jaff
7afc022ad3 (fix) counting streaming prompt tokens - azure 2023-12-29 16:13:52 +05:30
ishaan-jaff
ac19302d0d (test) stream chunk builder - azure prompt tokens 2023-12-29 15:45:41 +05:30
ishaan-jaff
95a98f5463 (test) test_token_counter_azure 2023-12-29 15:37:46 +05:30
ishaan-jaff
a20331e47a (test) token_counter - prompt tokens == tokens from API 2023-12-29 15:15:39 +05:30
Krrish Dholakia
3c50177314 fix(caching.py): hash the cache key to prevent key too long errors 2023-12-29 15:03:33 +05:30
Krrish Dholakia
f0aec09c8a docs(load_test.md): add litellm load test script to docs 2023-12-29 13:41:44 +05:30
ishaan-jaff
a21f135fff (test) async + stream cloudflare 2023-12-29 12:03:29 +05:30
ishaan-jaff
b7539df9b3 (test) async cloudflare 2023-12-29 11:50:09 +05:30
Krrish Dholakia
9cf43cd5dc refactor: move async text completion testing to test_text_completion.py 2023-12-29 11:46:40 +05:30
ishaan-jaff
4425368065 (test) test cloudflare completion 2023-12-29 11:34:58 +05:30
Krrish Dholakia
a88f07dc60 fix(main.py): fix async text completion streaming + add new tests 2023-12-29 11:33:42 +05:30
ishaan-jaff
e212afc46c (ci/cd) set num retries for HF test 2023-12-29 10:52:45 +05:30
Krrish Dholakia
5a48dac83f fix(vertex_ai.py): support function calling for gemini 2023-12-28 19:07:04 +05:30
ishaan-jaff
2a147579ec (feat) add voyage ai embeddings 2023-12-28 17:10:15 +05:30
ishaan-jaff
86afc399c4 (test) mistral-embed 2023-12-28 16:42:36 +05:30
Krrish Dholakia
9f056c91bd test(test_proxy_custom_logger.py): fix testing to handle [done] chunks 2023-12-28 11:37:57 +05:30
Krrish Dholakia
507b6bf96e fix(utils.py): use local tiktoken copy 2023-12-28 11:22:33 +05:30
ishaan-jaff
090bc361d8 (ci/cd) run render deploy 2023-12-28 11:16:58 +05:30
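Several commits above touch recurring caching and routing concerns; for instance, commit 3c50177314 hashes the cache key to prevent "key too long" errors. A minimal, generic sketch of that technique (stdlib only; this is not litellm's actual implementation, and the helper name `make_cache_key` is hypothetical):

```python
import hashlib
import json

def make_cache_key(model: str, messages: list) -> str:
    """Derive a fixed-length cache key from a request payload.

    Long prompts would otherwise yield cache keys that exceed backend
    key-length limits (e.g. memcached caps keys at 250 bytes), so the
    serialized request is hashed down to a constant 64 hex characters.
    """
    # sort_keys=True makes serialization deterministic, so identical
    # requests always map to the same key
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

key = make_cache_key("gpt-4", [{"role": "user", "content": "hello " * 10_000}])
print(len(key))  # always 64, regardless of prompt size
```

The trade-off is that the original request is no longer recoverable from the key alone, so the cached value (not the key) must carry any metadata needed for debugging.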