Commit graph

443 commits

Krrish Dholakia
a019fd05e3 fix(router.py): fix should_retry logic for authentication errors 2024-06-03 13:12:00 -07:00
Ishaan Jaff
2ce5dc0dfd ci/cd run again 2024-06-01 21:19:32 -07:00
Ishaan Jaff
309a66692f fix test_rate_limit[usage-based-routing-True-3-2] 2024-06-01 21:18:23 -07:00
Ishaan Jaff
373a41ca6d fix async_function_with_retries 2024-06-01 19:00:22 -07:00
Ishaan Jaff
054456c50e
Merge pull request #3963 from BerriAI/litellm_set_allowed_fail_policy
[FEAT] set custom AllowedFailsPolicy on litellm.Router
2024-06-01 17:57:11 -07:00
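The AllowedFailsPolicy feature merged above lets a deployment tolerate a configurable number of failures per error type before it is cooled down. A minimal sketch of the idea — the field names and helper below are illustrative assumptions, not litellm's exact schema:

```python
from dataclasses import dataclass

@dataclass
class AllowedFailsPolicy:
    # Per-error-type fail budgets; field names here are illustrative.
    BadRequestErrorAllowedFails: int = 0
    TimeoutErrorAllowedFails: int = 0

def should_cooldown(policy: AllowedFailsPolicy, error_name: str, fail_count: int) -> bool:
    # Cool a deployment down only once its fail count exceeds the
    # budget configured for that exception type (unknown types get 0).
    allowed = getattr(policy, f"{error_name}AllowedFails", 0)
    return fail_count > allowed

policy = AllowedFailsPolicy(TimeoutErrorAllowedFails=3)
print(should_cooldown(policy, "TimeoutError", 2))  # within budget
print(should_cooldown(policy, "TimeoutError", 4))  # budget exceeded
```

This keeps transient timeouts from benching a deployment while still reacting quickly to error types with a zero budget.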
Ishaan Jaff
fb49d036fb
Merge pull request #3962 from BerriAI/litellm_return_num_rets_max_exceptions
[Feat] return `num_retries` and `max_retries` in exceptions
2024-06-01 17:48:38 -07:00
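The retry-metadata change merged above surfaces how many attempts were made when a call ultimately fails. The general technique can be sketched as follows — the attribute names come from the PR title, but the helper itself is hypothetical:

```python
def attach_retry_info(exc: Exception, num_retries: int, max_retries: int) -> Exception:
    # Annotate the exception so callers can inspect retry context
    # directly instead of parsing it out of the error message.
    setattr(exc, "num_retries", num_retries)
    setattr(exc, "max_retries", max_retries)
    return exc

try:
    raise attach_retry_info(TimeoutError("request timed out"), num_retries=2, max_retries=3)
except TimeoutError as e:
    print(e.num_retries, e.max_retries)  # → 2 3
```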
Ishaan Jaff
d4378143f1 fix current_attempt, num_retries not defined 2024-06-01 17:42:37 -07:00
Ishaan Jaff
eb203c051a feat - set custom AllowedFailsPolicy 2024-06-01 17:26:21 -07:00
Ishaan Jaff
cfc55b39a9 fix - return in LITELLM_EXCEPTION_TYPES 2024-06-01 17:05:33 -07:00
Ishaan Jaff
286d42a881 feat - add num retries and max retries in exception 2024-06-01 16:53:00 -07:00
Krrish Dholakia
7715267989 fix(router.py): simplify scheduler
move the scheduler poll queuing logic into the router class, making it easier to use
2024-06-01 16:09:57 -07:00
Krish Dholakia
8375e9621c
Merge pull request #3954 from BerriAI/litellm_simple_request_prioritization
feat(scheduler.py): add request prioritization scheduler
2024-05-31 23:29:09 -07:00
Krrish Dholakia
381247a095 fix(router.py): fix param 2024-05-31 21:52:23 -07:00
Krrish Dholakia
e49325b234 fix(router.py): fix cooldown logic for usage-based-routing-v2 pre-call-checks 2024-05-31 21:32:01 -07:00
Krish Dholakia
08bae3185a
Merge pull request #3936 from BerriAI/litellm_assistants_api_proxy
feat(proxy_server.py): add assistants api endpoints to proxy server
2024-05-31 18:43:22 -07:00
Ishaan Jaff
f569e61638 fix - model hub supported_openai_params 2024-05-31 07:27:21 -07:00
Krrish Dholakia
e2b34165e7 feat(proxy_server.py): add assistants api endpoints to proxy server 2024-05-30 22:44:43 -07:00
Krish Dholakia
d3a247bf20
Merge pull request #3928 from BerriAI/litellm_audio_speech_endpoint
feat(main.py): support openai tts endpoint
2024-05-30 17:30:42 -07:00
Krrish Dholakia
93166cdabf fix(openai.py): fix openai response for /audio/speech endpoint 2024-05-30 16:41:06 -07:00
Krrish Dholakia
32bfb685f5 fix(router.py): cooldown on 404 errors
https://github.com/BerriAI/litellm/issues/3884
2024-05-30 10:57:38 -07:00
Krrish Dholakia
1d18ca6a7d fix(router.py): security fix - don't show api key in invalid model setup error message 2024-05-29 16:14:57 -07:00
Krish Dholakia
e838bd1c79
Merge branch 'main' into litellm_batch_completions 2024-05-28 22:38:05 -07:00
Ishaan Jaff
9ab96e12ed fix - update abatch_completion docstring 2024-05-28 22:27:09 -07:00
Ishaan Jaff
473ec66b84 feat - router add abatch_completion 2024-05-28 22:19:33 -07:00
Krrish Dholakia
e3000504f9 fix(router.py): support batch completions fastest response streaming 2024-05-28 21:51:09 -07:00
Krrish Dholakia
1ebae6e7b0 fix(router.py): support comma-separated model list for batch completion fastest response 2024-05-28 21:34:37 -07:00
Krrish Dholakia
20106715d5 feat(proxy_server.py): enable batch completion fastest response calls on proxy
introduces new `fastest_response` flag for enabling the call
2024-05-28 20:09:31 -07:00
Krrish Dholakia
ecd182eb6a feat(router.py): support fastest response batch completion call
returns fastest response. cancels others.
2024-05-28 19:44:41 -07:00
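The fastest-response call described above fans one request out to several deployments, returns the first completion, and cancels the rest. The core pattern can be sketched with asyncio — the model names and stub call below are illustrative, not litellm's implementation:

```python
import asyncio

async def fastest_response(coros):
    # Race all completion calls; keep the first finisher, cancel the rest.
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()

async def fake_completion(model: str, latency: float) -> str:
    # Stand-in for a real provider call.
    await asyncio.sleep(latency)
    return f"response from {model}"

winner = asyncio.run(
    fastest_response([fake_completion("model-a", 0.01), fake_completion("model-b", 0.5)])
)
print(winner)  # → response from model-a
```

Cancelling the losers matters: without it, the slower calls keep consuming tokens and rate-limit headroom after the winner has already been returned.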
Krish Dholakia
37f11162d2
Merge pull request #3847 from paneru-rajan/improve-validate-fallback-method
Improve validate-fallbacks method
2024-05-27 18:18:35 -07:00
Krrish Dholakia
67da24f144 fix(fix-'get_model_group_info'-to-return-a-default-value-if-unmapped-model-group): allows model hub to return all model groups 2024-05-27 13:53:01 -07:00

Ishaan Jaff
b5f883ab74 feat - show openai params on model hub ui 2024-05-27 08:49:51 -07:00
Krrish Dholakia
22b6b99b34 feat(proxy_server.py): expose new /model_group/info endpoint
returns model-group level info on supported params, max tokens, pricing, etc.
2024-05-26 14:07:35 -07:00
sujan100000
e02328c9f1 Improve validate-fallbacks method
* No need to check the fallback_params length
* Instead of asserting, used an if condition and raised ValueError
* Improved the error message
2024-05-26 19:09:07 +09:30
Krrish Dholakia
ae787645da fix(router.py): fix pre call check
only check if response_format supported by model, if pre-call check enabled
2024-05-24 20:09:15 -07:00
Krrish Dholakia
8dec87425e feat(slack_alerting.py): refactor region outage alerting to do model-based alerting instead
Unable to extract the Azure region from the API base; makes sense to start with model-based alerting and then move to region-based.
2024-05-24 19:10:33 -07:00
Ishaan Jaff
7d2c76f640 fix test_filter_invalid_params_pre_call_check 2024-05-23 21:16:32 -07:00
Krrish Dholakia
f04e4b921b feat(ui/model_dashboard.tsx): add databricks models via admin ui 2024-05-23 20:28:54 -07:00
Krrish Dholakia
988970f4c2 feat(router.py): Fixes https://github.com/BerriAI/litellm/issues/3769 2024-05-21 17:24:51 -07:00
Krish Dholakia
2cda5a2bc3
Merge pull request #3412 from sumanth13131/usage-based-routing-ttl-on-cache
usage-based-routing-ttl-on-cache
2024-05-21 07:58:41 -07:00
Ishaan Jaff
92a4df00d4 fix add doc string for abatch_completion_one_model_multiple_requests 2024-05-20 17:51:08 -07:00
Ishaan Jaff
5be966dc09 feat - add abatch_completion_one_model_multiple_requests 2024-05-20 17:47:25 -07:00
Ishaan Jaff
8413fdf4c7
Merge branch 'main' into litellm_standardize_slack_exception_msg_format 2024-05-20 16:39:41 -07:00
Ishaan Jaff
f11de863f6 fix - standardize format of exceptions occurring on slack alerts 2024-05-20 16:29:16 -07:00
Ishaan Jaff
6368d5a725 feat - read cooldown time from exception header 2024-05-17 18:50:33 -07:00
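Reading the cooldown time from the exception header, as in the commit above, generally means honoring a provider-supplied Retry-After value when one is present. A minimal sketch of that idea — the header name and fallback default are assumptions, not litellm's exact implementation:

```python
def cooldown_seconds(headers: dict, default: float = 60.0) -> float:
    # Prefer the provider-supplied Retry-After header; fall back to
    # a fixed default when it is absent or unparseable.
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return float(value)
        except ValueError:
            pass
    return default

print(cooldown_seconds({"Retry-After": "30"}))  # → 30.0
print(cooldown_seconds({}))                     # → 60.0
```

Respecting the header lets the router resume exactly when the provider says capacity is back, instead of guessing with a fixed cooldown.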
David Manouchehri
50accc327c
Fix(router.py): Kill a bug that forced Azure OpenAI to have an API key, even though we can use OIDC instead. 2024-05-17 00:37:56 +00:00
Ishaan Jaff
d16a6c03a2 feat - include model name in cool down alerts 2024-05-16 12:52:15 -07:00
Ishaan Jaff
848561a8a7 fix - router show better client side errors 2024-05-16 09:01:27 -07:00
Krrish Dholakia
d9ad7c6218 fix(router.py): fix validation error for default fallback 2024-05-15 13:23:00 -07:00
Krrish Dholakia
dba713ea43 fix(router.py): add validation for how router fallbacks are setup
prevent user errors
2024-05-15 10:44:16 -07:00
Ishaan Jaff
f17f0a09d8 feat - router use _is_cooldown_required 2024-05-15 10:03:55 -07:00