Commit graph

681 commits

Author SHA1 Message Date
Krrish Dholakia
381247a095 fix(router.py): fix param 2024-05-31 21:52:23 -07:00
Krrish Dholakia
e49325b234 fix(router.py): fix cooldown logic for usage-based-routing-v2 pre-call-checks 2024-05-31 21:32:01 -07:00
Krish Dholakia
08bae3185a
Merge pull request #3936 from BerriAI/litellm_assistants_api_proxy
feat(proxy_server.py): add assistants api endpoints to proxy server
2024-05-31 18:43:22 -07:00
Ishaan Jaff
f569e61638 fix - model hub supported_openai_params 2024-05-31 07:27:21 -07:00
Krrish Dholakia
e2b34165e7 feat(proxy_server.py): add assistants api endpoints to proxy server 2024-05-30 22:44:43 -07:00
Krish Dholakia
d3a247bf20
Merge pull request #3928 from BerriAI/litellm_audio_speech_endpoint
feat(main.py): support openai tts endpoint
2024-05-30 17:30:42 -07:00
Krrish Dholakia
93166cdabf fix(openai.py): fix openai response for /audio/speech endpoint 2024-05-30 16:41:06 -07:00
Krrish Dholakia
32bfb685f5 fix(router.py): cooldown on 404 errors
https://github.com/BerriAI/litellm/issues/3884
2024-05-30 10:57:38 -07:00
Krrish Dholakia
1d18ca6a7d fix(router.py): security fix - don't show api key in invalid model setup error message 2024-05-29 16:14:57 -07:00
Krish Dholakia
e838bd1c79
Merge branch 'main' into litellm_batch_completions 2024-05-28 22:38:05 -07:00
Ishaan Jaff
9ab96e12ed fix - update abatch_completion docstring 2024-05-28 22:27:09 -07:00
Ishaan Jaff
473ec66b84 feat - router add abatch_completion 2024-05-28 22:19:33 -07:00
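A minimal sketch of how the new `abatch_completion` router method might be called: fan one request out to several model groups and collect one response per model. The deployment list, API keys, and exact signature are assumptions, not taken from the commit.

```python
import asyncio
from litellm import Router

# Placeholder deployments; model names and API keys are illustrative only.
router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "sk-..."},
        },
        {
            "model_name": "claude-3-haiku",
            "litellm_params": {
                "model": "anthropic/claude-3-haiku-20240307",
                "api_key": "sk-ant-...",
            },
        },
    ]
)

async def main():
    # Assumed call shape: one message list, sent to each listed model group.
    responses = await router.abatch_completion(
        models=["gpt-3.5-turbo", "claude-3-haiku"],
        messages=[{"role": "user", "content": "hello"}],
    )
    for r in responses:
        print(r)

asyncio.run(main())
```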
Krrish Dholakia
e3000504f9 fix(router.py): support batch completions fastest response streaming 2024-05-28 21:51:09 -07:00
Krrish Dholakia
1ebae6e7b0 fix(router.py): support comma-separated model list for batch completion fastest response 2024-05-28 21:34:37 -07:00
Krrish Dholakia
20106715d5 feat(proxy_server.py): enable batch completion fastest response calls on proxy
introduces new `fastest_response` flag for enabling the call
2024-05-28 20:09:31 -07:00
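A rough sketch of what enabling the flag could look like from an OpenAI-compatible client pointed at a running litellm proxy. The comma-separated model string echoes the router commit above; the host, key, and the exact placement of `fastest_response` in the request body are assumptions.

```python
import openai

# Hypothetical request against a locally running litellm proxy.
client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")

response = client.chat.completions.create(
    # Comma-separated candidate models, per the router commit above.
    model="gpt-4o, groq/llama3-70b-8192",
    messages=[{"role": "user", "content": "ping"}],
    # Assumed: the flag travels in the request body as an extra field.
    extra_body={"fastest_response": True},
)
print(response)
```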
Krrish Dholakia
ecd182eb6a feat(router.py): support fastest response batch completion call
returns fastest response. cancels others.
2024-05-28 19:44:41 -07:00
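The commit body describes a race: return whichever deployment answers first and cancel the rest. A self-contained illustration of that pattern with `asyncio` (not litellm's actual implementation):

```python
import asyncio

async def fastest_response(calls):
    """Race several coroutines; return the first result and cancel the rest."""
    tasks = [asyncio.create_task(c) for c in calls]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()

async def fake_deployment(name, delay):
    # Stand-in for a model call with a given latency.
    await asyncio.sleep(delay)
    return f"response from {name}"

async def main():
    winner = await fastest_response(
        [fake_deployment("gpt-4o", 0.3), fake_deployment("groq-llama", 0.1)]
    )
    print(winner)  # -> response from groq-llama

asyncio.run(main())
```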
Krish Dholakia
37f11162d2
Merge pull request #3847 from paneru-rajan/improve-validate-fallback-method
Improve validate-fallbacks method
2024-05-27 18:18:35 -07:00
Krrish Dholakia
67da24f144 fix(fix-'get_model_group_info'-to-return-a-default-value-if-unmapped-model-group): allows model hub to return all model groups 2024-05-27 13:53:01 -07:00
Ishaan Jaff
b5f883ab74 feat - show openai params on model hub ui 2024-05-27 08:49:51 -07:00
Krrish Dholakia
22b6b99b34 feat(proxy_server.py): expose new /model_group/info endpoint
returns model-group level info on supported params, max tokens, pricing, etc.
2024-05-26 14:07:35 -07:00
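A hypothetical call against the new endpoint on a locally running proxy; the path comes from the commit message, while the host, port, and auth header are placeholders.

```python
import requests

resp = requests.get(
    "http://0.0.0.0:4000/model_group/info",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder proxy key
)
# Per the commit body: model-group level info such as supported params,
# max tokens, and pricing.
print(resp.json())
```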
sujan100000
e02328c9f1 Improve validate-fallbacks method
* No need to check for fallback_params length
* Instead of asserting, used an if condition and raised ValueError
* Improved Error message
2024-05-26 19:09:07 +09:30
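A minimal sketch of the pattern this PR describes: replace an `assert` with an explicit check that raises `ValueError` with a clearer message. Function and field names here are illustrative, not litellm's actual code.

```python
def validate_fallbacks(fallback_params) -> None:
    # Explicit raise instead of assert: still enforced when Python runs with -O
    # (which strips asserts), and the error message can say exactly what was wrong.
    if fallback_params is None:
        return
    if not isinstance(fallback_params, list):
        raise ValueError(
            f"fallbacks must be a list of dicts, got {type(fallback_params).__name__}"
        )
    for entry in fallback_params:
        if not isinstance(entry, dict):
            raise ValueError(f"each fallback entry must be a dict, got {entry!r}")
```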
Krrish Dholakia
ae787645da fix(router.py): fix pre call check
only check if response_format supported by model, if pre-call check enabled
2024-05-24 20:09:15 -07:00
Krrish Dholakia
8dec87425e feat(slack_alerting.py): refactor region outage alerting to do model based alerting instead
Unable to extract the Azure region from the API base, so it makes sense to start with model-based alerting and then move to region-based alerting
2024-05-24 19:10:33 -07:00
Ishaan Jaff
7d2c76f640 fix test_filter_invalid_params_pre_call_check 2024-05-23 21:16:32 -07:00
Krrish Dholakia
f04e4b921b feat(ui/model_dashboard.tsx): add databricks models via admin ui 2024-05-23 20:28:54 -07:00
Krrish Dholakia
988970f4c2 feat(router.py): Fixes https://github.com/BerriAI/litellm/issues/3769 2024-05-21 17:24:51 -07:00
Krish Dholakia
2cda5a2bc3
Merge pull request #3412 from sumanth13131/usage-based-routing-ttl-on-cache
usage-based-routing-ttl-on-cache
2024-05-21 07:58:41 -07:00
Ishaan Jaff
92a4df00d4 fix add doc string for abatch_completion_one_model_multiple_requests 2024-05-20 17:51:08 -07:00
Ishaan Jaff
5be966dc09 feat - add abatch_completion_one_model_multiple_requests 2024-05-20 17:47:25 -07:00
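Judging by the name, this sends several independent message lists to a single model group in one call. The snippet below is an assumed usage shape, not a documented signature.

```python
import asyncio
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "sk-..."},
        },
    ]
)

async def main():
    # Assumed shape: one model group, a list of message lists, one response each.
    responses = await router.abatch_completion_one_model_multiple_requests(
        model="gpt-3.5-turbo",
        messages=[
            [{"role": "user", "content": "what's the capital of France?"}],
            [{"role": "user", "content": "write a haiku about routers"}],
        ],
    )
    print(responses)

asyncio.run(main())
```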
Ishaan Jaff
8413fdf4c7
Merge branch 'main' into litellm_standardize_slack_exception_msg_format 2024-05-20 16:39:41 -07:00
Ishaan Jaff
f11de863f6 fix - standardize format of exceptions occurring on slack alerts 2024-05-20 16:29:16 -07:00
Ishaan Jaff
6368d5a725 feat - read cooldown time from exception header 2024-05-17 18:50:33 -07:00
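A sketch of the idea behind this commit: derive the cooldown period from the provider's response headers (for example `Retry-After`) instead of a fixed default. The header name and fallback value are assumptions, not taken from the commit.

```python
def cooldown_seconds_from_exception(exc: Exception, default: float = 60.0) -> float:
    """Read a cooldown duration from an exception's response headers, if present."""
    headers = getattr(exc, "headers", None) or {}
    retry_after = headers.get("Retry-After") or headers.get("retry-after")
    try:
        return float(retry_after)
    except (TypeError, ValueError):
        # No usable header on the exception; fall back to the default cooldown.
        return default
```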
David Manouchehri
50accc327c
Fix(router.py): Kill a bug that forced Azure OpenAI to have an API key, even though we can use OIDC instead. 2024-05-17 00:37:56 +00:00
Ishaan Jaff
d16a6c03a2 feat - include model name in cool down alerts 2024-05-16 12:52:15 -07:00
Ishaan Jaff
848561a8a7 fix - router show better client side errors 2024-05-16 09:01:27 -07:00
Krrish Dholakia
d9ad7c6218 fix(router.py): fix validation error for default fallback 2024-05-15 13:23:00 -07:00
Krrish Dholakia
dba713ea43 fix(router.py): add validation for how router fallbacks are setup
prevent user errors
2024-05-15 10:44:16 -07:00
Ishaan Jaff
f17f0a09d8 feat - router use _is_cooldown_required 2024-05-15 10:03:55 -07:00
Ishaan Jaff
52f8c39bbf feat - don't cooldown deployment on BadRequestError 2024-05-15 09:03:27 -07:00
Krrish Dholakia
151902f7d9 fix(router.py): error string fix 2024-05-14 11:20:57 -07:00
Krrish Dholakia
7557b3e2ff fix(init.py): set 'default_fallbacks' as a litellm_setting 2024-05-14 11:15:53 -07:00
sumanth
71e0294485 addressed comments 2024-05-14 10:05:19 +05:30
Krrish Dholakia
38988f030a fix(router.py): fix typing 2024-05-13 18:06:10 -07:00
Krrish Dholakia
5488bf4921 feat(router.py): enable default fallbacks
allow user to define a generic list of fallbacks, in case a new deployment is bad

Closes https://github.com/BerriAI/litellm/issues/3623
2024-05-13 17:49:56 -07:00
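Per the commit body, a generic fallback list covers any deployment that has no specific fallbacks configured (for example a freshly added, possibly misconfigured one). A sketch of defining it on the Router: the `default_fallbacks` name mirrors the litellm_setting from the fix(init.py) commit above, but the exact constructor signature is an assumption.

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "new-azure-deployment",  # the deployment that might be bad
            "litellm_params": {
                "model": "azure/gpt-4o",
                "api_key": "...",
                "api_base": "https://example-resource.openai.azure.com",
            },
        },
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "sk-..."},
        },
    ],
    # Assumed parameter: used whenever a model group has no specific fallback.
    default_fallbacks=["gpt-3.5-turbo"],
)
```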
Krrish Dholakia
af9489bbfd fix(router.py): overloads fix 2024-05-13 17:04:04 -07:00
Krrish Dholakia
1312eece6d fix(router.py): overloads for better router.acompletion typing 2024-05-13 14:27:16 -07:00
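An illustration of the `typing.overload` technique the commit refers to: give `acompletion` different static return types depending on whether `stream` is set, so callers see a precise type for each case. Simplified, with stand-in response classes; not litellm's actual signatures.

```python
from typing import Literal, Union, overload

class StreamChunkIterator:
    """Stand-in for a streaming response wrapper."""

class ModelResponse:
    """Stand-in for a full (non-streaming) response."""

@overload
async def acompletion(
    model: str, messages: list, stream: Literal[True]
) -> StreamChunkIterator: ...
@overload
async def acompletion(
    model: str, messages: list, stream: Literal[False] = False
) -> ModelResponse: ...

async def acompletion(
    model: str, messages: list, stream: bool = False
) -> Union[StreamChunkIterator, ModelResponse]:
    # Real code would dispatch to a deployment; type checkers only read the
    # overloads above, so each call site gets the matching return type.
    return StreamChunkIterator() if stream else ModelResponse()
```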
Krrish Dholakia
7f6e933372 fix(router.py): give an 'info' log when fallbacks work successfully 2024-05-13 10:17:32 -07:00
Krrish Dholakia
13e1577753 fix(slack_alerting.py): don't fire spam alerts when backend api call fails 2024-05-13 10:04:43 -07:00
Krrish Dholakia
5342b3dc05 fix(router.py): fix error message to return if pre-call-checks + allowed model region 2024-05-13 09:04:38 -07:00
Krish Dholakia
1d651c6049
Merge branch 'main' into litellm_bedrock_command_r_support 2024-05-11 21:24:42 -07:00