Commit graph

657 commits

Author SHA1 Message Date
Ishaan Jaff
dfe874c9e5 test - client side fallbacks 2024-06-10 15:00:36 -07:00
Ishaan Jaff
a9006b965f fix - support fallbacks as list 2024-06-10 14:32:28 -07:00
Krrish Dholakia
6306914e56 fix(types/router.py): modelgroupinfo to handle mode being None and supported_openai_params not being a list 2024-06-08 20:13:45 -07:00
Krish Dholakia
471be6670c Merge pull request #4049 from BerriAI/litellm_cleanup_traceback
refactor: replace 'traceback.print_exc()' with logging library
2024-06-07 08:03:22 -07:00
Krish Dholakia
1742141fb6 Merge pull request #4046 from BerriAI/litellm_router_order
feat(router.py): enable setting 'order' for a deployment in model list
2024-06-06 16:37:03 -07:00
Krish Dholakia
677e0255c8 Merge branch 'main' into litellm_cleanup_traceback 2024-06-06 16:32:08 -07:00
Krrish Dholakia
b590e6607c feat(scheduler.py): support redis caching for req. prioritization
enables req. prioritization to work across multiple instances of litellm
2024-06-06 14:19:21 -07:00
Krrish Dholakia
6cca5612d2 refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
a7dcf25722 feat(router.py): enable setting 'order' for a deployment in model list
Allows user to control which model gets called first in model group
2024-06-06 09:46:51 -07:00
Krrish Dholakia
f3d78532f9 feat(assistants/main.py): add assistants api streaming support 2024-06-04 16:30:35 -07:00
Krish Dholakia
7311e82f47 Merge pull request #3992 from BerriAI/litellm_router_default_request_timeout
fix(router.py): use `litellm.request_timeout` as default for router clients
2024-06-03 21:37:38 -07:00
Krish Dholakia
5ee3b0f30f Merge pull request #3996 from BerriAI/litellm_azure_assistants_api_support
feat(assistants/main.py): Azure Assistants API support
2024-06-03 21:05:03 -07:00
Krrish Dholakia
7163bce37b feat(assistants/main.py): Closes https://github.com/BerriAI/litellm/issues/3993 2024-06-03 18:47:05 -07:00
Krrish Dholakia
1de5235ba0 fix(router.py): use litellm.request_timeout as default for router clients 2024-06-03 14:19:53 -07:00
Krrish Dholakia
a019fd05e3 fix(router.py): fix should_retry logic for authentication errors 2024-06-03 13:12:00 -07:00
Ishaan Jaff
2ce5dc0dfd ci/cd run again 2024-06-01 21:19:32 -07:00
Ishaan Jaff
309a66692f fix test_rate_limit[usage-based-routing-True-3-2] 2024-06-01 21:18:23 -07:00
Ishaan Jaff
373a41ca6d fix async_function_with_retries 2024-06-01 19:00:22 -07:00
Ishaan Jaff
054456c50e Merge pull request #3963 from BerriAI/litellm_set_allowed_fail_policy
[FEAT]- set custom AllowedFailsPolicy on litellm.Router
2024-06-01 17:57:11 -07:00
Ishaan Jaff
fb49d036fb Merge pull request #3962 from BerriAI/litellm_return_num_rets_max_exceptions
[Feat] return `num_retries` and `max_retries` in exceptions
2024-06-01 17:48:38 -07:00
Ishaan Jaff
d4378143f1 fix current_attempt, num_retries not defined 2024-06-01 17:42:37 -07:00
Ishaan Jaff
eb203c051a feat - set custom AllowedFailsPolicy 2024-06-01 17:26:21 -07:00
Ishaan Jaff
cfc55b39a9 fix - return in LITELLM_EXCEPTION_TYPES 2024-06-01 17:05:33 -07:00
Ishaan Jaff
286d42a881 feat - add num retries and max retries in exception 2024-06-01 16:53:00 -07:00
Krrish Dholakia
7715267989 fix(router.py): simplify scheduler
move the scheduler poll queuing logic into the router class, making it easier to use
2024-06-01 16:09:57 -07:00
Krish Dholakia
8375e9621c Merge pull request #3954 from BerriAI/litellm_simple_request_prioritization
feat(scheduler.py): add request prioritization scheduler
2024-05-31 23:29:09 -07:00
Krrish Dholakia
381247a095 fix(router.py): fix param 2024-05-31 21:52:23 -07:00
Krrish Dholakia
e49325b234 fix(router.py): fix cooldown logic for usage-based-routing-v2 pre-call-checks 2024-05-31 21:32:01 -07:00
Krish Dholakia
08bae3185a Merge pull request #3936 from BerriAI/litellm_assistants_api_proxy
feat(proxy_server.py): add assistants api endpoints to proxy server
2024-05-31 18:43:22 -07:00
Ishaan Jaff
f569e61638 fix - model hub supported_openai_params 2024-05-31 07:27:21 -07:00
Krrish Dholakia
e2b34165e7 feat(proxy_server.py): add assistants api endpoints to proxy server 2024-05-30 22:44:43 -07:00
Krish Dholakia
d3a247bf20 Merge pull request #3928 from BerriAI/litellm_audio_speech_endpoint
feat(main.py): support openai tts endpoint
2024-05-30 17:30:42 -07:00
Krrish Dholakia
93166cdabf fix(openai.py): fix openai response for /audio/speech endpoint 2024-05-30 16:41:06 -07:00
Krrish Dholakia
32bfb685f5 fix(router.py): cooldown on 404 errors
https://github.com/BerriAI/litellm/issues/3884
2024-05-30 10:57:38 -07:00
Krrish Dholakia
1d18ca6a7d fix(router.py): security fix - don't show api key in invalid model setup error message 2024-05-29 16:14:57 -07:00
Krish Dholakia
e838bd1c79 Merge branch 'main' into litellm_batch_completions 2024-05-28 22:38:05 -07:00
Ishaan Jaff
9ab96e12ed fix - update abatch_completion docstring 2024-05-28 22:27:09 -07:00
Ishaan Jaff
473ec66b84 feat - router add abatch_completion 2024-05-28 22:19:33 -07:00
Krrish Dholakia
e3000504f9 fix(router.py): support batch completions fastest response streaming 2024-05-28 21:51:09 -07:00
Krrish Dholakia
1ebae6e7b0 fix(router.py): support comma-separated model list for batch completion fastest response 2024-05-28 21:34:37 -07:00
Krrish Dholakia
20106715d5 feat(proxy_server.py): enable batch completion fastest response calls on proxy
introduces new `fastest_response` flag for enabling the call
2024-05-28 20:09:31 -07:00
Krrish Dholakia
ecd182eb6a feat(router.py): support fastest response batch completion call
returns fastest response. cancels others.
2024-05-28 19:44:41 -07:00
Krish Dholakia
37f11162d2 Merge pull request #3847 from paneru-rajan/improve-validate-fallback-method
Improve validate-fallbacks method
2024-05-27 18:18:35 -07:00
Krrish Dholakia
67da24f144 fix(fix-'get_model_group_info'-to-return-a-default-value-if-unmapped-model-group): allows model hub to return all model groups 2024-05-27 13:53:01 -07:00
Ishaan Jaff
b5f883ab74 feat - show openai params on model hub ui 2024-05-27 08:49:51 -07:00
Krrish Dholakia
22b6b99b34 feat(proxy_server.py): expose new /model_group/info endpoint
returns model-group level info on supported params, max tokens, pricing, etc.
2024-05-26 14:07:35 -07:00
sujan100000
e02328c9f1 Improve validate-fallbacks method
* No need to check for fallback_params length
* Instead of asserting, used an if condition and raised ValueError
* Improved Error message
2024-05-26 19:09:07 +09:30
Krrish Dholakia
ae787645da fix(router.py): fix pre call check
only check if response_format supported by model, if pre-call check enabled
2024-05-24 20:09:15 -07:00
Krrish Dholakia
8dec87425e feat(slack_alerting.py): refactor region outage alerting to do model based alerting instead
Unable to extract the Azure region from the api base, so it makes sense to start with model-based alerting and then move to region-based alerting
2024-05-24 19:10:33 -07:00
Ishaan Jaff
7d2c76f640 fix test_filter_invalid_params_pre_call_check 2024-05-23 21:16:32 -07:00