Commit graph

501 commits

Author SHA1 Message Date
Krrish Dholakia
43991afc34 feat(scheduler.py): support redis caching for req. prioritization
enables request prioritization to work across multiple instances of litellm
2024-06-06 14:19:21 -07:00
Krrish Dholakia
e391e30285 refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
005128addc feat(router.py): enable setting 'order' for a deployment in model list
Allows the user to control which model gets called first in a model group
2024-06-06 09:46:51 -07:00
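The `order` field lets one deployment in a model group be tried before its siblings. A rough sketch of the selection logic (illustrative, not litellm's actual routing code; deployments without an `order` are assumed here to sort last):

```python
def pick_deployments(deployments: list[dict]) -> list[dict]:
    """Return deployments sorted so lower 'order' values are tried first.

    Deployments with no 'order' fall back to an infinite sentinel and keep
    their original relative position (sorted() is stable).
    """
    return sorted(deployments, key=lambda d: d.get("order", float("inf")))

model_group = [
    {"model": "azure/gpt-4", "order": 2},
    {"model": "openai/gpt-4", "order": 1},
    {"model": "bedrock/claude"},  # no order -> tried last
]
print([d["model"] for d in pick_deployments(model_group)])
# ['openai/gpt-4', 'azure/gpt-4', 'bedrock/claude']
```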
Krrish Dholakia
20cb525a5c feat(assistants/main.py): add assistants api streaming support 2024-06-04 16:30:35 -07:00
Krish Dholakia
73ae4860c0 Merge pull request #3992 from BerriAI/litellm_router_default_request_timeout
fix(router.py): use `litellm.request_timeout` as default for router clients
2024-06-03 21:37:38 -07:00
Krish Dholakia
127d1457de Merge pull request #3996 from BerriAI/litellm_azure_assistants_api_support
feat(assistants/main.py): Azure Assistants API support
2024-06-03 21:05:03 -07:00
Krrish Dholakia
a2ba63955a feat(assistants/main.py): Closes https://github.com/BerriAI/litellm/issues/3993 2024-06-03 18:47:05 -07:00
Krrish Dholakia
ae52e7559e fix(router.py): use litellm.request_timeout as default for router clients 2024-06-03 14:19:53 -07:00
Krrish Dholakia
96120ab2c5 fix(router.py): fix should_retry logic for authentication errors 2024-06-03 13:12:00 -07:00
Ishaan Jaff
0acb6e5180 ci/cd run again 2024-06-01 21:19:32 -07:00
Ishaan Jaff
2d1aaf5cf7 fix test_rate_limit[usage-based-routing-True-3-2] 2024-06-01 21:18:23 -07:00
Ishaan Jaff
ad920be3bf fix async_function_with_retries 2024-06-01 19:00:22 -07:00
Ishaan Jaff
e149ca73f6 Merge pull request #3963 from BerriAI/litellm_set_allowed_fail_policy
[FEAT]- set custom AllowedFailsPolicy on litellm.Router
2024-06-01 17:57:11 -07:00
Ishaan Jaff
dd25d83087 Merge pull request #3962 from BerriAI/litellm_return_num_rets_max_exceptions
[Feat] return `num_retries` and `max_retries` in exceptions
2024-06-01 17:48:38 -07:00
Ishaan Jaff
728fead32c fix current_attempt, num_retries not defined 2024-06-01 17:42:37 -07:00
Ishaan Jaff
a11175c05b feat - set custom AllowedFailsPolicy 2024-06-01 17:26:21 -07:00
Ishaan Jaff
a485b19215 fix - return in LITELLM_EXCEPTION_TYPES 2024-06-01 17:05:33 -07:00
Ishaan Jaff
2341d99bdc feat - add num retries and max retries in exception 2024-06-01 16:53:00 -07:00
Krrish Dholakia
4ffbd80584 fix(router.py): simplify scheduler
move the scheduler poll queuing logic into the router class, making it easier to use
2024-06-01 16:09:57 -07:00
Krish Dholakia
1529f665cc Merge pull request #3954 from BerriAI/litellm_simple_request_prioritization
feat(scheduler.py): add request prioritization scheduler
2024-05-31 23:29:09 -07:00
Krrish Dholakia
9a3789ce69 fix(router.py): fix param 2024-05-31 21:52:23 -07:00
Krrish Dholakia
6221fabecf fix(router.py): fix cooldown logic for usage-based-routing-v2 pre-call-checks 2024-05-31 21:32:01 -07:00
Krish Dholakia
c049b6b4af Merge pull request #3936 from BerriAI/litellm_assistants_api_proxy
feat(proxy_server.py): add assistants api endpoints to proxy server
2024-05-31 18:43:22 -07:00
Ishaan Jaff
f6617c94e3 fix - model hub supported_openai_params 2024-05-31 07:27:21 -07:00
Krrish Dholakia
2fdf4a7bb4 feat(proxy_server.py): add assistants api endpoints to proxy server 2024-05-30 22:44:43 -07:00
Krish Dholakia
73e3dba2f6 Merge pull request #3928 from BerriAI/litellm_audio_speech_endpoint
feat(main.py): support openai tts endpoint
2024-05-30 17:30:42 -07:00
Krrish Dholakia
eb159b64e1 fix(openai.py): fix openai response for /audio/speech endpoint 2024-05-30 16:41:06 -07:00
Krrish Dholakia
66e08cac9b fix(router.py): cooldown on 404 errors
https://github.com/BerriAI/litellm/issues/3884
2024-05-30 10:57:38 -07:00
Krrish Dholakia
482929bece fix(router.py): security fix - don't show api key in invalid model setup error message 2024-05-29 16:14:57 -07:00
Krish Dholakia
4fd3994b4e Merge branch 'main' into litellm_batch_completions 2024-05-28 22:38:05 -07:00
Ishaan Jaff
17c6ea2272 fix - update abatch_completion docstring 2024-05-28 22:27:09 -07:00
Ishaan Jaff
aca5118a83 feat - router add abatch_completion 2024-05-28 22:19:33 -07:00
Krrish Dholakia
98ebcad52d fix(router.py): support batch completions fastest response streaming 2024-05-28 21:51:09 -07:00
Krrish Dholakia
012bde0b07 fix(router.py): support comma-separated model list for batch completion fastest response 2024-05-28 21:34:37 -07:00
Krrish Dholakia
792b25c772 feat(proxy_server.py): enable batch completion fastest response calls on proxy
introduces a new `fastest_response` flag for enabling the call
2024-05-28 20:09:31 -07:00
Krrish Dholakia
3676c00235 feat(router.py): support fastest response batch completion call
returns the fastest response; cancels the others
2024-05-28 19:44:41 -07:00
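"Return the fastest response, cancel the others" is a race across deployments. A minimal sketch with `asyncio.wait(..., return_when=FIRST_COMPLETED)` (simulated latencies and names are illustrative, not litellm's implementation):

```python
import asyncio

async def fake_completion(model: str, latency: float) -> str:
    # Stand-in for a real model call with a simulated latency.
    await asyncio.sleep(latency)
    return f"reply from {model}"

async def fastest_response(deployments: dict[str, float]) -> str:
    # Race all deployments; return the first reply and cancel the rest.
    tasks = [
        asyncio.create_task(fake_completion(model, latency))
        for model, latency in deployments.items()
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # abandon the losers' in-flight work
    return done.pop().result()

print(asyncio.run(fastest_response({"slow-model": 0.2, "fast-model": 0.01})))
# reply from fast-model
```

Cancelling the pending tasks is what keeps the batch call from paying for every deployment's full latency.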
Krish Dholakia
01dc798876 Merge pull request #3847 from paneru-rajan/improve-validate-fallback-method
Improve validate-fallbacks method
2024-05-27 18:18:35 -07:00
Krrish Dholakia
23b28601b7 fix(fix-'get_model_group_info'-to-return-a-default-value-if-unmapped-model-group): allows model hub to return all model groups 2024-05-27 13:53:01 -07:00
Ishaan Jaff
69ea7d57fb feat - show openai params on model hub ui 2024-05-27 08:49:51 -07:00
Krrish Dholakia
8e9a3fef81 feat(proxy_server.py): expose new /model_group/info endpoint
returns model-group level info on supported params, max tokens, pricing, etc.
2024-05-26 14:07:35 -07:00
sujan100000
45dd4d37d0 Improve validate-fallbacks method
* No need to check for fallback_params length
* Instead of asserting, used an if condition and raised ValueError
* Improved error message
2024-05-26 19:09:07 +09:30
Krrish Dholakia
cd34d00d80 fix(router.py): fix pre call check
only check whether response_format is supported by the model, if the pre-call check is enabled
2024-05-24 20:09:15 -07:00
Krrish Dholakia
4536ed6f6e feat(slack_alerting.py): refactor region outage alerting to do model based alerting instead
Unable to extract the Azure region from the api base; makes sense to start with model-based alerting and then move to region-based alerting
2024-05-24 19:10:33 -07:00
Ishaan Jaff
84f8ead4a1 fix test_filter_invalid_params_pre_call_check 2024-05-23 21:16:32 -07:00
Krrish Dholakia
c50074a0b7 feat(ui/model_dashboard.tsx): add databricks models via admin ui 2024-05-23 20:28:54 -07:00
Krrish Dholakia
c989b92801 feat(router.py): Fixes https://github.com/BerriAI/litellm/issues/3769 2024-05-21 17:24:51 -07:00
Krish Dholakia
c0e43a7296 Merge pull request #3412 from sumanth13131/usage-based-routing-ttl-on-cache
usage-based-routing-ttl-on-cache
2024-05-21 07:58:41 -07:00
Ishaan Jaff
ef9372ce00 fix add doc string for abatch_completion_one_model_multiple_requests 2024-05-20 17:51:08 -07:00
Ishaan Jaff
13c787f9b5 feat - add abatch_completion_one_model_multiple_requests 2024-05-20 17:47:25 -07:00
Ishaan Jaff
7e6c9274fc Merge branch 'main' into litellm_standardize_slack_exception_msg_format 2024-05-20 16:39:41 -07:00