21 commits

Author SHA1 Message Date
Ishaan Jaff
8f155327f6 [Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698)
* move cooldown logic to its own helper

* add new track deployment metrics folder

* increment success, fails for deployment in current minute

* fix cooldown logic

* fix test_aaarouter_dynamic_cooldown_message_retry_time

* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls

* clean up get from deployment test

* fix _async_get_healthy_deployments

* add mock InternalServerError

* test deployment failing 25% requests

* add test_high_traffic_cooldowns_one_bad_deployment

* fix vertex load test

* add test for rate limit error models in cool down

* change default cooldown time

* fix cooldown message time

* fix cooldown on 429 error

* fix doc string for _should_cooldown_deployment

* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
Krish Dholakia
dec53961f7 LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache

fixes issue where ttl for redis client was not being set on increment_cache

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(router.py): support adding retry policy + allowed fails policy via config.yaml

* fix(router.py): don't cooldown single deployments

No point, as there's no other deployment to load-balance with.

* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens

Closes https://github.com/BerriAI/litellm/issues/5605

* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs

* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set

Fixes issue where key logging would not be set if team metadata was not none

* fix(secret_managers/main.py): load environment variables correctly

Fixes issue where os.environ/ was not being loaded correctly

* test(test_router.py): fix test

* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek

* test: fix tests

* test: fix test

* test: fix test

* test: fix test

* test: fix test
2024-09-11 22:36:06 -07:00
Ishaan Jaff
581de720f9 fix router retries tests 2024-08-20 13:02:24 -07:00
Ishaan Jaff
fb16ff2335 fix don't retry errors when no healthy deployments available 2024-08-20 12:17:05 -07:00
Ishaan Jaff
5e2f962ba3 test + never retry on 404 errors 2024-08-20 11:59:43 -07:00
Krrish Dholakia
6441a1c398 test(test_amazing_vertex_completion.py): add retries for 'Content has no parts.' error in vertex test 2024-06-03 14:06:36 -07:00
Krrish Dholakia
96120ab2c5 fix(router.py): fix should_retry logic for authentication errors 2024-06-03 13:12:00 -07:00
Ishaan Jaff
e149ca73f6 Merge pull request #3963 from BerriAI/litellm_set_allowed_fail_policy
[FEAT]- set custom AllowedFailsPolicy on litellm.Router
2024-06-01 17:57:11 -07:00
Ishaan Jaff
fb9a174462 feat - set allowed fails policy 2024-06-01 17:39:44 -07:00
Ishaan Jaff
5236701d21 add test 2024-06-01 17:03:53 -07:00
Krrish Dholakia
91aa2a705e fix(router.py): fix should_retry logic 2024-05-31 23:27:43 -07:00
Ishaan Jaff
dde0366159 test - router retry policy 2024-05-11 19:58:17 -07:00
Ishaan Jaff
6adaaede10 fix - unit tests for router retries 2024-05-11 19:10:33 -07:00
Ishaan Jaff
06e0c4c171 test - unit tests for time to sleep when there are rate limit errors 2024-05-11 18:13:28 -07:00
Ishaan Jaff
58383216ee tests - unit test router retry logic 2024-05-11 17:31:01 -07:00
Krrish Dholakia
9f930f3cd8 fix(router.py): fix router retry policy logic 2024-05-04 23:02:50 -07:00
Ishaan Jaff
059e2f0168 test - test setting retry policies per model groups 2024-05-04 20:40:56 -07:00
Ishaan Jaff
805ea96a1b router set dynamic retry policies 2024-05-04 18:13:43 -07:00
Ishaan Jaff
d9b44b2411 test - router retry policy 2024-05-04 17:30:30 -07:00
Ishaan Jaff
1dc788a209 test router - retry policy 2024-05-04 17:06:34 -07:00
Krrish Dholakia
a12878b0f8 fix(router.py): cooldown deployments, for 401 errors 2024-04-30 17:54:00 -07:00
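
Illustrative note on 8f155327f6 and dec53961f7: the commit messages describe counting successes and failures per deployment in the current minute, cooling a deployment down on a failure percentage rather than a fixed allowed-fails count, and never cooling down a single deployment. The sketch below shows that idea only. It is not the LiteLLM implementation: `CooldownTracker`, `MinuteCounters`, `record`, `should_cooldown`, and the 0.5 threshold are all hypothetical names and values chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class MinuteCounters:
    """Hypothetical success/failure counts for one deployment in one minute."""
    successes: int = 0
    failures: int = 0


@dataclass
class CooldownTracker:
    # Assumed threshold for illustration; the commits do not state the real value.
    failure_threshold: float = 0.5
    # Keyed by (deployment_id, minute); missing buckets start at zero.
    counters: dict = field(default_factory=lambda: defaultdict(MinuteCounters))

    def record(self, deployment_id: str, minute: int, success: bool) -> None:
        """Increment the success or failure count for the given minute."""
        bucket = self.counters[(deployment_id, minute)]
        if success:
            bucket.successes += 1
        else:
            bucket.failures += 1

    def should_cooldown(self, deployment_id: str, minute: int,
                        num_deployments: int) -> bool:
        """Return True if the failure rate this minute exceeds the threshold."""
        # A lone deployment is never cooled down: there is nothing to fall back to.
        if num_deployments <= 1:
            return False
        bucket = self.counters.get((deployment_id, minute))
        if bucket is None:
            return False
        total = bucket.successes + bucket.failures
        if total == 0:
            return False
        return bucket.failures / total > self.failure_threshold


tracker = CooldownTracker()
for ok in (True, True, True, False):      # deployment "a": 25% failures
    tracker.record("a", minute=0, success=ok)
for ok in (False, False, False, True):    # deployment "b": 75% failures
    tracker.record("b", minute=0, success=ok)

a_cools = tracker.should_cooldown("a", minute=0, num_deployments=2)
b_cools = tracker.should_cooldown("b", minute=0, num_deployments=2)
b_alone = tracker.should_cooldown("b", minute=0, num_deployments=1)
```

Under these assumed numbers, a deployment failing 25% of requests stays in rotation, one failing 75% is cooled down, and the same failing deployment is kept if it is the only one available. A percentage threshold scales with traffic, whereas a fixed allowed-fails count would trip sooner on a busy deployment than on a quiet one.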