Krrish Dholakia
1ca2439eb7
fix(lowest_tpm_rpm_v2.py): use a combined tpm+rpm query in async get cache, to reduce redis client calls in high traffic
2024-04-20 16:13:11 -07:00
Krish Dholakia
a9dc93e860
Merge branch 'main' into litellm_ssl_caching_fix
2024-04-19 17:20:27 -07:00
Krrish Dholakia
b3a8c2885b
test(test_prometheus_services.py): fix testing to handle caching ping in init
2024-04-19 16:15:29 -07:00
Krrish Dholakia
5da934099f
fix(caching.py): dual cache async_batch_get_cache fix + testing
this fixes a bug in usage-based-routing-v2 caused by how the result was being returned from dual cache async_batch_get_cache. it also adds unit testing for that function (and its sync equivalent)
2024-04-19 15:03:25 -07:00
Krrish Dholakia
7f5bcf38b7
feat(prometheus_services.py): emit proxy latency for successful llm api requests
uses prometheus histogram for this
2024-04-18 16:04:35 -07:00
Krrish Dholakia
4455f0008e
fix(prometheus_services.py): add better import error statement
2024-04-13 19:03:32 -07:00
Krrish Dholakia
b0fc2b342d
fix(caching.py): don't decode a string
2024-04-13 18:48:03 -07:00
Krrish Dholakia
218976c55d
feat(prometheus_services.py): track when redis calls fail
2024-04-13 18:31:35 -07:00
Krrish Dholakia
866259f95f
feat(prometheus_services.py): monitor health of proxy adjacent services (redis / postgres / etc.)
2024-04-13 18:15:02 -07:00