litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-27 11:43:54 +00:00

Author	SHA1	Message	Date
Krrish Dholakia	1ca2439eb7	fix(lowest_tpm_rpm_v2.py): use a combined tpm+rpm query in async get cache, to reduce redis client calls in high traffic	2024-04-20 16:13:11 -07:00
Krish Dholakia	a9dc93e860	Merge branch 'main' into litellm_ssl_caching_fix	2024-04-19 17:20:27 -07:00
Krrish Dholakia	b3a8c2885b	test(test_prometheus_services.py): fix testing to handle caching ping in init	2024-04-19 16:15:29 -07:00
Krrish Dholakia	5da934099f	fix(caching.py): dual cache async_batch_get_cache fix + testing. This fixes a bug in usage-based-routing-v2 caused by how the result was being returned from dual cache async_batch_get_cache. It also adds unit testing for that function (and its sync equivalent).	2024-04-19 15:03:25 -07:00
Krrish Dholakia	7f5bcf38b7	feat(prometheus_services.py): emit proxy latency for successful llm api requests. Uses prometheus histogram for this.	2024-04-18 16:04:35 -07:00
Krrish Dholakia	4455f0008e	fix(prometheus_services.py): add better import error statement	2024-04-13 19:03:32 -07:00
Krrish Dholakia	b0fc2b342d	fix(caching.py): don't decode a string	2024-04-13 18:48:03 -07:00
Krrish Dholakia	218976c55d	feat(prometheus_services.py): track when redis calls fail	2024-04-13 18:31:35 -07:00
Krrish Dholakia	866259f95f	feat(prometheus_services.py): monitor health of proxy adjacent services (redis / postgres / etc.)	2024-04-13 18:15:02 -07:00
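
The two most recent caching commits (1ca2439eb7 and 5da934099f) describe fetching TPM and RPM counters through one batched cache read instead of separate Redis calls. The sketch below illustrates that general idea only; it is not the code from lowest_tpm_rpm_v2.py or caching.py. `FakeRedis`, `DualCacheSketch`, `usage_for`, and the `deployment:tpm:minute` key layout are all hypothetical names chosen for illustration.

```python
import asyncio
from typing import Any, Dict, List, Optional


class FakeRedis:
    """Hypothetical stand-in for a Redis client; counts round trips."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self.data = data
        self.calls = 0

    async def mget(self, keys: List[str]) -> List[Optional[Any]]:
        self.calls += 1  # one round trip no matter how many keys are requested
        return [self.data.get(k) for k in keys]


class DualCacheSketch:
    """Assumed design: in-memory dict in front of a shared Redis-like store."""

    def __init__(self, redis: FakeRedis) -> None:
        self.memory: Dict[str, Any] = {}
        self.redis = redis

    async def batch_get(self, keys: List[str]) -> List[Optional[Any]]:
        # Serve what we can from memory, keeping results aligned with `keys`.
        result = [self.memory.get(k) for k in keys]
        missing = [i for i, v in enumerate(result) if v is None]
        if missing:
            # Fetch every missing key in a single batched call.
            fetched = await self.redis.mget([keys[i] for i in missing])
            for i, value in zip(missing, fetched):
                if value is not None:
                    self.memory[keys[i]] = value
                    result[i] = value
        return result


async def usage_for(cache: DualCacheSketch, deployment: str, minute: str) -> Dict[str, int]:
    # Hypothetical key layout; one combined query covers both counters.
    tpm_key = f"{deployment}:tpm:{minute}"
    rpm_key = f"{deployment}:rpm:{minute}"
    tpm, rpm = await cache.batch_get([tpm_key, rpm_key])
    return {"tpm": tpm or 0, "rpm": rpm or 0}
```

With this layout, a call such as `asyncio.run(usage_for(cache, "d1", "12-00"))` costs at most one round trip to the shared store, and a repeat call for the same minute is served from memory. A real implementation would also need expiry on the in-memory entries so that counters do not go stale.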