Krrish Dholakia
01a1a8f731
fix(caching.py): dual cache async_batch_get_cache fix + testing
this fixes a bug in usage-based-routing-v2 that was caused by how the result was returned from the dual cache's async_batch_get_cache. it also adds unit tests for that function (and its sync equivalent)
2024-04-19 15:03:25 -07:00
Krrish Dholakia
0f95a824c4
feat(prometheus_services.py): emit proxy latency for successful llm api requests
uses a prometheus histogram for this
2024-04-18 16:04:35 -07:00
Krrish Dholakia
67d0d5e356
fix(prometheus_services.py): add better import error statement
2024-04-13 19:03:32 -07:00
Krrish Dholakia
bef24cd4ab
fix(caching.py): don't decode a string
2024-04-13 18:48:03 -07:00
Krrish Dholakia
9f42d15713
feat(prometheus_services.py): track when redis calls fail
2024-04-13 18:31:35 -07:00
Krrish Dholakia
4e81acf2c6
feat(prometheus_services.py): monitor health of proxy adjacent services (redis / postgres / etc.)
2024-04-13 18:15:02 -07:00