chore: increase livenessProbe and readinessProbe timeouts

- our litellm deployment keeps restarting because the pods fail their readiness and liveness probes
- the default probe timeout in k8s is 1s, and the pod isn't always able to respond in time
- it seems to happen particularly often when the process flushes spend logs to the database
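
For context, a rough sketch of what the probes look like after this change, with the relevant Kubernetes defaults written out. Only `timeoutSeconds` is set by this commit; `periodSeconds` and `failureThreshold` are shown at their Kubernetes default values on the assumption that the chart does not override them elsewhere.

```yaml
# Sketch only: timeoutSeconds comes from this change; the other values are
# Kubernetes defaults, assumed not to be overridden elsewhere in the chart.
livenessProbe:
  httpGet:
    path: /health/liveliness
    port: http
  timeoutSeconds: 10    # previously unset, so the 1s default applied
  periodSeconds: 10     # Kubernetes default
  failureThreshold: 3   # Kubernetes default: restart after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /health/readiness
    port: http
  timeoutSeconds: 10    # previously unset, so the 1s default applied
  periodSeconds: 10     # Kubernetes default
  failureThreshold: 3   # Kubernetes default: marked unready after 3 consecutive failures
```

With a 1s timeout, three slow responses in a row (for example while spend logs are being flushed) would be enough to trigger a restart; a 10s timeout gives each check more room before it counts as a failure.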
2025-04-24 18:24:20 +00:00 · 2025-04-14 20:58:48 -04:00
commit 522da2be8a
parent db857c74d4
1 changed file with 2 additions and 0 deletions
--- a/deploy/charts/litellm-helm/templates/deployment.yaml
+++ b/deploy/charts/litellm-helm/templates/deployment.yaml
@@ -120,10 +120,12 @@ spec:
            httpGet:
              path: /health/liveliness
              port: http
+            timeoutSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/readiness
              port: http
+            timeoutSeconds: 10
          # Give the container time to start up.  Up to 5 minutes (10 * 30 seconds)
          startupProbe:
            httpGet: