chore: increase livenessProbe and readinessProbe timeouts

- our litellm deployment keeps restarting because the pods fail readiness and liveness probes
- the default probe timeout in k8s is 1s, and the pod isn't always able to respond in time
- it seems to happen particularly often when the process flushes spend logs to the database
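
For reference, a minimal sketch of the liveness probe with the new timeout, with the other timing fields written out. Only the path, port, and timeoutSeconds come from this change. The periodSeconds and failureThreshold values are the Kubernetes defaults, shown on the assumption that the chart does not override them.

```yaml
livenessProbe:
  httpGet:
    path: /health/liveliness
    port: http
  timeoutSeconds: 10    # set by this commit; the Kubernetes default is 1
  periodSeconds: 10     # Kubernetes default (assumed not overridden by the chart)
  failureThreshold: 3   # Kubernetes default (assumed not overridden by the chart)
```

Under those assumed defaults, the kubelet restarts the container after three consecutive probes that fail or time out, so a slow response during a spend log flush now needs to exceed 10s rather than 1s to count as a failure.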
Ashwin Madavan 2025-04-14 20:58:48 -04:00 committed by GitHub
parent db857c74d4
commit 522da2be8a


@@ -120,10 +120,12 @@ spec:
   httpGet:
     path: /health/liveliness
     port: http
+  timeoutSeconds: 10
 readinessProbe:
   httpGet:
     path: /health/readiness
     port: http
+  timeoutSeconds: 10
 # Give the container time to start up. Up to 5 minutes (10 * 30 seconds)
 startupProbe:
   httpGet: