mirror of https://github.com/BerriAI/litellm.git (synced 2025-04-27 11:43:54 +00:00)
if an endpoint is slow - its completion time might not be updated till the call is completed. This prevents us from overloading those endpoints, in a simple way.

| File |
|---|
| least_busy.py |
| lowest_latency.py |
| lowest_tpm_rpm.py |
| lowest_tpm_rpm_v2.py |
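
The commit note above describes a stale-data problem: a slow endpoint's completion time is only recorded once its call finishes, so a router that ranks endpoints by recorded latency can keep sending traffic to an endpoint that is currently stuck. The commit's actual change is not visible in this listing. The sketch below shows one simple way to sidestep the problem, counting in-flight calls instead of waiting for completion times. `InFlightTracker` and the deployment names `fast` and `slow` are hypothetical and are not taken from the files listed here.

```python
class InFlightTracker:
    """Hypothetical helper (not LiteLLM's API): counts calls that have
    started but not yet finished, per deployment."""

    def __init__(self, deployments):
        # Every deployment starts with zero calls in flight.
        self.in_flight = {name: 0 for name in deployments}

    def pick(self):
        # Fewest in-flight calls wins; ties go to the earliest-listed deployment.
        return min(self.in_flight, key=self.in_flight.get)

    def start(self, name):
        # Record that a call to this deployment has begun.
        self.in_flight[name] += 1

    def finish(self, name):
        # Record that a call to this deployment has completed.
        self.in_flight[name] -= 1


tracker = InFlightTracker(["fast", "slow"])
picks = []

# Call 1: both deployments are idle, so the first listed one is chosen.
choice = tracker.pick()
tracker.start(choice)
picks.append(choice)

# Call 2: "fast" is busy, so "slow" is chosen. Its call never finishes below.
choice = tracker.pick()
tracker.start(choice)
picks.append(choice)

# The call on "fast" completes; the call on "slow" is still pending.
tracker.finish("fast")

# Calls 3 and 4: "slow" still has a pending call, so it is skipped both times.
for _ in range(2):
    choice = tracker.pick()
    tracker.start(choice)
    picks.append(choice)
    tracker.finish(choice)

print(picks)
```

Because the counter is updated when a call starts rather than when it ends, the pending call on `slow` is visible to the selection step immediately, without any latency measurement.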