Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-10-18 07:18:53 +00:00)
In replay mode, inference is instantaneous; we don't need to wait 15 seconds for the batch to be done. Changing the polling loop to use exponential backoff makes things complete much faster.
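The change described above amounts to replacing a fixed polling interval with exponential backoff while waiting for a batch to reach a terminal state. A minimal sketch of that pattern, assuming an OpenAI-style client that exposes `batches.retrieve(batch_id)` returning an object with a `status` field (the helper name and parameters here are illustrative, not the repo's actual code):

```python
import time


def wait_for_batch(client, batch_id, initial_delay=0.1, max_delay=15.0, timeout=300.0):
    """Poll a batch until it reaches a terminal state, backing off exponentially.

    Starting with a short delay means replayed (instantaneous) batches are
    detected almost immediately, while live batches still avoid hammering
    the server because the delay grows toward max_delay.
    """
    delay = initial_delay
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        batch = client.batches.retrieve(batch_id)
        if batch.status in ("completed", "failed", "cancelled", "expired"):
            return batch
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # exponential backoff, capped
    raise TimeoutError(f"batch {batch_id} did not finish within {timeout}s")
```

With a fixed 15-second sleep, even a replayed batch takes at least one full interval; with the backoff above, the first check happens after 0.1 s, so replay-mode tests finish almost instantly.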
recordings
__init__.py
conftest.py
test_batches.py
test_batches_errors.py
test_batches_idempotency.py