mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 04:04:14 +00:00
# What does this PR do?

## Test Plan

Ran a stress test on the chat completion endpoint locally, with 10 concurrent users over 3 minutes.

Before:

<img width="1440" height="201" alt="image" src="https://github.com/user-attachments/assets/24e0d580-186e-4e24-931e-2b936c5859b6" />

After:

<img width="1434" height="204" alt="image" src="https://github.com/user-attachments/assets/4b806d88-f822-41e9-b25a-018cc4bec866" />

(Will send scripts in a future PR.)
- apis
- cli
- core
- distributions
- models
- providers
- strong_typing
- testing
- ui
- __init__.py
- env.py
- log.py
- schema_utils.py