mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-03 18:00:36 +00:00
# What does this PR do?

## Test Plan

Ran a stress test on the chat completion endpoint locally.

For 10 concurrent users over 3 minutes:

Before:

<img width="1440" height="201" alt="image" src="https://github.com/user-attachments/assets/24e0d580-186e-4e24-931e-2b936c5859b6" />

After:

<img width="1434" height="204" alt="image" src="https://github.com/user-attachments/assets/4b806d88-f822-41e9-b25a-018cc4bec866" />

(Will send scripts in a future PR.)

| | | |
|---|---|---|
| apis | ||
| cli | ||
| core | ||
| distributions | ||
| models | ||
| providers | ||
| strong_typing | ||
| testing | ||
| ui | ||
| __init__.py | ||
| env.py | ||
| log.py | ||
| schema_utils.py | ||