llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-04 02:03:44 +00:00

ehhuang 0b5a794c27 fix: telemetry logger spams when queue is full (#3070)	2025-08-08 13:47:36 -07:00
Test Plan: Ran a stress test on the chat completion endpoint locally, for 10 concurrent users over 3 minutes. Before: [screenshot] After: [screenshot] (Will send scripts in a future PR.)
..
inline	feat: Add moderations create api (#3020)	2025-08-06 13:51:23 -07:00
registry	feat: Add openAI compatible APIs to Qdrant (#2465)	2025-08-01 00:41:34 -04:00
remote	docs: fix the docs for NVIDIA Inference Provider (#3055)	2025-08-08 11:27:55 +02:00
utils	fix: telemetry logger spams when queue is full (#3070)	2025-08-08 13:47:36 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
datatypes.py	feat: create unregister shield API endpoint in Llama Stack (#2853)	2025-08-05 07:33:46 -07:00