llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2026-01-02 03:14:30 +00:00

History

Derek Higgins 6434cdfdab fix: Run prompt_guard model in a seperate thread The GPU model usage blocks the CPU. Move it to its own thread. Also wrap in a lock to prevent multiple simultaneous run from exhausting the GPU. Closes: #1746 Signed-off-by: Derek Higgins <derekh@redhat.com>		2025-03-28 14:19:30 +00:00
..
__init__.py	chore: fix typing hints for get_provider_impl deps arguments (#1544 )	2025-03-11 10:07:28 -07:00
config.py	test: add unit test to ensure all config types are instantiable (#1601 )	2025-03-12 22:29:58 -07:00
prompt_guard.py	fix: Run prompt_guard model in a seperate thread	2025-03-28 14:19:30 +00:00