llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-31 16:40:00 +00:00

Derek Higgins 6434cdfdab fix: Run prompt_guard model in a separate thread. The GPU model usage blocks the CPU. Move it to its own thread. Also wrap it in a lock to prevent multiple simultaneous runs from exhausting the GPU. Closes: #1746 Signed-off-by: Derek Higgins <derekh@redhat.com>		2025-03-28 14:19:30 +00:00
code_scanner	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
llama_guard	feat(agent): support multiple tool groups (#1556)	2025-03-17 22:13:09 -07:00
prompt_guard	fix: Run prompt_guard model in a separate thread	2025-03-28 14:19:30 +00:00
__init__.py	add missing inits	2024-11-08 17:54:24 -08:00