llama-stack-mirror/llama_stack/providers
Derek Higgins 5bbca56cfc
fix: Make SentenceTransformer embedding operations non-blocking (#3335)
- Wrap model loading with asyncio.to_thread() to prevent blocking during
model download/initialization
- Wrap encoding operations with asyncio.to_thread() to run in background
thread
- Convert _load_sentence_transformer_model() to an async method

This ensures the async event loop remains responsive during embedding
operations.

Closes: #3332

Signed-off-by: Derek Higgins <derekh@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-09-04 13:58:41 -04:00
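
A minimal sketch of the pattern this commit describes, assuming Python 3.9+ (for asyncio.to_thread) and the sentence-transformers package. The SentenceTransformerEmbedder class and its embed() method are illustrative placeholders, not the provider's actual interface; only _load_sentence_transformer_model() mirrors a name from the commit message above.

```python
from __future__ import annotations

import asyncio

from sentence_transformers import SentenceTransformer


class SentenceTransformerEmbedder:
    """Illustrative wrapper; the real llama_stack provider class differs."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self._model: SentenceTransformer | None = None

    async def _load_sentence_transformer_model(self) -> SentenceTransformer:
        # Model download/initialization can take many seconds; run it in a
        # worker thread so the event loop stays responsive.
        if self._model is None:
            self._model = await asyncio.to_thread(SentenceTransformer, self.model_id)
        return self._model

    async def embed(self, texts: list[str]) -> list[list[float]]:
        model = await self._load_sentence_transformer_model()
        # encode() is CPU-bound; offload it to a thread as well so other
        # coroutines can run while embeddings are computed.
        embeddings = await asyncio.to_thread(model.encode, texts)
        return [e.tolist() for e in embeddings]
```

asyncio.to_thread() hands the callable to the default thread-pool executor, so the blocking download and the native encode work happen off the event loop thread, which is the responsiveness improvement the commit targets.
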
inline/       refactor: use generic WeightedInMemoryAggregator for hybrid search in SQLiteVecIndex (#3303)  2025-09-02 10:38:35 -07:00
registry/     chore(python-deps): replace ibm_watson_machine_learning with ibm_watsonx_ai (#3302)  2025-09-03 11:33:35 +02:00
remote/       feat(tests): auto-merge all model list responses and unify recordings (#3320)  2025-09-03 11:33:03 -07:00
utils/        fix: Make SentenceTransformer embedding operations non-blocking (#3335)  2025-09-04 13:58:41 -04:00
__init__.py   API Updates (#73)  2024-09-17 19:51:35 -07:00
datatypes.py  feat: create unregister shield API endpoint in Llama Stack (#2853)  2025-08-05 07:33:46 -07:00