llama-stack-mirror/llama_stack/providers/utils/inference

Latest commit cb6a5e2687 by slekkala1 (2025-10-21 12:21:06 -07:00):
fix: fix segfault in load model (#3879)
# What does this PR do?
Fixes a segfault when loading a model. The cc-vec integration crashed with a
segfault when used with the default embedding model on macOS
(`model_id: nomic-ai/nomic-embed-text-v1.5`, `provider_id: sentence-transformers`).
The crash report points at torch's OpenMP threading; constraining torch to a
single thread eliminates the crash.
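As a minimal sketch of the workaround (not necessarily the exact diff that
landed in `embedding_mixin.py`), the idea is to cap OpenMP and torch's
intra-op parallelism at one thread before the model is loaded:

```python
import os

# Sketch of the workaround described above: constrain OpenMP to a single
# thread. Environment variables must be set before `import torch`, since
# torch reads them when it initializes its thread pools.
os.environ.setdefault("OMP_NUM_THREADS", "1")

import torch
from sentence_transformers import SentenceTransformer

# Also cap torch's intra-op parallelism at runtime, in case OMP_NUM_THREADS
# was already set elsewhere.
torch.set_num_threads(1)

# The model id comes from the report above; trust_remote_code is required
# for this particular model's custom architecture.
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)
embedding = model.encode(["hello world"])
print(embedding.shape)
```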


## Test Plan
Tested with the cc-vec integration:
1. Start the server: `llama stack run starter`
2. Follow the setup in https://github.com/raghotham/cc-vec to set the
   environment variables, then run:
   `uv run cc-vec index --url-patterns "%.github.io" --vector-store-name "ml-research" --limit 50 --chunk-size 800 --overlap 400`
| File | Latest commit | Date |
|------|---------------|------|
| `__init__.py` | chore: enable pyupgrade fixes (#1806) | 2025-05-01 14:23:50 -07:00 |
| `embedding_mixin.py` | fix: fix segfault in load model (#3879) | 2025-10-21 12:21:06 -07:00 |
| `inference_store.py` | feat(stores)!: use backend storage references instead of configs (#3697) | 2025-10-20 13:20:09 -07:00 |
| `litellm_openai_mixin.py` | feat(api)!: support extra_body to embeddings and vector_stores APIs (#3794) | 2025-10-12 19:01:52 -07:00 |
| `model_registry.py` | feat: use SecretStr for inference provider auth credentials (#3724) | 2025-10-10 07:32:50 -07:00 |
| `openai_compat.py` | fix: Update watsonx.ai provider to use LiteLLM mixin and list all models (#3674) | 2025-10-08 07:29:43 -04:00 |
| `openai_mixin.py` | fix(openai_mixin): no yelling for model listing if API keys are not provided (#3826) | 2025-10-16 10:12:13 -07:00 |
| `prompt_adapter.py` | chore!: Safety api refactoring to use OpenAIMessageParam (#3796) | 2025-10-12 08:01:00 -07:00 |