Replace MissingEmbeddingModelError with IBM Granite default

- Replace error with ibm-granite/granite-embedding-125m-english default
- Based on issue #2418 for commercial compatibility and better UX
- Update tests to verify default fallback behavior
- Update documentation to reflect new precedence rules
- Remove unused MissingEmbeddingModelError class
- Update tip section to clarify fallback behavior

Resolves review comment to use default instead of error.
This commit is contained in:
skamenan7 2025-08-04 13:01:10 -04:00
parent 380bd1bb7a
commit 8e2675f50c
4 changed files with 13 additions and 16 deletions

View file

@ -818,7 +818,7 @@ Precedence rules at runtime:
1. If `embedding_model` is explicitly passed in an API call, that value is used.
2. Otherwise the value in `vector_store_config.default_embedding_model` is used.
3. If neither is available the server will raise `MissingEmbeddingModelError` at store-creation time so mis-configuration is caught early.
3. If neither is available the server will fall back to the system default (ibm-granite/granite-embedding-125m-english).
#### Environment variables
@ -834,4 +834,4 @@ export LLAMA_STACK_DEFAULT_EMBEDDING_MODEL="sentence-transformers/all-MiniLM-L6-
llama stack run --config run.yaml
```
> Tip: If you omit `vector_store_config` entirely you **must** either pass `embedding_model=` on every `create_vector_store` call or set `LLAMA_STACK_DEFAULT_EMBEDDING_MODEL` in the environment, otherwise the server will refuse to create a vector store.
> Tip: If you omit `vector_store_config` entirely and don't set `LLAMA_STACK_DEFAULT_EMBEDDING_MODEL`, the system will fall back to the default `ibm-granite/granite-embedding-125m-english` model with 384 dimensions for vector store creation.