feat(vector-io): implement global default embedding model configuration (Issue #2729)

- Add VectorStoreConfig with global default_embedding_model and default_embedding_dimension - Support environment variables LLAMA_STACK_DEFAULT_EMBEDDING_MODEL and LLAMA_STACK_DEFAULT_EMBEDDING_DIMENSION - Implement precedence: explicit model > global default > clear error (no fallback) - Update VectorIORouter with _resolve_embedding_model() precedence logic - Remove non-deterministic 'first model in run.yaml' fallback behavior - Add vector_store_config to StackRunConfig and all distribution templates - Include comprehensive unit tests for config loading and router precedence - Update documentation with configuration examples and usage patterns - Fix error messages to include 'Failed to' prefix per coding standards Resolves deterministic vector store creation by eliminating unpredictable fallbacks and providing clear configuration options at the stack level.
2025-12-08 19:10:56 +00:00 · 2025-07-25 17:06:43 -04:00 · 2025-07-25 17:06:43 -04:00 · 17fbd21c0d
commit 17fbd21c0d
parent 8422bd102a
7 changed files with 243 additions and 8 deletions
--- a/llama_stack/distributions/watsonx/build.yaml
+++ b/llama_stack/distributions/watsonx/build.yaml
@ -39,6 +39,9 @@ distribution_spec:
    - provider_type: remote::tavily-search
    - provider_type: inline::rag-runtime
    - provider_type: remote::model-context-protocol
+vector_store_config:
+  default_embedding_model: ${env.LLAMA_STACK_DEFAULT_EMBEDDING_MODEL:=all-MiniLM-L6-v2}
+  default_embedding_dimension: ${env.LLAMA_STACK_DEFAULT_EMBEDDING_DIMENSION:=384}
 image_type: venv
 additional_pip_packages:
 - sqlalchemy[asyncio]