Commit graph

9 commits

Author SHA1 Message Date
skamenan7
32930868de tightened vector store embedding model validation
includes:
- require models to exist in registry before use
- make default_embedding_dimension mandatory when setting default model
- use first available model fallback instead of hardcoded all-MiniLM-L6-v2
- add tests for error cases and update docs
2025-09-18 10:51:50 -04:00
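The three rules listed in commit 32930868de (models must be registered, the dimension is mandatory alongside a default model, and the first available model is the fallback) can be sketched roughly as follows. This is an illustration only, not the code from the commit: the function name, the `registry` mapping of model id to dimension, the order of the checks, and the use of `ValueError` are all assumptions.

```python
from typing import Dict, Optional, Tuple


def resolve_validated_model(
    explicit_model: Optional[str],
    default_model: Optional[str],
    default_dimension: Optional[int],
    registry: Dict[str, int],
) -> Tuple[str, int]:
    """Pick an embedding model and dimension, failing loudly on bad config.

    `registry` is a hypothetical mapping of registered model id -> dimension.
    """
    # Rule 1: an explicitly requested model must exist in the registry.
    if explicit_model is not None:
        if explicit_model not in registry:
            raise ValueError(
                f"Failed to resolve embedding model: '{explicit_model}' is not registered"
            )
        return explicit_model, registry[explicit_model]

    # Rule 2: a configured default model needs an explicit dimension
    # and must also be registered.
    if default_model is not None:
        if default_dimension is None:
            raise ValueError(
                "Failed to resolve embedding model: default_embedding_dimension "
                "is required when a default model is set"
            )
        if default_model not in registry:
            raise ValueError(
                f"Failed to resolve embedding model: '{default_model}' is not registered"
            )
        return default_model, default_dimension

    # Rule 3: otherwise fall back to the first available model
    # instead of a hardcoded name.
    if not registry:
        raise ValueError("Failed to resolve embedding model: no models are registered")
    first_model = next(iter(registry))
    return first_model, registry[first_model]
```

Because Python dictionaries preserve insertion order, "first available" in this sketch means the first model that was added to `registry`; how the real registry orders its models is not stated in the commit message.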
skamenan7
534c227058 docs: improve vector store config documentation and fix test isolation 2025-09-18 10:11:45 -04:00
skamenan7
ecb06a0384 Fix unit test to expect correct fallback model
The test was incorrectly expecting granite model as fallback.
Updated to expect all-MiniLM-L6-v2 which is the actual default.
2025-09-18 10:11:45 -04:00
skamenan7
e411099cbf Replace MissingEmbeddingModelError with IBM Granite default
- Replace error with ibm-granite/granite-embedding-125m-english default
- Based on issue #2418 for commercial compatibility and better UX
- Update tests to verify default fallback behavior
- Update documentation to reflect new precedence rules
- Remove unused MissingEmbeddingModelError class
- Update tip section to clarify fallback behavior

Resolves review comment to use default instead of error.
2025-09-18 10:11:44 -04:00
skamenan7
8e2675f50c Replace MissingEmbeddingModelError with IBM Granite default
- Replace error with ibm-granite/granite-embedding-125m-english default
- Based on issue #2418 for commercial compatibility and better UX
- Update tests to verify default fallback behavior
- Update documentation to reflect new precedence rules
- Remove unused MissingEmbeddingModelError class
- Update tip section to clarify fallback behavior

Resolves review comment to use default instead of error.
2025-09-18 10:11:44 -04:00
skamenan7
380bd1bb7a fix: update import path from distribution to core after upstream migration
Update test import path from llama_stack.distribution.routers.vector_io
to llama_stack.core.routers.vector_io to match upstream refactoring.
2025-09-18 10:11:44 -04:00
skamenan7
a368f4af40 Address review comments for global vector store configuration
- Remove incorrect 'Llama-Stack v2' version reference from documentation
- Move MissingEmbeddingModelError to llama_stack/apis/common/errors.py
- Update docstring references to point to correct exception location
- Clarify default_embedding_dimension behavior (defaults to 384)
- Update test imports and exception handling
2025-09-18 10:11:44 -04:00
skamenan7
600c3d5188 fix(tests): remove @pytest.mark.asyncio decorators from unit tests
Pre-commit hook forbids @pytest.mark.asyncio since pytest is configured
with async-mode=auto. Removed the decorators from embedding precedence tests.
2025-09-18 10:11:44 -04:00
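The "async-mode=auto" in commit 600c3d5188 presumably refers to pytest-asyncio's auto mode (the `asyncio_mode = auto` setting), under which coroutine test functions are collected as asyncio tests without any marker. A minimal sketch of what such tests look like after the decorators are removed; `fake_resolve` and both test names are hypothetical stand-ins, not the tests from the commit.

```python
import asyncio


async def fake_resolve(explicit_model, default_model):
    """Hypothetical async resolver: the explicit model wins over the default."""
    await asyncio.sleep(0)  # yield to the event loop once, like real async I/O would
    return explicit_model or default_model


# No @pytest.mark.asyncio decorator: in auto mode pytest-asyncio runs
# plain `async def` tests on an event loop by itself.
async def test_explicit_model_wins():
    assert await fake_resolve("explicit", "default") == "explicit"


async def test_default_used_when_no_explicit_model():
    assert await fake_resolve(None, "default") == "default"
```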
skamenan7
17fbd21c0d feat(vector-io): implement global default embedding model configuration (Issue #2729)
- Add VectorStoreConfig with global default_embedding_model and default_embedding_dimension
- Support environment variables LLAMA_STACK_DEFAULT_EMBEDDING_MODEL and LLAMA_STACK_DEFAULT_EMBEDDING_DIMENSION
- Implement precedence: explicit model > global default > clear error (no fallback)
- Update VectorIORouter with _resolve_embedding_model() precedence logic
- Remove non-deterministic 'first model in run.yaml' fallback behavior
- Add vector_store_config to StackRunConfig and all distribution templates
- Include comprehensive unit tests for config loading and router precedence
- Update documentation with configuration examples and usage patterns
- Fix error messages to include 'Failed to' prefix per coding standards

Resolves deterministic vector store creation by eliminating unpredictable fallbacks
and providing clear configuration options at the stack level.
2025-09-18 10:11:44 -04:00
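The precedence described in commit 17fbd21c0d (explicit model > global default > clear error) might look roughly like the sketch below. The class name, field names, environment variable names, and `MissingEmbeddingModelError` are taken from the commit messages; everything else, including the `from_env` helper, the function signature, and the error text, is an assumption and not the actual llama_stack implementation.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


class MissingEmbeddingModelError(RuntimeError):
    """Stand-in for the error class named in the commits, not the real one."""


@dataclass
class VectorStoreConfig:
    # Field names follow the commit message; types and defaults are assumptions.
    default_embedding_model: Optional[str] = None
    default_embedding_dimension: Optional[int] = None

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "VectorStoreConfig":
        """Hypothetical loader that reads the two environment variables."""
        env = os.environ if env is None else env
        model = env.get("LLAMA_STACK_DEFAULT_EMBEDDING_MODEL")
        dimension = env.get("LLAMA_STACK_DEFAULT_EMBEDDING_DIMENSION")
        return cls(model, int(dimension) if dimension else None)


def resolve_embedding_model(
    explicit_model: Optional[str],
    explicit_dimension: Optional[int],
    config: VectorStoreConfig,
) -> Tuple[str, Optional[int]]:
    # 1. A model passed explicitly by the caller always wins.
    if explicit_model is not None:
        return explicit_model, explicit_dimension
    # 2. Otherwise use the stack-level global default, if one is configured.
    if config.default_embedding_model is not None:
        return config.default_embedding_model, config.default_embedding_dimension
    # 3. No silent fallback: fail with a clear error.
    raise MissingEmbeddingModelError(
        "Failed to create vector store: no embedding model was provided "
        "and no global default is configured"
    )
```

Note that the later commits in this list change step 3: first to a fixed default model, then to the first available registered model.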