llama-stack-mirror/llama_stack/distribution
skamenan7 474b50b422 Add configurable embedding models for vector IO providers
This change lets users configure default embedding models at the provider level instead of always relying on system defaults. Each vector store provider can now specify an embedding_model and an optional embedding_dimension in its config.

Key features:
- Auto-dimension lookup for standard models from the registry
- Support for Matryoshka embeddings with custom dimensions
- Three-tier priority: explicit params > provider config > system fallback
- Full backward compatibility - existing setups work unchanged
- Comprehensive test coverage with 20 test cases

Updated all vector IO providers (FAISS, Chroma, Milvus, Qdrant, etc.) with the new config fields and added detailed documentation with examples.
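
The three-tier priority and the auto-dimension lookup described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `ProviderConfig`, `resolve_embedding`, `KNOWN_DIMENSIONS`, `SYSTEM_DEFAULT_MODEL`, and the model name `example-matryoshka-model` are all hypothetical names chosen for the example. Only the field names `embedding_model` and `embedding_dimension` come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for the model registry: model id -> default dimension.
KNOWN_DIMENSIONS = {
    "all-MiniLM-L6-v2": 384,
    "example-matryoshka-model": 768,  # made-up model, for illustration only
}

# Hypothetical system-wide fallback model.
SYSTEM_DEFAULT_MODEL = "all-MiniLM-L6-v2"


@dataclass
class ProviderConfig:
    """Assumed shape of the provider-level config fields named in the commit."""
    embedding_model: Optional[str] = None
    embedding_dimension: Optional[int] = None


def resolve_embedding(
    explicit_model: Optional[str] = None,
    explicit_dimension: Optional[int] = None,
    config: Optional[ProviderConfig] = None,
) -> Tuple[str, int]:
    """Pick a model and dimension: explicit params > provider config > system fallback."""
    config = config or ProviderConfig()

    if explicit_model is not None:
        # Tier 1: parameters passed explicitly by the caller.
        model, dimension = explicit_model, explicit_dimension
    elif config.embedding_model is not None:
        # Tier 2: defaults from the provider config.
        model, dimension = config.embedding_model, config.embedding_dimension
    else:
        # Tier 3: system fallback.
        model, dimension = SYSTEM_DEFAULT_MODEL, None

    if dimension is None:
        # Auto-dimension lookup for models known to the registry.
        if model not in KNOWN_DIMENSIONS:
            raise ValueError(f"No dimension given and model {model!r} is not in the registry")
        dimension = KNOWN_DIMENSIONS[model]

    return model, dimension
```

Under these assumptions, a provider configured with `ProviderConfig("example-matryoshka-model", 256)` resolves to the truncated Matryoshka dimension of 256, while a call that passes no parameters and has no provider config falls back to the system default and its registry dimension.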

Fixes #2729
2025-07-15 16:46:40 -04:00
access_control fix: auth sql store: user is owner policy (#2674) 2025-07-10 14:40:32 -07:00
routers Add configurable embedding models for vector IO providers 2025-07-15 16:46:40 -04:00
routing_tables fix: AccessDeniedError leads to HTTP 500 instead of error 403 (#2595) 2025-07-03 10:50:49 -07:00
server fix: properly represent paths in server logs (#2698) 2025-07-10 10:19:12 -04:00
store fix: store configs (#2593) 2025-07-03 10:07:23 -07:00
ui chore: remove nested imports (#2515) 2025-06-26 08:01:05 +05:30
utils chore: update pre-commit hook versions (#2708) 2025-07-10 16:47:59 +02:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
build.py chore: bump python supported version to 3.12 (#2475) 2025-06-24 09:22:04 +05:30
build_conda_env.sh chore: fix build script bug (#2507) 2025-06-24 12:05:22 -07:00
build_container.sh fix: container build on podman (#2723) 2025-07-11 16:25:33 +02:00
build_venv.sh chore: remove straggler references to llama-models (#1345) 2025-03-01 14:26:03 -08:00
client.py chore: make cprint write to stderr (#2250) 2025-05-24 23:39:57 -07:00
common.sh feat(pre-commit): enhance pre-commit hooks with additional checks (#2014) 2025-04-30 11:35:49 -07:00
configure.py fix: store configs (#2593) 2025-07-03 10:07:23 -07:00
datatypes.py feat(auth): support github tokens (#2509) 2025-07-08 11:02:36 -07:00
distribution.py ci: fix external provider test (#2438) 2025-06-12 16:14:32 +02:00
inspect.py chore: use starlette built-in Route class (#2267) 2025-05-28 09:53:33 -07:00
library_client.py refactor: unify stream and non-stream impls for responses (#2388) 2025-06-05 17:48:09 +02:00
providers.py feat: consolidate most distros into "starter" (#2516) 2025-07-04 15:58:03 +02:00
request_headers.py feat: fine grained access control policy (#2264) 2025-06-03 14:51:12 -07:00
resolver.py fix: Some missed env variable changes from PR 2490 (#2538) 2025-06-26 17:59:15 -07:00
stack.py ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
start_stack.sh refactor: remove container from list of run image types (#2178) 2025-06-02 09:57:55 +02:00