llama-stack-mirror/llama_stack/providers/inline
skamenan7 474b50b422 Add configurable embedding models for vector IO providers
This change lets users configure default embedding models at the provider level instead of always relying on system defaults. Each vector store provider can now specify an embedding_model and optional embedding_dimension in their config.

Key features:
- Auto-dimension lookup for standard models from the registry
- Support for Matryoshka embeddings with custom dimensions
- Three-tier priority: explicit params > provider config > system fallback
- Full backward compatibility - existing setups work unchanged
- Comprehensive test coverage with 20 test cases

Updated all vector IO providers (FAISS, Chroma, Milvus, Qdrant, etc.) with the new config fields and added detailed documentation with examples.

Fixes #2729
2025-07-15 16:46:40 -04:00
..
agents chore(api): add mypy coverage to meta_reference_safety (#2661) 2025-07-09 10:22:34 +02:00
datasetio chore(refact): move paginate_records fn outside of datasetio (#2137) 2025-05-12 10:56:14 -07:00
eval chore: remove nested imports (#2515) 2025-06-26 08:01:05 +05:30
files/localfs refactor(env)!: enhanced environment variable substitution (#2490) 2025-06-26 08:20:08 +05:30
inference refactor(llama4): remove duplicate implementation, update imports to llama-models, add comprehensive test for tool calling fix (issue #2584)\n\n- Removes all old llama4 code from llama-stack\n- Updates all relevant imports to use llama-models\n- Adds robust pytest to demonstrate arguments_json fix\n- Updates config/scripts as needed for new structure\n- Resolves merge conflicts with updated main branch\n- Fixes mypy and ruff issues 2025-07-15 13:21:33 -04:00
ios/inference chore: removed executorch submodule (#1265) 2025-02-25 21:57:21 -08:00
post_training chore: add mypy post training (#2675) 2025-07-09 15:44:39 +02:00
safety ci: test safety with starter (#2628) 2025-07-09 16:53:50 +02:00
scoring fix: allow default empty vars for conditionals (#2570) 2025-07-01 14:42:05 +02:00
telemetry feat: improve telemetry (#2590) 2025-07-04 17:29:09 +02:00
tool_runtime feat: Add ChunkMetadata to Chunk (#2497) 2025-06-25 15:55:23 -04:00
vector_io Add configurable embedding models for vector IO providers 2025-07-15 16:46:40 -04:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00