mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-23 06:09:40 +00:00

skamenan7 474b50b422 Add configurable embedding models for vector IO providers

This change lets users configure default embedding models at the provider level instead of always relying on system defaults. Each vector store provider can now specify an embedding_model and optional embedding_dimension in their config.

Key features:
- Auto-dimension lookup for standard models from the registry
- Support for Matryoshka embeddings with custom dimensions
- Three-tier priority: explicit params > provider config > system fallback
- Full backward compatibility - existing setups work unchanged
- Comprehensive test coverage with 20 test cases

Updated all vector IO providers (FAISS, Chroma, Milvus, Qdrant, etc.) with the new config fields and added detailed documentation with examples.

Fixes #2729

2025-07-15 16:46:40 -04:00


remote::weaviate

Description

Weaviate is a vector database provider for Llama Stack. It allows you to store and query vectors directly within a Weaviate database. That means you're not limited to storing vectors in memory or in a separate service.

Features

Weaviate supports:

Storing embeddings and their metadata
Vector search
Full-text search
Hybrid search
Document storage
Metadata filtering
Multi-modal retrieval

Usage

To use Weaviate in your Llama Stack project, follow these steps:

Install the necessary dependencies.
Configure your Llama Stack project to use Weaviate.
Start storing and querying vectors.

Installation

To install Weaviate, see the Weaviate quickstart documentation.

Documentation

See Weaviate's documentation for more details about Weaviate in general.

Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `embedding_model` | `str \| None` | No | | Optional default embedding model for this provider. If not specified, the system default is used. |
| `embedding_dimension` | `int \| None` | No | | Optional embedding dimension override. Only needed for models with variable dimensions (e.g., Matryoshka embeddings). If not specified, the dimension is looked up automatically from the model registry. |
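
The interaction between these two fields and the priority order (explicit params > provider config > system fallback) can be illustrated with a minimal sketch. This is not the Llama Stack implementation: `ProviderConfig`, `MODEL_REGISTRY`, `SYSTEM_DEFAULT_MODEL`, and `resolve_embedding` are hypothetical names, the registry contents are made up for illustration, and the rule that a provider-level dimension override applies only to the provider's own model is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-ins for the real model registry and system default.
MODEL_REGISTRY = {"all-MiniLM-L6-v2": 384, "example-matryoshka-model": 768}
SYSTEM_DEFAULT_MODEL = "all-MiniLM-L6-v2"


@dataclass
class ProviderConfig:
    """Holds the two optional provider-level fields described in this section."""
    embedding_model: Optional[str] = None
    embedding_dimension: Optional[int] = None


def resolve_embedding(
    config: ProviderConfig,
    explicit_model: Optional[str] = None,
    explicit_dimension: Optional[int] = None,
) -> Tuple[str, int]:
    """Return (model, dimension) using the three-tier priority order."""
    # Model: explicit parameter, then provider config, then system fallback.
    model = explicit_model or config.embedding_model or SYSTEM_DEFAULT_MODEL

    # Dimension: an explicit parameter always wins.
    if explicit_dimension is not None:
        return model, explicit_dimension

    # Assumption: the provider-level override only applies when the resolved
    # model is the one named in the provider config.
    if config.embedding_dimension is not None and model == config.embedding_model:
        return model, config.embedding_dimension

    # Otherwise fall back to an automatic lookup in the registry.
    if model not in MODEL_REGISTRY:
        raise ValueError(f"Unknown model {model!r}: specify embedding_dimension")
    return model, MODEL_REGISTRY[model]


# An empty config falls through to the system default and a registry lookup.
default_choice = resolve_embedding(ProviderConfig())

# A Matryoshka-style model truncated to 256 dimensions at the provider level.
truncated_choice = resolve_embedding(
    ProviderConfig(embedding_model="example-matryoshka-model", embedding_dimension=256)
)
```

With an empty configuration, existing setups resolve exactly as before, which is how the sketch models the backward compatibility described in the change notes.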

Sample Configuration

{}
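
The empty sample relies on the system defaults. As a rough illustration of a populated configuration, the snippet below builds one with both optional fields set and renders it as JSON. The field names come from the configuration table; the model name and dimension are placeholder values, not recommendations.

```python
import json

# Placeholder values: substitute a model that is registered in your own stack.
weaviate_config = {
    "embedding_model": "example-matryoshka-model",
    "embedding_dimension": 256,
}

# Render the mapping in the same JSON style as the empty sample.
rendered = json.dumps(weaviate_config, indent=2)
print(rendered)
```

Omitting `embedding_dimension` leaves the dimension to the automatic registry lookup, and omitting both fields is equivalent to the empty sample.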
