mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 12:07:34 +00:00
feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider (#3064)
# What does this PR do?

The purpose of this task is to implement `openai/v1/vector_stores/{vector_store_id}/search` for PGVector provider. It involves implementing vector similarity search, keyword search and hybrid search for `PGVectorIndex`.

Closes #3006

## Test Plan

Run unit tests:
`./scripts/unit-tests.sh`

Run integration tests for openai vector stores:

1. Export env vars:
```
export ENABLE_PGVECTOR=true
export PGVECTOR_HOST=localhost
export PGVECTOR_PORT=5432
export PGVECTOR_DB=llamastack
export PGVECTOR_USER=llamastack
export PGVECTOR_PASSWORD=llamastack
```

2. Create DB:
```
psql -h localhost -U postgres -c "CREATE ROLE llamastack LOGIN PASSWORD 'llamastack';"
psql -h localhost -U postgres -c "CREATE DATABASE llamastack OWNER llamastack;"
psql -h localhost -U llamastack -d llamastack -c "CREATE EXTENSION IF NOT EXISTS vector;"
```

3. Install sentence-transformers:
`uv pip install sentence-transformers`

4. Run:
```
uv run --group test pytest -s -v --stack-config="inference=inline::sentence-transformers,vector_io=remote::pgvector" --embedding-model sentence-transformers/all-MiniLM-L6-v2 tests/integration/vector_io/test_openai_vector_stores.py
```

Inspect PGVector vector stores (optional):
```
psql llamastack
psql (14.18 (Homebrew))
Type "help" for help.

llamastack=# \z
                                          Access privileges
 Schema |                         Name                         | Type  | Access privileges | Column privileges | Policies
--------+------------------------------------------------------+-------+-------------------+-------------------+----------
 public | llamastack_kvstore                                   | table |                   |                   |
 public | metadata_store                                       | table |                   |                   |
 public | vector_store_pgvector_main                           | table |                   |                   |
 public | vector_store_vs_1dfbc061_1f4d_4497_9165_ecba2622ba3a | table |                   |                   |
 public | vector_store_vs_2085a9fb_1822_4e42_a277_c6a685843fa7 | table |                   |                   |
 public | vector_store_vs_2b3dae46_38be_462a_afd6_37ee5fe661b1 | table |                   |                   |
 public | vector_store_vs_2f438de6_f606_4561_9d50_ef9160eb9060 | table |                   |                   |
 public | vector_store_vs_3eeca564_2580_4c68_bfea_83dc57e31214 | table |                   |                   |
 public | vector_store_vs_53942163_05f3_40e0_83c0_0997c64613da | table |                   |                   |
 public | vector_store_vs_545bac75_8950_4ff1_b084_e221192d4709 | table |                   |                   |
 public | vector_store_vs_688a37d8_35b2_4298_a035_bfedf5b21f86 | table |                   |                   |
 public | vector_store_vs_70624d9a_f6ac_4c42_b8ab_0649473c6600 | table |                   |                   |
 public | vector_store_vs_73fc1dd2_e942_4972_afb1_1e177b591ac2 | table |                   |                   |
 public | vector_store_vs_9d464949_d51f_49db_9f87_e033b8b84ac9 | table |                   |                   |
 public | vector_store_vs_a1e4d724_5162_4d6d_a6c0_bdafaf6b76ec | table |                   |                   |
 public | vector_store_vs_a328fb1b_1a21_480f_9624_ffaa60fb6672 | table |                   |                   |
 public | vector_store_vs_a8981bf0_2e66_4445_a267_a8fff442db53 | table |                   |                   |
 public | vector_store_vs_ccd4b6a4_1efd_4984_ad03_e7ff8eadb296 | table |                   |                   |
 public | vector_store_vs_cd6420a4_a1fc_4cec_948c_1413a26281c9 | table |                   |                   |
 public | vector_store_vs_cd709284_e5cf_4a88_aba5_dc76a35364bd | table |                   |                   |
 public | vector_store_vs_d7a4548e_fbc1_44d7_b2ec_b664417f2a46 | table |                   |                   |
 public | vector_store_vs_e7f73231_414c_4523_886c_d1174eee836e | table |                   |                   |
 public | vector_store_vs_ffd53588_819f_47e8_bb9d_954af6f7833d | table |                   |                   |
(23 rows)

llamastack=#
```

Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
parent e96e3c4da4
commit 3130ca0a78
11 changed files with 1014 additions and 29 deletions
@@ -12,6 +12,60 @@ That means you'll get fast and efficient vector retrieval.
- Easy to use
- Fully integrated with Llama Stack

There are three search implementations available for `PGVectorIndex`:

1. Vector Search
- How it works:
  - Uses PostgreSQL's vector extension (pgvector) to perform similarity search
  - Compares query embeddings against stored embeddings using cosine distance or other distance metrics
  - E.g. SQL query: `SELECT document, embedding <=> %s::vector AS distance FROM table ORDER BY distance`

- Characteristics:
  - Semantic understanding: finds documents similar in meaning even if they don't share keywords
  - Works with high-dimensional vector embeddings (typically 768, 1024, or higher dimensions)
  - Best for: Finding conceptually related content, handling synonyms, cross-language search
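
To make the ranking step of vector search concrete, here is a minimal pure-Python sketch of what ordering by cosine distance does. It does not talk to PostgreSQL; the helper names `cosine_distance` and `vector_search` and the toy two-dimensional embeddings are assumptions made for illustration only, not part of the PGVector provider.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the quantity pgvector's <=> operator computes."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def vector_search(query_embedding, rows, k=3):
    """Hypothetical helper: rank (document, embedding) pairs by distance, closest first."""
    scored = [(doc, cosine_distance(query_embedding, emb)) for doc, emb in rows]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy data standing in for the rows of a vector store table.
rows = [
    ("doc about cats", [1.0, 0.0]),
    ("doc about dogs", [0.8, 0.6]),
    ("doc about cars", [0.0, 1.0]),
]
results = vector_search([1.0, 0.0], rows, k=2)
```

A smaller distance means a closer match, which is why the example SQL query sorts by `distance` in ascending order.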

2. Keyword Search
- How it works:
  - Uses PostgreSQL's full-text search capabilities with tsvector and ts_rank
  - Converts text to searchable tokens using to_tsvector('english', text). Default language is English.
  - E.g. SQL query: `SELECT document, ts_rank(tokenized_content, plainto_tsquery('english', %s)) AS score`

- Characteristics:
  - Lexical matching: finds exact keyword matches and variations
  - Uses GIN (Generalized Inverted Index) for fast text search performance
  - Scoring: Uses PostgreSQL's ts_rank function for relevance scoring
  - Best for: Exact term matching, proper names, technical terms, Boolean-style queries
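
As a rough sketch of how the keyword query above might be assembled from application code, the function below builds a parameterized SQL string in the `%s` placeholder style used by psycopg2. The function name `build_keyword_query`, the `WHERE ... @@ ...` filter, the `ORDER BY`/`LIMIT` clauses and the identifier check are assumptions added for illustration; the actual provider code may differ.

```python
import re


def build_keyword_query(table_name, query_text, k=5, language="english"):
    """Hypothetical helper: return (sql, params) for a ts_rank-scored keyword search."""
    # Table names cannot be bound as query parameters, so only accept plain identifiers.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")
    sql = (
        "SELECT document, "
        "ts_rank(tokenized_content, plainto_tsquery(%s::regconfig, %s)) AS score "
        f"FROM {table_name} "
        "WHERE tokenized_content @@ plainto_tsquery(%s::regconfig, %s) "
        "ORDER BY score DESC LIMIT %s"
    )
    # The language and query text appear twice: once for scoring, once for filtering.
    params = (language, query_text, language, query_text, k)
    return sql, params


sql, params = build_keyword_query("vector_store_xxx", "hybrid search", k=3)
```

The pair can then be passed to a database cursor, for example `cursor.execute(sql, params)`, so that the query text is never interpolated into the SQL string by hand.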

3. Hybrid Search
- How it works:
  - Combines both vector and keyword search results
  - Runs both searches independently, then merges results using configurable reranking

- Two reranking strategies available:
  - Reciprocal Rank Fusion (RRF) (impact factor, default: 60.0)
  - Weighted Average (weight `alpha`, default: 0.5)

- Characteristics:
  - Best of both worlds: semantic understanding + exact matching
  - Documents appearing in both searches get boosted scores
  - Configurable balance between semantic and lexical matching
  - Best for: General-purpose search where you want both precision and recall
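
The two reranking strategies can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the provider's implementation: the function names are hypothetical, 60.0 is assumed to be the RRF constant added to each rank, 0.5 is assumed to be a weight applied to the vector side (with the remainder going to the keyword side), and min-max normalization before averaging is one common choice that is assumed here.

```python
def rrf_scores(vector_ranked, keyword_ranked, impact_factor=60.0):
    """Reciprocal Rank Fusion over two lists of document ids, each ordered best first."""
    scores = {}
    for ranked in (vector_ranked, keyword_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (impact_factor + rank) for the documents it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (impact_factor + rank)
    return scores


def _min_max(scores):
    """Rescale scores to [0, 1] so both searches are comparable (an assumed normalization)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_scores(vector_scores, keyword_scores, alpha=0.5):
    """Weighted average of normalized scores; higher input scores mean better matches."""
    v, k = _min_max(vector_scores), _min_max(keyword_scores)
    return {
        doc_id: alpha * v.get(doc_id, 0.0) + (1.0 - alpha) * k.get(doc_id, 0.0)
        for doc_id in set(v) | set(k)
    }


def top_k(scores, k):
    """Return the k best (doc_id, score) pairs, highest score first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


fused = rrf_scores(["a", "b", "c"], ["b", "d"])
blended = weighted_scores({"a": 0.9, "b": 0.8, "c": 0.1}, {"b": 0.5, "d": 0.2})
```

In both cases document `b`, the only one returned by both searches, ends up ranked first, which is the boosting behavior described in the characteristics above.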

4. Database Schema
The PGVector implementation stores data optimized for all three search types:
CREATE TABLE vector_store_xxx (
    id TEXT PRIMARY KEY,
    document JSONB,               -- Original document
    embedding vector(dimension),  -- For vector search
    content_text TEXT,            -- Raw text content
    tokenized_content TSVECTOR    -- For keyword search
);

-- Indexes for performance
CREATE INDEX content_gin_idx ON vector_store_xxx USING GIN(tokenized_content);  -- Keyword search
-- Vector index created automatically by pgvector

## Usage

To use PGVector in your Llama Stack project, follow these steps:

@@ -20,6 +74,25 @@ To use PGVector in your Llama Stack project, follow these steps:
2. Configure your Llama Stack project to use pgvector. (e.g. remote::pgvector).
3. Start storing and querying vectors.

## Example of how to set up your environment for using PGVector

1. Export env vars:
```bash
export ENABLE_PGVECTOR=true
export PGVECTOR_HOST=localhost
export PGVECTOR_PORT=5432
export PGVECTOR_DB=llamastack
export PGVECTOR_USER=llamastack
export PGVECTOR_PASSWORD=llamastack
```

2. Create DB:
```bash
psql -h localhost -U postgres -c "CREATE ROLE llamastack LOGIN PASSWORD 'llamastack';"
psql -h localhost -U postgres -c "CREATE DATABASE llamastack OWNER llamastack;"
psql -h localhost -U llamastack -d llamastack -c "CREATE EXTENSION IF NOT EXISTS vector;"
```
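
If you want to check these settings from Python before starting the stack, a small sketch like the following reads the same environment variables and assembles a PostgreSQL connection URI. The helper names `pgvector_settings` and `to_dsn` are hypothetical, and the fallback values simply mirror the example exports above; they are assumptions, not defaults guaranteed by Llama Stack.

```python
import os
from urllib.parse import quote


def pgvector_settings(env=None):
    """Hypothetical helper: collect PGVector connection settings from environment variables."""
    env = os.environ if env is None else env
    return {
        "host": env.get("PGVECTOR_HOST", "localhost"),
        "port": int(env.get("PGVECTOR_PORT", "5432")),
        "dbname": env.get("PGVECTOR_DB", "llamastack"),
        "user": env.get("PGVECTOR_USER", "llamastack"),
        "password": env.get("PGVECTOR_PASSWORD", "llamastack"),
    }


def to_dsn(settings):
    """Build a postgresql:// URI, percent-encoding the user and password."""
    return "postgresql://{user}:{password}@{host}:{port}/{dbname}".format(
        user=quote(settings["user"], safe=""),
        password=quote(settings["password"], safe=""),
        host=settings["host"],
        port=settings["port"],
        dbname=settings["dbname"],
    )


# Example with an explicit mapping instead of the real process environment.
example_env = {"PGVECTOR_HOST": "db.internal", "PGVECTOR_PASSWORD": "p@ss/word"}
dsn = to_dsn(pgvector_settings(example_env))
```

The resulting URI can be handed to any PostgreSQL client that accepts connection URIs, for instance `psql "$DSN"`.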

## Installation

You can install PGVector using docker:

@@ -17,6 +17,7 @@ Weaviate supports:
- Metadata filtering
- Multi-modal retrieval


## Usage

To use Weaviate in your Llama Stack project, follow these steps: