mirror of
				https://github.com/meta-llama/llama-stack.git
				synced 2025-10-25 01:01:13 +00:00 
			
		
		
		
	
	
		
			4 commits
		
	
	
	| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|  | 3130ca0a78 | feat: implement keyword, vector and hybrid search inside vector stores for PGVector provider (#3064) # What does this PR do?
<!-- Provide a short summary of what this PR does and why. Link to
relevant issues if applicable. -->
The purpose of this task is to implement
`openai/v1/vector_stores/{vector_store_id}/search` for PGVector
provider. It involves implementing vector similarity search, keyword
search and hybrid search for `PGVectorIndex`.
<!-- If resolving an issue, uncomment and update the line below -->
<!-- Closes #[issue-number] -->
Closes #3006 
## Test Plan
<!-- Describe the tests you ran to verify your changes with result
summaries. *Provide clear instructions so the plan can be easily
re-executed.* -->
Run unit tests:
` ./scripts/unit-tests.sh `
Run integration tests for openai vector stores:
1. Export env vars:
```
export ENABLE_PGVECTOR=true
export PGVECTOR_HOST=localhost
export PGVECTOR_PORT=5432
export PGVECTOR_DB=llamastack
export PGVECTOR_USER=llamastack
export PGVECTOR_PASSWORD=llamastack
```
2. Create DB:
```
psql -h localhost -U postgres -c "CREATE ROLE llamastack LOGIN PASSWORD 'llamastack';"
psql -h localhost -U postgres -c "CREATE DATABASE llamastack OWNER llamastack;"
psql -h localhost -U llamastack -d llamastack -c "CREATE EXTENSION IF NOT EXISTS vector;"
```
3. Install sentence-transformers:
` uv pip install sentence-transformers  `
4. Run:
```
uv run --group test pytest -s -v --stack-config="inference=inline::sentence-transformers,vector_io=remote::pgvector" --embedding-model sentence-transformers/all-MiniLM-L6-v2 tests/integration/vector_io/test_openai_vector_stores.py
```
Inspect PGVector vector stores (optional):
```
psql llamastack                                                                                                         
psql (14.18 (Homebrew))
Type "help" for help.
llamastack=# \z
                                                    Access privileges
 Schema |                         Name                         | Type  | Access privileges | Column privileges | Policies 
--------+------------------------------------------------------+-------+-------------------+-------------------+----------
 public | llamastack_kvstore                                   | table |                   |                   | 
 public | metadata_store                                       | table |                   |                   | 
 public | vector_store_pgvector_main                           | table |                   |                   | 
 public | vector_store_vs_1dfbc061_1f4d_4497_9165_ecba2622ba3a | table |                   |                   | 
 public | vector_store_vs_2085a9fb_1822_4e42_a277_c6a685843fa7 | table |                   |                   | 
 public | vector_store_vs_2b3dae46_38be_462a_afd6_37ee5fe661b1 | table |                   |                   | 
 public | vector_store_vs_2f438de6_f606_4561_9d50_ef9160eb9060 | table |                   |                   | 
 public | vector_store_vs_3eeca564_2580_4c68_bfea_83dc57e31214 | table |                   |                   | 
 public | vector_store_vs_53942163_05f3_40e0_83c0_0997c64613da | table |                   |                   | 
 public | vector_store_vs_545bac75_8950_4ff1_b084_e221192d4709 | table |                   |                   | 
 public | vector_store_vs_688a37d8_35b2_4298_a035_bfedf5b21f86 | table |                   |                   | 
 public | vector_store_vs_70624d9a_f6ac_4c42_b8ab_0649473c6600 | table |                   |                   | 
 public | vector_store_vs_73fc1dd2_e942_4972_afb1_1e177b591ac2 | table |                   |                   | 
 public | vector_store_vs_9d464949_d51f_49db_9f87_e033b8b84ac9 | table |                   |                   | 
 public | vector_store_vs_a1e4d724_5162_4d6d_a6c0_bdafaf6b76ec | table |                   |                   | 
 public | vector_store_vs_a328fb1b_1a21_480f_9624_ffaa60fb6672 | table |                   |                   | 
 public | vector_store_vs_a8981bf0_2e66_4445_a267_a8fff442db53 | table |                   |                   | 
 public | vector_store_vs_ccd4b6a4_1efd_4984_ad03_e7ff8eadb296 | table |                   |                   | 
 public | vector_store_vs_cd6420a4_a1fc_4cec_948c_1413a26281c9 | table |                   |                   | 
 public | vector_store_vs_cd709284_e5cf_4a88_aba5_dc76a35364bd | table |                   |                   | 
 public | vector_store_vs_d7a4548e_fbc1_44d7_b2ec_b664417f2a46 | table |                   |                   | 
 public | vector_store_vs_e7f73231_414c_4523_886c_d1174eee836e | table |                   |                   | 
 public | vector_store_vs_ffd53588_819f_47e8_bb9d_954af6f7833d | table |                   |                   | 
(23 rows)
llamastack=# 
```
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com> | ||
|  | e3928e6a29 | feat: Implement hybrid search in Milvus (#2644) 
		
			Some checks failed
		
		
	 Integration Tests (Replay) / discover-tests (push) Successful in 5s Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 6s Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 10s Test External Providers Installed via Module / test-external-providers-from-module (venv) (push) Has been skipped Python Package Build Test / build (3.13) (push) Failing after 6s Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 9s Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 10s SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 15s Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 16s Python Package Build Test / build (3.12) (push) Failing after 10s SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 21s Vector IO Integration Tests / test-matrix (3.13, remote::weaviate) (push) Failing after 7s Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 15s Unit Tests / unit-tests (3.13) (push) Failing after 10s Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 15s Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 12s Vector IO Integration Tests / test-matrix (3.13, remote::qdrant) (push) Failing after 12s Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 8s Integration Tests (Replay) / Integration Tests (, , , client=, vision=) (push) Failing after 8s Unit Tests / unit-tests (3.12) (push) Failing after 19s Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 12s Vector IO Integration Tests / test-matrix (3.12, remote::weaviate) (push) Failing after 11s Vector IO Integration Tests / test-matrix (3.12, remote::qdrant) (push) Failing after 11s Test External API and Providers / test-external (venv) (push) Failing after 21s Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 19s Pre-commit / pre-commit (push) Successful in 57s # What does this PR do?
This PR implements hybrid search for Milvus DB based on the inbuilt
milvus support.
   
    To test:
    ```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s
--tb=long --disable-warnings --asyncio-mode=auto
    ```
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> | ||
|  | d7cc38e934 | fix: remove async test markers (fix pre-commit) (#2808) # What does this PR do? some async test markers are in the codebase causing pre-commit to fail due to #2744 remove these pytest fixtures ## Test Plan pre-commit passes Signed-off-by: Charlie Doern <cdoern@redhat.com> | ||
|  | 4ae5656c2f | feat: Implement keyword search in milvus (#2231) 
		
			Some checks failed
		
		
	 SqlStore Integration Tests / test-postgres (3.13) (push) Failing after 7s Integration Tests / discover-tests (push) Successful in 8s Integration Auth Tests / test-matrix (oauth2_token) (push) Failing after 10s Test Llama Stack Build / build-custom-container-distribution (push) Failing after 6s Test Llama Stack Build / build-ubi9-container-distribution (push) Failing after 6s Vector IO Integration Tests / test-matrix (3.12, remote::chromadb) (push) Failing after 11s Vector IO Integration Tests / test-matrix (3.13, remote::chromadb) (push) Failing after 9s Vector IO Integration Tests / test-matrix (3.12, remote::pgvector) (push) Failing after 10s Vector IO Integration Tests / test-matrix (3.13, remote::pgvector) (push) Failing after 8s Test Llama Stack Build / generate-matrix (push) Successful in 8s Python Package Build Test / build (3.13) (push) Failing after 6s Unit Tests / unit-tests (3.12) (push) Failing after 6s Unit Tests / unit-tests (3.13) (push) Failing after 6s Vector IO Integration Tests / test-matrix (3.12, inline::milvus) (push) Failing after 13s Vector IO Integration Tests / test-matrix (3.13, inline::milvus) (push) Failing after 12s Vector IO Integration Tests / test-matrix (3.12, inline::sqlite-vec) (push) Failing after 15s Test External Providers / test-external-providers (venv) (push) Failing after 9s Test Llama Stack Build / build-single-provider (push) Failing after 11s Vector IO Integration Tests / test-matrix (3.13, inline::sqlite-vec) (push) Failing after 14s SqlStore Integration Tests / test-postgres (3.12) (push) Failing after 19s Integration Tests / test-matrix (push) Failing after 8s Test Llama Stack Build / build (push) Failing after 5s Python Package Build Test / build (3.12) (push) Failing after 51s Vector IO Integration Tests / test-matrix (3.13, inline::faiss) (push) Failing after 55s Vector IO Integration Tests / test-matrix (3.12, inline::faiss) (push) Failing after 57s Update ReadTheDocs / update-readthedocs (push) Failing after 50s Pre-commit / pre-commit (push) Successful in 2m9s # What does this PR do?
This PR adds the keyword search implementation for Milvus. Along with
the implementation for remote Milvus, the tests require us to start a
Milvus containers locally.
In order to verify the implementation, run:
```
pytest tests/unit/providers/vector_io/remote/test_milvus.py -v -s --tb=short --disable-warnings --asyncio-mode=auto
```
You can also test the changes using the below script:
```
#!/usr/bin/env python3
import asyncio
import os
import uuid
from typing import List
from llama_stack_client import (
    Agent, 
    AgentEventLogger, 
    LlamaStackClient, 
    RAGDocument
)
class MilvusRAGDemo:
    def __init__(self, base_url: str = "http://localhost:8321/"):
        self.client = LlamaStackClient(base_url=base_url)
        self.vector_db_id = f"milvus_rag_demo_{uuid.uuid4().hex[:8]}"
        self.model_id = None
        self.embedding_model_id = None
        self.embedding_dimension = None
        
    def setup_models(self):
        """Get available models and select appropriate ones for LLM and embeddings."""
        models = self.client.models.list()
    
        # Select embedding model
        embedding_models = [m for m in models if m.model_type == "embedding"]
        if not embedding_models:
            raise ValueError("No embedding models found")
        self.embedding_model_id = embedding_models[0].identifier
        self.embedding_dimension = embedding_models[0].metadata["embedding_dimension"]
        
    def register_vector_db(self):
        print(f"Registering Milvus vector database: {self.vector_db_id}")
        
        response = self.client.vector_dbs.register(
            vector_db_id=self.vector_db_id,
            embedding_model=self.embedding_model_id,
            embedding_dimension=self.embedding_dimension,
            provider_id="milvus-remote",  # Use remote Milvus
        )
        print(f"Vector database registered successfully")
        return response
        
    def insert_documents(self):
        """Insert sample documents into the vector database."""
        print("\nInserting sample documents...")
        
        # Sample documents about different topics
        documents = [
            RAGDocument(
                document_id="ai_ml_basics",
                content="""
                Artificial Intelligence (AI) and Machine Learning (ML) are transforming the world.
                AI refers to the simulation of human intelligence in machines, while ML is a subset
                of AI that enables computers to learn and improve from experience without being
                explicitly programmed. Deep learning, a subset of ML, uses neural networks with
                multiple layers to process complex patterns in data.
                
                Key concepts in AI/ML include:
                - Supervised Learning: Training with labeled data
                - Unsupervised Learning: Finding patterns in unlabeled data
                - Reinforcement Learning: Learning through trial and error
                - Neural Networks: Computing systems inspired by biological brains
                """,
                mime_type="text/plain",
                metadata={"topic": "technology", "category": "ai_ml"},
            ),
        ]
        
        # Insert documents with chunking
        self.client.tool_runtime.rag_tool.insert(
            documents=documents,
            vector_db_id=self.vector_db_id,
            chunk_size_in_tokens=200,  # Smaller chunks for better granularity
        )
        print(f"Inserted {len(documents)} documents with chunking")
                
    def test_keyword_search(self):
        """Test keyword-based search using BM25."""
        
        queries = [
            "neural networks",
            "Python frameworks",
            "data cleaning",
        ]
        
        for query in queries:
            response = self.client.vector_io.query(
                vector_db_id=self.vector_db_id,
                query=query,
                params={
                    "mode": "keyword",  # Keyword search
                    "max_chunks": 3,
                    "score_threshold": 0.0,
                }
            )
            
            for i, (chunk, score) in enumerate(zip(response.chunks, response.scores)):
                print(f"  {i+1}. Score: {score:.4f}")
                print(f"     Content: {chunk.content[:100]}...")
                print(f"     Metadata: {chunk.metadata}")    
                
    def run_demo(self):       
        try:
            self.setup_models()
            self.register_vector_db()
            self.insert_documents()
            self.test_keyword_search()
        except Exception as e:
            print(f"Error during demo: {e}")
            raise
def main():
    """Main function to run the demo."""
    # Check if Llama Stack server is running
    demo = MilvusRAGDemo()    
    try:
        demo.run_demo()
    except Exception as e:
        print(f"Demo failed: {e}")
if __name__ == "__main__":
    main()
```
[//]: # (## Documentation)
---------
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com> |