llama-stack/docs/source/providers/vector_io/sqlite-vec.md
Varsha e92301f2d7
feat(sqlite-vec): enable keyword search for sqlite-vec (#1439)
Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
2025-05-21 15:24:24 -04:00

SQLite-Vec

SQLite-Vec is an inline vector database provider for Llama Stack. It allows you to store and query vectors directly within an SQLite database. That means you're not limited to storing vectors in memory or in a separate service.

Features

  • Lightweight and easy to use
  • Fully integrated with Llama Stack
  • Uses disk-based storage for persistence, allowing for larger vector storage
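To make the idea of disk-based persistence concrete, here is a minimal sketch that stores embeddings as BLOBs in an SQLite file and runs a brute-force nearest-neighbor search over them. It uses only Python's standard library and is not sqlite-vec's API or the provider's implementation; the table name chunks and the helpers pack_vector, store_vectors, and nearest are hypothetical names chosen for illustration.

```python
import math
import os
import sqlite3
import struct
import tempfile


def pack_vector(vec):
    """Serialize a list of floats as packed float32 bytes."""
    return struct.pack(f"{len(vec)}f", *vec)


def unpack_vector(blob):
    """Inverse of pack_vector: bytes back to a list of floats."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def store_vectors(db_path, rows):
    """Persist (id, vector) pairs in an SQLite file on disk."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits the transaction on success
        conn.execute(
            "CREATE TABLE IF NOT EXISTS chunks (id TEXT PRIMARY KEY, embedding BLOB)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO chunks VALUES (?, ?)",
            [(cid, pack_vector(vec)) for cid, vec in rows],
        )
    conn.close()


def nearest(db_path, query, k=1):
    """Reopen the file and return the ids of the k closest vectors (Euclidean)."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT id, embedding FROM chunks").fetchall()
    conn.close()
    scored = [(math.dist(query, unpack_vector(blob)), cid) for cid, blob in rows]
    return [cid for _, cid in sorted(scored)[:k]]


db_path = os.path.join(tempfile.mkdtemp(), "vectors.db")
store_vectors(db_path, [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])])
print(nearest(db_path, [1.0, 0.0], k=2))  # ['a', 'c']
```

Because the vectors live in a file, they survive closing and reopening the connection. A real deployment would rely on sqlite-vec's own indexing and distance functions rather than a Python-side scan.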

Comparison to Faiss

The choice between Faiss and sqlite-vec should be made based on the needs of your application, as they have different strengths.

Choosing the Right Provider

| Scenario | Recommended Tool | Reason |
|---|---|---|
| Online Analytical Processing (OLAP) | Faiss | Fast, in-memory searches |
| Online Transaction Processing (OLTP) | sqlite-vec | Frequent writes and reads |
| Frequent writes | sqlite-vec | Efficient disk-based storage and incremental indexing |
| Large datasets | sqlite-vec | Disk-based storage for larger vector storage |
| Datasets that can fit in memory, frequent reads | Faiss | Optimized for speed, indexing, and GPU acceleration |

Empirical Example

Consider the histogram below in which 10,000 randomly generated strings were inserted in batches of 100 into both Faiss and sqlite-vec using client.tool_runtime.rag_tool.insert().

[Figure: Comparison of SQLite-Vec and Faiss write times]

The average write time for sqlite-vec was 788 ms, compared to 47,640 ms for Faiss. While that gap is jarring, the distribution shows that the Faiss write times are spread fairly uniformly across the [1500, 100000] ms interval.
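A measurement of this kind can be sketched as a small timing harness. This is not the script used to produce the numbers above; the insert_batch callable is a hypothetical stand-in for a call such as client.tool_runtime.rag_tool.insert(), and here it simply appends to a Python list so the example runs on its own.

```python
import random
import statistics
import string
import time


def random_strings(n, length=32, seed=0):
    """Generate n random ASCII strings (seeded for repeatability)."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_letters, k=length)) for _ in range(n)]


def time_batched_writes(insert_batch, items, batch_size=100):
    """Call insert_batch on consecutive slices; return per-batch latency in ms."""
    latencies_ms = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        t0 = time.perf_counter()
        insert_batch(batch)
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    return latencies_ms


# Stand-in backend: a plain list. Swap in a function that writes to the
# vector store under test to compare providers.
store = []
latencies = time_batched_writes(store.extend, random_strings(10_000), batch_size=100)
print(f"batches: {len(latencies)}, mean write time: {statistics.mean(latencies):.3f} ms")
```

Keeping the per-batch latencies, rather than only the mean, makes it possible to plot both the distribution and the trend over insertion order.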

Looking at each individual write in the order in which the documents were inserted, you can see the write time increase as Faiss reindexes the vectors after each write.

[Figure: Comparison of SQLite-Vec and Faiss write times]

In comparison, read times for Faiss were on average 10% faster than for sqlite-vec. The modes of the two distributions make the difference clearer still, suggesting that Faiss will likely yield faster read performance.

[Figure: Comparison of SQLite-Vec and Faiss read times]

Usage

To use sqlite-vec in your Llama Stack project, follow these steps:

  1. Install the necessary dependencies.
  2. Configure your Llama Stack project to use SQLite-Vec.
  3. Start storing and querying vectors.

Supported Search Modes

The sqlite-vec provider supports both vector-based and keyword-based (full-text) search modes.

When using the RAGTool interface, you can specify the desired search behavior via the mode parameter in RAGQueryConfig. For example:

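from llama_stack.apis.tool_runtime.rag import RAGQueryConfig

# Assumes `client` is an already-connected Llama Stack client and
# `vector_db_id` is the ID of a registered sqlite-vec vector DB.
# `mode` selects the search behavior; "vector" requests embedding-based search.
query_config = RAGQueryConfig(max_chunks=6, mode="vector")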

results = client.tool_runtime.rag_tool.query(
    vector_db_ids=[vector_db_id],
    content="what is torchtune",
    query_config=query_config,
)
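To illustrate what keyword-based (full-text) search looks like at the SQLite level, the sketch below uses SQLite's FTS5 extension with its bm25() ranking function. It relies only on Python's standard sqlite3 module and assumes the local SQLite build includes FTS5. The table name fts_chunks and the helpers build_fts_index and keyword_search are hypothetical; this is not the provider's actual schema or code.

```python
import sqlite3


def build_fts_index(chunks):
    """Create an in-memory FTS5 table and load (chunk_id, content) pairs."""
    conn = sqlite3.connect(":memory:")
    # chunk_id is stored but not tokenized; only content is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE fts_chunks USING fts5(chunk_id UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO fts_chunks (chunk_id, content) VALUES (?, ?)", chunks
    )
    return conn


def keyword_search(conn, query, k=3):
    """Return up to k (chunk_id, content, score) rows, best match first.

    bm25() yields lower values for better matches, so ascending order
    puts the most relevant chunks first.
    """
    return conn.execute(
        "SELECT chunk_id, content, bm25(fts_chunks) AS score "
        "FROM fts_chunks WHERE fts_chunks MATCH ? "
        "ORDER BY score LIMIT ?",
        (query, k),
    ).fetchall()


chunks = [
    (f"doc{d}-s{s}", f"Sentence {s} from document {d}")
    for d in range(3)
    for s in range(10)
]
conn = build_fts_index(chunks)
results = keyword_search(conn, "Sentence 5", k=3)
for chunk_id, content, score in results:
    print(chunk_id, content, round(score, 3))
```

The query terms are combined with an implicit AND, so only chunks containing both "Sentence" and "5" are returned. Unlike vector search, no embedding of the query is needed.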

Installation

You can install SQLite-Vec using pip:

pip install sqlite-vec

Documentation

See sqlite-vec's GitHub repo for more details about sqlite-vec in general.