updating based on feedback

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Francisco Javier Arceo 2025-03-28 09:19:14 -04:00
parent 1ac05d3a2a
commit 9a014b2822
2 changed files with 25 additions and 9 deletions

Binary file not shown.

Size after change: 33 KiB

@@ -15,16 +15,23 @@ That means you're not limited to storing vectors in memory or in a separate serv
 ### Comparison to Faiss
-SQLite-Vec is a lightweight alternative to Faiss, which is a popular vector database provider.
-While Faiss is a fast, lightweight and powerful inline provider, Faiss reindexes the
-entire database when a new vector is added. SQLite-Vec is a disk-based storage provider
-that allows for larger vector storage and handles incremental writes more efficiently.
-SQLite-vec is a great alternative to Faiss when you need to execute several writes to the
-database.
+The choice between Faiss and sqlite-vec should be made based on the needs of your application,
+as they have different strengths.
+#### Choosing the Right Provider
+Scenario | Recommended Tool | Reason
+-- |-----------------| --
+Online Analytical Processing (OLAP) | Faiss | Fast, in-memory searches
+Online Transaction Processing (OLTP) | sqlite-vec | Frequent writes and reads
+Frequent writes | sqlite-vec | Efficient disk-based storage and incremental indexing
+Large datasets | sqlite-vec | Disk-based storage for larger vector storage
+Datasets that can fit in memory, frequent reads | Faiss | Fast in-memory searches, optimized for speed, indexing, and GPU acceleration
+#### Empirical Example
 Consider the histogram below in which 10,000 randomly generated strings were inserted
-in batches of 100 into both Faiss and SQLite-vec using `client.tool_runtime.rag_tool.insert()`.
+in batches of 100 into both Faiss and sqlite-vec using `client.tool_runtime.rag_tool.insert()`.
 ```{image} ../../../../_static/providers/vector_io/write_time_comparison_sqlite-vec-faiss.png
 :alt: Comparison of SQLite-Vec and Faiss write times
@@ -35,12 +42,21 @@ You will notice that the average write time for `sqlite-vec` was 788ms, compared
 47,640ms for Faiss. While the number is jarring, if you look at the distribution, you can see that it is rather
 uniformly spread across the [1500, 100000] interval.
+Looking at each individual write in the order that the documents are inserted, you'll see the
+write time increase as Faiss reindexes the vectors after each write.
 ```{image} ../../../../_static/providers/vector_io/write_time_sequence_sqlite-vec-faiss.png
 :alt: Comparison of SQLite-Vec and Faiss write times
 :width: 400px
 ```
-For more information about this topic see [the GitHub Issue](https://github.com/meta-llama/llama-stack/issues/1165)
-where this was discussed.
+In comparison, the read times for Faiss were on average 10% faster than those for sqlite-vec.
+The modes of the two distributions highlight the difference further, suggesting that Faiss
+will likely yield faster read performance.
+```{image} ../../../../_static/providers/vector_io/read_time_comparison_sqlite-vec-faiss.png
+:alt: Comparison of SQLite-Vec and Faiss read times
+:width: 400px
+```
 ## Usage
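
Note on the Empirical Example in this change: the write-time comparison (10,000 random strings inserted in batches of 100) could be reproduced with a harness along the following lines. This is only a minimal sketch, not the script used to produce the figures. `random_strings`, `time_batched_inserts`, and `summarize` are hypothetical helper names, and `insert_batch` is an assumed stand-in for whatever wrapper you write around `client.tool_runtime.rag_tool.insert()`, whose arguments are not shown in this diff.

```python
import random
import statistics
import string
import time
from typing import Callable, Dict, List, Sequence


def random_strings(n: int, length: int = 32, seed: int = 0) -> List[str]:
    """Generate n pseudo-random alphanumeric strings (seeded for repeatability)."""
    rng = random.Random(seed)
    alphabet = string.ascii_letters + string.digits
    return ["".join(rng.choices(alphabet, k=length)) for _ in range(n)]


def time_batched_inserts(
    docs: Sequence[str],
    insert_batch: Callable[[Sequence[str]], object],
    batch_size: int = 100,
) -> List[float]:
    """Call insert_batch once per batch and return each call's wall time in ms."""
    timings_ms: List[float] = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start:start + batch_size]
        t0 = time.perf_counter()
        insert_batch(batch)  # assumption: wraps the provider-specific insert call
        timings_ms.append((time.perf_counter() - t0) * 1000.0)
    return timings_ms


def summarize(timings_ms: Sequence[float]) -> Dict[str, float]:
    """Report the mean alongside the median and max, since the mean alone can mislead."""
    return {
        "mean_ms": statistics.fmean(timings_ms),
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
    }


if __name__ == "__main__":
    # An in-memory list stands in for a real vector store here.
    store: List[str] = []
    timings = time_batched_inserts(random_strings(10_000), store.extend)
    print(summarize(timings))
```

Running the same harness once per provider and plotting the per-batch timings in insertion order would give a sequence view comparable to the second figure.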