mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-28 10:54:19 +00:00

Daniele Martinoli cca9bd6cc3

# What does this PR do?
Removed local execution option from the remote Qdrant provider and
introduced an explicit inline provider for the embedded execution.
Updated the ollama template to include this option: this part can be
reverted in case we don't want to have two default `vector_io`
providers.

(Closes #1082)

## Test Plan
Build and run an ollama distro:
```bash
llama stack build --template ollama --image-type conda
llama stack run --image-type conda ollama
```

Run one of the sample ingestionapplicatinos like
[rag_with_vector_db.py](https://github.com/meta-llama/llama-stack-apps/blob/main/examples/agents/rag_with_vector_db.py),
but replace this line:
```py
    selected_vector_provider = vector_providers[0]
```
with the following, to use the `qdrant` provider:
```py
    selected_vector_provider = vector_providers[1]
```

After running the test code, verify the timestamp of the Qdrant store:
```bash
% ls -ltr ~/.llama/distributions/ollama/qdrant.db/collection/test_vector_db_*
total 784
-rw-r--r--@ 1 dmartino  staff  401408 Feb 26 10:07 storage.sqlite
```

[//]: # (## Documentation)

---------

Signed-off-by: Daniele Martinoli <dmartino@redhat.com>
Co-authored-by: Francisco Arceo <farceo@redhat.com>

2025-03-18 14:04:21 -07:00

1.6 KiB

Raw Permalink Blame History

orphan
true

Qdrant

Qdrant is an inline and remote vector database provider for Llama Stack. It allows you to store and query vectors directly in memory. That means you'll get fast and efficient vector retrieval.

By default, Qdrant stores vectors in RAM, delivering incredibly fast access for datasets that fit comfortably in memory. But when your dataset exceeds RAM capacity, Qdrant offers Memmap as an alternative.
[An Introduction to Vector Databases](https://qdrant.tech/articles/what-is-a-vector-database/)

Features

Lightweight and easy to use
Fully integrated with Llama Stack
Apache 2.0 license terms
Store embeddings and their metadata
Supports search by Keyword and Hybrid search
Multilingual and Multimodal retrieval
Medatata filtering
GPU support

Usage

To use Qdrant in your Llama Stack project, follow these steps:

Install the necessary dependencies.
Configure your Llama Stack project to use Qdrant.
Start storing and querying vectors.

Installation

You can install Qdrant using docker:

docker pull qdrant/qdrant

Documentation

See the Qdrant documentation for more details about Qdrant in general.

1.6 KiB Raw Permalink Blame History