fix: Update Qdrant support post-refactor (#1022)

# What does this PR do? I tried running the Qdrant provider and found some bugs. See #1021 for details. @terrytangyuan wrote there: > Please feel free to submit your changes in a PR. I fixed similar issues for pgvector provider. This might be an issue introduced from a refactoring. So I am submitting this PR. Closes #1021 ## Test Plan Here are the highlights for what I did to test this: References: - https://llama-stack.readthedocs.io/en/latest/getting_started/index.html - https://github.com/meta-llama/llama-stack-apps/blob/main/examples/agents/rag_with_vector_db.py - https://github.com/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/README.md#build-configure-and-run-llama-stack Install and run Qdrant server: ``` podman pull qdrant/qdrant mkdir qdrant-data podman run -p 6333:6333 -v $(pwd)/qdrant-data:/qdrant/storage qdrant/qdrant ``` Install and run Llama Stack from the venv-support PR (mainly because I didn't want to install conda): ``` brew install cmake # Should just need this once git clone https://github.com/meta-llama/llama-models.git gh repo clone cdoern/llama-stack cd llama-stack gh pr checkout 1018 # This is the checkout that introduces venv support for build/run. Otherwise you have to use conda. Eventually this wil be part of main, hopefully. uv sync --extra dev uv pip install -e . source .venv/bin/activate uv pip install qdrant_client LLAMA_STACK_DIR=$(pwd) LLAMA_MODELS_DIR=../llama-models llama stack build --template ollama --image-type venv ``` ``` edit llama_stack/templates/ollama/run.yaml ``` in that editor under: ``` vector_io: ``` add: ``` - provider_id: qdrant provider_type: remote::qdrant config: {} ``` see https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/remote/vector_io/qdrant/config.py#L14 for config options (but I didn't need any) ``` LLAMA_STACK_DIR=$(pwd) LLAMA_MODELS_DIR=../llama-models llama stack run ollama --image-type venv \ --port $LLAMA_STACK_PORT \ --env INFERENCE_MODEL=$INFERENCE_MODEL \ --env SAFETY_MODEL=$SAFETY_MODEL \ --env OLLAMA_URL=$OLLAMA_URL ``` Then I tested it out in a notebook. Key highlights included: ``` qdrant_provider = None for provider in client.providers.list(): if provider.api == "vector_io" and provider.provider_id == "qdrant": qdrant_provider = provider qdrant_provider assert qdrant_provider is not None, "QDrant is not a provider. You need to edit the run yaml file you use in your `llama stack run` call" vector_db_id = f"test-vector-db-{uuid.uuid4().hex}" client.vector_dbs.register( vector_db_id=vector_db_id, embedding_model="all-MiniLM-L6-v2", embedding_dimension=384, provider_id=qdrant_provider.provider_id, ) ``` Other than that, I just followed what was in https://llama-stack.readthedocs.io/en/latest/getting_started/index.html It would be good to have automated tests for this in the future, but that would be a big undertaking. Signed-off-by: Bill Murdock <bmurdock@redhat.com>
2025-02-10 18:08:33 -05:00 · 2025-02-10 18:08:33 -05:00 · 3856927ee8
commit 3856927ee8
parent 36d35406a7
2 changed files with 6 additions and 3 deletions
--- a/llama_stack/providers/remote/vector_io/qdrant/init.py
+++ b/llama_stack/providers/remote/vector_io/qdrant/init.py
@ -12,8 +12,8 @@ from .config import QdrantConfig


 async def get_adapter_impl(config: QdrantConfig, deps: Dict[Api, ProviderSpec]):
-    from .qdrant import QdrantVectorMemoryAdapter
+    from .qdrant import QdrantVectorDBAdapter

-    impl = QdrantVectorMemoryAdapter(config, deps[Api.inference])
+    impl = QdrantVectorDBAdapter(config, deps[Api.inference])
    await impl.initialize()
    return impl
--- a/llama_stack/providers/remote/vector_io/qdrant/qdrant.py
+++ b/llama_stack/providers/remote/vector_io/qdrant/qdrant.py
@ -55,7 +55,7 @@ class QdrantIndex(EmbeddingIndex):

        points = []
        for i, (chunk, embedding) in enumerate(zip(chunks, embeddings)):
-            chunk_id = f"{chunk.document_id}:chunk-{i}"
+            chunk_id = f"{chunk.metadata['document_id']}:chunk-{i}"
            points.append(
                PointStruct(
                    id=convert_id(chunk_id),
@ -93,6 +93,9 @@ class QdrantIndex(EmbeddingIndex):

        return QueryChunksResponse(chunks=chunks, scores=scores)

+    async def delete(self):
+        await self.client.delete_collection(collection_name=self.collection_name)
+

 class QdrantVectorDBAdapter(VectorIO, VectorDBsProtocolPrivate):
    def __init__(self, config: QdrantConfig, inference_api: Api.inference) -> None: