llama-stack-mirror/tests/unit/rag
Cesare Pompeiano fe517f1ac7
feat: Add vector_db_id to chunk metadata (#3304)
# What does this PR do?

When running RAG in a multi vector DB setting, it can be difficult to
trace where retrieved chunks originate from. This PR adds the
`vector_db_id` into each chunk’s metadata, making it easier to
understand which database a given chunk came from. This is helpful for
debugging and for analyzing retrieval behavior of multiple DBs.

Relevant code:

```python
for vector_db_id, result in zip(vector_db_ids, results):
    for chunk, score in zip(result.chunks, result.scores):
        if not hasattr(chunk, "metadata") or chunk.metadata is None:
            chunk.metadata = {}
        chunk.metadata["vector_db_id"] = vector_db_id

        chunks.append(chunk)
        scores.append(score)
```

## Test Plan

* Ran Llama Stack in debug mode.
* Verified that `vector_db_id` was added to each chunk’s metadata.
* Confirmed that the metadata was printed in the console when using the
RAG tool.

---------

Co-authored-by: are-ces <cpompeia@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-09-10 13:40:27 +02:00
..
fixtures refactor: tests/unittests -> tests/unit; tests/api -> tests/integration 2025-03-04 09:57:00 -08:00
test_rag_query.py feat: Add vector_db_id to chunk metadata (#3304) 2025-09-10 13:40:27 +02:00
test_vector_store.py chore(migrate apis): move VectorDBWithIndex from embeddings to openai_embeddings (#3294) 2025-08-31 14:48:35 -07:00