llama-stack-mirror/llama_stack/providers
Cesare Pompeiano 1c23aeb937
feat: Add vector_db_id to chunk metadata (#3304)
# What does this PR do?

When running RAG against multiple vector DBs, it can be difficult to
trace where retrieved chunks come from. This PR adds the
`vector_db_id` to each chunk’s metadata, making it clear which
database a given chunk came from. This helps with debugging and with
analyzing retrieval behavior across multiple DBs.

Relevant code:

```python
for vector_db_id, result in zip(vector_db_ids, results):
    for chunk, score in zip(result.chunks, result.scores):
        if not hasattr(chunk, "metadata") or chunk.metadata is None:
            chunk.metadata = {}
        chunk.metadata["vector_db_id"] = vector_db_id

        chunks.append(chunk)
        scores.append(score)
```
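
The excerpt above relies on surrounding state (`chunks`, `scores`, `vector_db_ids`, `results`). A minimal self-contained sketch of the same tagging logic follows. `Chunk`, `QueryResult`, and `merge_results` are hypothetical stand-ins introduced for illustration, not the actual Llama Stack types or function names.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for a retrieved chunk."""
    content: str
    metadata: Optional[dict[str, Any]] = None


@dataclass
class QueryResult:
    """Hypothetical stand-in for the response of one vector DB query."""
    chunks: list[Chunk] = field(default_factory=list)
    scores: list[float] = field(default_factory=list)


def merge_results(vector_db_ids, results):
    """Flatten per-DB results, tagging each chunk with the DB it came from."""
    chunks, scores = [], []
    for vector_db_id, result in zip(vector_db_ids, results):
        for chunk, score in zip(result.chunks, result.scores):
            # Create the metadata dict if it is missing or None.
            if getattr(chunk, "metadata", None) is None:
                chunk.metadata = {}
            chunk.metadata["vector_db_id"] = vector_db_id
            chunks.append(chunk)
            scores.append(score)
    return chunks, scores
```

Existing metadata keys are preserved; only the `vector_db_id` key is added or overwritten.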

## Test Plan

* Ran Llama Stack in debug mode.
* Verified that `vector_db_id` was added to each chunk’s metadata.
* Confirmed that the metadata was printed in the console when using the
RAG tool.
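
The manual checks above could also be approximated in code. The sketch below assumes only that each chunk exposes a `metadata` dict. `count_chunks_per_db` is a hypothetical helper, not part of Llama Stack, and the chunk objects are simple placeholders.

```python
from collections import Counter
from types import SimpleNamespace


def count_chunks_per_db(chunks):
    """Count chunks per vector DB; raise if any chunk is untagged."""
    counts = Counter()
    for chunk in chunks:
        metadata = getattr(chunk, "metadata", None) or {}
        if "vector_db_id" not in metadata:
            raise ValueError("chunk is missing vector_db_id in its metadata")
        counts[metadata["vector_db_id"]] += 1
    return dict(counts)


# Placeholder chunks standing in for what the RAG tool would return.
retrieved = [
    SimpleNamespace(metadata={"vector_db_id": "db-a"}),
    SimpleNamespace(metadata={"vector_db_id": "db-b"}),
    SimpleNamespace(metadata={"vector_db_id": "db-a"}),
]
print(count_chunks_per_db(retrieved))  # {'db-a': 2, 'db-b': 1}
```

The per-DB counts give a quick view of how retrieval is distributed across databases, which is the kind of analysis the added metadata is meant to support.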

---------

Co-authored-by: are-ces <cpompeia@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-09-10 11:19:21 +02:00
inline feat: Add vector_db_id to chunk metadata (#3304) 2025-09-10 11:19:21 +02:00
registry fix(deps): bump datasets versions for all providers (#3382) 2025-09-08 15:13:42 -07:00
remote chore: update the anthropic inference impl to use openai-python for openai-compat functions (#3366) 2025-09-07 14:00:42 -07:00
utils fix: use lambda pattern for bedrock config env vars (#3307) 2025-09-05 10:45:11 +02:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
datatypes.py feat: create unregister shield API endpoint in Llama Stack (#2853) 2025-08-05 07:33:46 -07:00