llama-stack-mirror/llama_stack
Cesare Pompeiano 1c23aeb937
feat: Add vector_db_id to chunk metadata (#3304)
# What does this PR do?

When running RAG in a multi vector DB setting, it can be difficult to
trace where retrieved chunks originate from. This PR adds the
`vector_db_id` into each chunk’s metadata, making it easier to
understand which database a given chunk came from. This is helpful for
debugging and for analyzing retrieval behavior of multiple DBs.

Relevant code:

```python
for vector_db_id, result in zip(vector_db_ids, results):
    for chunk, score in zip(result.chunks, result.scores):
        if not hasattr(chunk, "metadata") or chunk.metadata is None:
            chunk.metadata = {}
        chunk.metadata["vector_db_id"] = vector_db_id

        chunks.append(chunk)
        scores.append(score)
```

## Test Plan

* Ran Llama Stack in debug mode.
* Verified that `vector_db_id` was added to each chunk’s metadata.
* Confirmed that the metadata was printed in the console when using the
RAG tool.

---------

Co-authored-by: are-ces <cpompeia@redhat.com>
Co-authored-by: Francisco Arceo <arceofrancisco@gmail.com>
2025-09-10 11:19:21 +02:00
..
apis feat: Adding OpenAI Prompts API (#3319) 2025-09-08 11:05:13 -04:00
cli feat: include a default inference store during llama stack build (#3373) 2025-09-09 15:54:58 -07:00
core chore: remove unused variable (#3389) 2025-09-09 15:51:20 -07:00
distributions fix: Fix locations of distrubution runtime directories (#3336) 2025-09-05 14:09:36 +02:00
models refactor(logging): rename llama_stack logger categories (#3065) 2025-08-21 17:31:04 -07:00
providers feat: Add vector_db_id to chunk metadata (#3304) 2025-09-10 11:19:21 +02:00
strong_typing chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
testing fix: environment variable typo in inference recorder error message (#3374) 2025-09-08 17:51:38 +02:00
ui build: Bump version to 0.2.21 2025-09-08 22:30:03 +00:00
__init__.py chore(rename): move llama_stack.distribution to llama_stack.core (#2975) 2025-07-30 23:30:53 -07:00
env.py refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) 2025-03-04 14:53:47 -08:00
log.py chore(pre-commit): add pre-commit hook to enforce llama_stack logger usage (#3061) 2025-08-20 07:15:35 -04:00
schema_utils.py feat(auth): API access control (#2822) 2025-07-24 15:30:48 -07:00