fix: Setting default value for metadata_token_count in case the key is not found (#2199)

# What does this PR do?
If a user previously serialized chunks into their vector store without
the `metadata_token_count` key, the `query` method fails with a server
error. This fixes that edge case by defaulting the count to 0 when the
key is missing. The solution is suboptimal, since it understates the
token count for those chunks, but I think that is better than
recalculating the count and adding unnecessary complexity to the
retrieval code.
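
To illustrate the behavior, here is a minimal standalone sketch of the token accounting with the default in place. The function name `count_context_tokens`, its parameters, and its return value are hypothetical and not part of the actual `query` implementation; only the two `tokens +=` lines mirror the change.

```python
def count_context_tokens(chunk_metadatas, max_tokens_in_context):
    """Sum per-chunk token counts, tolerating a missing metadata_token_count.

    Hypothetical helper for illustration only. Returns (chunks_kept, tokens),
    where tokens includes the chunk that pushed the total over the limit.
    """
    tokens = 0
    kept = 0
    for metadata in chunk_metadatas:
        # token_count is still required; a missing key raises KeyError.
        tokens += metadata["token_count"]
        # Chunks stored before metadata_token_count existed contribute 0 here.
        tokens += metadata.get("metadata_token_count", 0)
        if tokens > max_tokens_in_context:
            break
        kept += 1
    return kept, tokens


# One legacy chunk without the key, one newer chunk with it.
chunks = [
    {"token_count": 10},
    {"token_count": 5, "metadata_token_count": 2},
]
print(count_context_tokens(chunks, max_tokens_in_context=100))
```

Because the legacy chunk contributes 0 metadata tokens, the total is a lower bound on the true context size, which is the trade-off described above.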




Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Francisco Arceo 2025-05-20 06:03:22 -06:00 committed by GitHub
parent 6d20b720b8
commit ed7b4731aa


@@ -146,7 +146,7 @@ class MemoryToolRuntimeImpl(ToolsProtocolPrivate, ToolRuntime, RAGToolRuntime):
         for i, chunk in enumerate(chunks):
             metadata = chunk.metadata
             tokens += metadata["token_count"]
-            tokens += metadata["metadata_token_count"]
+            tokens += metadata.get("metadata_token_count", 0)
             if tokens > query_config.max_tokens_in_context:
                 log.error(