fix!: remove chunk_id property from Chunk class (#3954)

# What does this PR do?

The `chunk_id` property on the `Chunk` class executes actual logic to compute a chunk ID.
This sort of logic should not live in the API spec.

Instead, the providers should be in charge of calling `generate_chunk_id`
and passing the result to `Chunk`.

This removes the incorrect dependency between the provider implementation and the API implementation.
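
A minimal sketch of the intended split, for illustration only. The `Chunk` dataclass and the `generate_chunk_id` body below are simplified stand-ins, not the real llama_stack definitions: the hashing scheme is an assumption, and `make_chunks` is a hypothetical provider-side helper. The point is only that the data type holds an ID it is given, while the caller computes it.

```python
import hashlib
import uuid
from dataclasses import dataclass, field
from typing import Any


def generate_chunk_id(document_id: str, chunk_text: str) -> str:
    """Stand-in helper (assumed scheme): deterministic ID from document ID and text."""
    digest = hashlib.sha256(f"{document_id}:{chunk_text}".encode("utf-8")).hexdigest()
    # Format the first 128 bits of the digest as a UUID-style string.
    return str(uuid.UUID(digest[:32]))


@dataclass
class Chunk:
    """Simplified stand-in for the API type: a plain data holder with no ID logic."""

    content: str
    chunk_id: str
    metadata: dict[str, Any] = field(default_factory=dict)


def make_chunks(document_id: str, texts: list[str]) -> list[Chunk]:
    """Hypothetical provider-side helper: compute the ID, then pass it to Chunk."""
    return [
        Chunk(
            content=text,
            chunk_id=generate_chunk_id(document_id, text),
            metadata={"document_id": document_id},
        )
        for text in texts
    ]
```

With this shape, the same text under two different document IDs yields two different chunk IDs, and the API type never needs to import provider utilities.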

Signed-off-by: Charlie Doern <cdoern@redhat.com>
Charlie Doern 2025-10-29 21:59:59 -04:00 committed by GitHub
parent 0ef9166c7e
commit e8ecc99524
38 changed files with 40679 additions and 135 deletions


@@ -49,9 +49,21 @@ def vector_store_id():
 @pytest.fixture
 def sample_chunks():
+    from llama_stack.providers.utils.vector_io.vector_utils import generate_chunk_id
     return [
-        Chunk(content="MOCK text content 1", mime_type="text/plain", metadata={"document_id": "mock-doc-1"}),
-        Chunk(content="MOCK text content 1", mime_type="text/plain", metadata={"document_id": "mock-doc-2"}),
+        Chunk(
+            content="MOCK text content 1",
+            chunk_id=generate_chunk_id("mock-doc-1", "MOCK text content 1"),
+            mime_type="text/plain",
+            metadata={"document_id": "mock-doc-1"},
+        ),
+        Chunk(
+            content="MOCK text content 1",
+            chunk_id=generate_chunk_id("mock-doc-2", "MOCK text content 1"),
+            mime_type="text/plain",
+            metadata={"document_id": "mock-doc-2"},
+        ),
     ]