llama-stack-mirror/llama_stack/providers/utils/memory
Ben Browning 8bf1d91d38 feat: Add synthetic-data-kit for file_search doc conversion
This adds a `builtin::document_conversion` tool, backed by
meta-llama/synthetic-data-kit, for converting documents when they are
used with file_search. I also have another local implementation that
uses Docling, but I need to debug some segfault issues I'm hitting
locally with it, so I'm pushing this simpler reference implementation
first.
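
For illustration, a rough sketch of what a conversion helper along
these lines might look like, shelling out to the synthetic-data-kit
CLI. The `ingest` subcommand, the `--output-dir` flag, and the output
layout are assumptions for this sketch, not necessarily what the
actual tool in this commit does:

```python
# Hypothetical sketch only, not the actual implementation in this commit:
# convert an uploaded document to plain text by shelling out to the
# synthetic-data-kit CLI. The subcommand, flag, and output layout below
# are assumptions.
import subprocess
import tempfile
from pathlib import Path


def convert_document_to_text(input_path: str) -> str:
    """Convert a PDF/HTML/DOCX/etc. file to plain text for file_search chunking."""
    with tempfile.TemporaryDirectory() as output_dir:
        # Assume synthetic-data-kit writes parsed text into the output directory.
        subprocess.run(
            ["synthetic-data-kit", "ingest", input_path, "--output-dir", output_dir],
            check=True,
        )
        # Assume one .txt file is produced per input document.
        txt_files = list(Path(output_dir).glob("*.txt"))
        if not txt_files:
            raise RuntimeError(f"no text output produced for {input_path}")
        return txt_files[0].read_text()
```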

Long-term I think we'll want a remote implementation here as well -
perhaps docling-serve or unstructured.io - but that needs more
investigation.

This passes the existing
`tests/verifications/openai_api/test_responses.py` tests but doesn't
yet add new tests for file types beyond text and PDF.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-27 13:31:38 -04:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
file_utils.py Update the "InterleavedTextMedia" type (#635) 2024-12-17 11:18:31 -08:00
openai_vector_store_mixin.py feat: Add synthetic-data-kit for file_search doc conversion 2025-06-27 13:31:38 -04:00
vector_store.py feat: Add ChunkMetadata to Chunk (#2497) 2025-06-25 15:55:23 -04:00