llama-stack-mirror/llama_stack
Ben Browning e56690abef feat: Add synthetic-data-kit for file_search doc conversion
This adds a `builtin::document_conversion` tool, backed by
meta-llama/synthetic-data-kit, for converting documents when used
with file_search. I also have another local implementation that uses
Docling, but I need to debug some segfault issues I'm hitting locally
with it, so I'm pushing this first as a simpler reference
implementation.

Long-term, I think we'll want a remote implementation here as well,
perhaps docling-serve or unstructured.io, but I need to look into that
more.

This passes the existing
`tests/verifications/openai_api/test_responses.py` but doesn't yet add
any new tests for file types besides text and pdf.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-20 18:51:26 -04:00
..
apis feat: remove score_threshold constraint (#2479) 2025-06-20 09:17:42 +05:30
cli fix: stack build (#2485) 2025-06-20 15:15:43 -07:00
distribution feat: support auth attributes in inference/responses stores (#2389) 2025-06-20 10:24:45 -07:00
models ci: add python package build test (#2457) 2025-06-19 18:57:32 +05:30
providers feat: Add synthetic-data-kit for file_search doc conversion 2025-06-20 18:51:26 -04:00
strong_typing chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
templates feat: Add synthetic-data-kit for file_search doc conversion 2025-06-20 18:51:26 -04:00
ui build: Bump version to 0.2.12 2025-06-20 21:06:17 +00:00
__init__.py export LibraryClient 2024-12-13 12:08:00 -08:00
env.py refactor(test): move tools, evals, datasetio, scoring and post training tests (#1401) 2025-03-04 14:53:47 -08:00
log.py ci: fix external provider test (#2438) 2025-06-12 16:14:32 +02:00
schema_utils.py chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00