llama-stack-mirror/llama_stack/providers/registry
Ben Browning 8bf1d91d38 feat: Add synthetic-data-kit for file_search doc conversion
This adds a `builtin::document_conversion` tool, backed by
meta-llama/synthetic-data-kit, that converts documents for use with
file_search. I also have another local implementation that uses
Docling, but I need to debug some segfaults I'm hitting locally with
it, so I'm pushing this first as a simpler reference implementation.

Long-term I think we'll want a remote implementation here as well,
perhaps docling-serve or unstructured.io, but I need to look into
that more.

This passes the existing
`tests/verifications/openai_api/test_responses.py` but doesn't yet add
any new tests for file types other than text and PDF.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-27 13:31:38 -04:00
__init__.py API Updates (#73) 2024-09-17 19:51:35 -07:00
agents.py feat: add deps dynamically based on metastore config (#2405) 2025-06-05 14:07:25 -07:00
datasetio.py chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
eval.py chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
files.py feat: reference implementation for files API (#2330) 2025-06-02 21:54:24 -07:00
inference.py chore: isolate bare minimum project dependencies (#2282) 2025-06-26 10:14:27 +02:00
post_training.py feat: add huggingface post_training impl (#2132) 2025-05-16 14:41:28 -07:00
safety.py chore: isolate bare minimum project dependencies (#2282) 2025-06-26 10:14:27 +02:00
scoring.py chore: isolate bare minimum project dependencies (#2282) 2025-06-26 10:14:27 +02:00
telemetry.py chore: enable pyupgrade fixes (#1806) 2025-05-01 14:23:50 -07:00
tool_runtime.py feat: Add synthetic-data-kit for file_search doc conversion 2025-06-27 13:31:38 -04:00
vector_io.py feat: Add synthetic-data-kit for file_search doc conversion 2025-06-27 13:31:38 -04:00