llama-stack-mirror/llama_stack/providers/inline
Ben Browning 8bf1d91d38 feat: Add synthetic-data-kit for file_search doc conversion
This adds a `builtin::document_conversion` tool, backed by
meta-llama/synthetic-data-kit, that converts documents for use with
file_search. I also have another local implementation that uses
Docling, but I need to debug some segfaults I'm hitting locally with
it, so I'm pushing this first as a simpler reference implementation.

Long-term I think we'll want a remote implementation here as well,
perhaps docling-serve or unstructured.io, but I need to look into
that more.

This passes the existing
`tests/verifications/openai_api/test_responses.py` but doesn't yet add
any new tests for file types besides text and pdf.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
2025-06-27 13:31:38 -04:00
agents chore: remove nested imports (#2515) 2025-06-26 08:01:05 +05:30
datasetio chore(refact): move paginate_records fn outside of datasetio (#2137) 2025-05-12 10:56:14 -07:00
eval chore: remove nested imports (#2515) 2025-06-26 08:01:05 +05:30
files/localfs refactor(env)!: enhanced environment variable substitution (#2490) 2025-06-26 08:20:08 +05:30
inference refactor(env)!: enhanced environment variable substitution (#2490) 2025-06-26 08:20:08 +05:30
ios/inference chore: removed executorch submodule (#1265) 2025-02-25 21:57:21 -08:00
post_training ci: add python package build test (#2457) 2025-06-19 18:57:32 +05:30
safety feat: add cpu/cuda config for prompt guard (#2194) 2025-05-28 12:23:15 -07:00
scoring refactor(env)!: enhanced environment variable substitution (#2490) 2025-06-26 08:20:08 +05:30
telemetry fix: fix test of root span to match what is being set (#2494) 2025-06-26 11:41:35 -04:00
tool_runtime feat: Add synthetic-data-kit for file_search doc conversion 2025-06-27 13:31:38 -04:00
vector_io feat: Add synthetic-data-kit for file_search doc conversion 2025-06-27 13:31:38 -04:00
__init__.py impls -> inline, adapters -> remote (#381) 2024-11-06 14:54:05 -08:00