Add pdf support to file_search for Responses API

This adds basic PDF support (using our existing `parse_pdf` function) to the file_search tool and corresponding Vector Files API. When a PDF file is uploaded and attached to a vector store, we parse the PDF and then chunk its content as normal.

This is not the best solution long-term, but it does match what we've been doing so far for PDF files in the memory tool.

Signed-off-by: Ben Browning <bbrownin@redhat.com>
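
The flow described in the commit message (parse a PDF into text, then chunk it like any other file) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `extract_text`, `chunk_text`, the `pdf_parser` callable, and the chunk sizes are all hypothetical names and values, and the real `parse_pdf` signature is not shown in this diff, so the parser is passed in as a parameter.

```python
from typing import Callable, List


def extract_text(data: bytes, mime_type: str, pdf_parser: Callable[[bytes], str]) -> str:
    """Return plain text for an uploaded file.

    `pdf_parser` is a stand-in for the project's existing PDF parsing
    function; its real signature is an assumption here.
    """
    if mime_type == "application/pdf":
        # PDFs are converted to text first, then handled like any other file.
        return pdf_parser(data)
    # Everything else is assumed to be UTF-8 text in this sketch.
    return data.decode("utf-8")


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> List[str]:
    """Split text into overlapping character windows (hypothetical sizes)."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            # The current window already reaches the end of the text.
            break
    return chunks
```

With a parser plugged in, `chunk_text(extract_text(data, mime_type, parser))` gives the chunks that would then be embedded and stored. Keeping the PDF branch as a thin pre-processing step is what lets the rest of the chunking path stay unchanged.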
2025-07-21 20:18:52 +00:00 · 2025-06-11 16:45:28 -04:00 · 2025-06-11 16:45:28 -04:00 · 055885bd5a
commit 055885bd5a
parent 57eccf023d
4 changed files with 41 additions and 33 deletions
--- a/tests/verifications/openai_api/fixtures/pdfs/llama_stack_and_models.pdf
+++ b/tests/verifications/openai_api/fixtures/pdfs/llama_stack_and_models.pdf
--- a/tests/verifications/openai_api/fixtures/test_cases/responses.yaml
+++ b/tests/verifications/openai_api/fixtures/test_cases/responses.yaml
@@ -39,7 +39,15 @@ test_response_file_search:
      input: "How many experts does the Llama 4 Maverick model have?"
      tools:
      - type: file_search
-        # vector_store_ids gets added by the test runner
+        # vector_store_ids param for file_search tool gets added by the test runner
+      file_content: "Llama 4 Maverick has 128 experts"
+      output: "128"
+    - case_id: "What is the "
+      input: "How many experts does the Llama 4 Maverick model have?"
+      tools:
+      - type: file_search
+        # vector_store_ids param for file_search tool gets added by the test runner
+      file_path: "pdfs/llama_stack_and_models.pdf"
      output: "128"

 test_response_mcp_tool: