From 01836e79260ab898645303411cbc78b6003cb8ae Mon Sep 17 00:00:00 2001 From: Anastas Stoyanovsky Date: Fri, 21 Nov 2025 10:28:48 -0500 Subject: [PATCH 1/2] Delete parallel tool calls section from documentation because it has now been implemented Removed section on rumored issue with parallel tool calls. --- docs/docs/providers/openai_responses_limitations.mdx | 8 -------- 1 file changed, 8 deletions(-) diff --git a/docs/docs/providers/openai_responses_limitations.mdx b/docs/docs/providers/openai_responses_limitations.mdx index 19007438e..1b0349123 100644 --- a/docs/docs/providers/openai_responses_limitations.mdx +++ b/docs/docs/providers/openai_responses_limitations.mdx @@ -262,14 +262,6 @@ OpenAI provides a [prompt caching](https://platform.openai.com/docs/guides/promp --- -### Parallel Tool Calls - -**Status:** Rumored Issue - -There are reports that `parallel_tool_calls` may not work correctly. This needs verification and a ticket should be opened if confirmed. - ---- - ## Resolved Issues The following limitations have been addressed in recent releases: From b866ecddbf8b87d09c353ddbc57f0bc099eb250a Mon Sep 17 00:00:00 2001 From: Anastas Stoyanovsky Date: Fri, 21 Nov 2025 11:42:40 -0500 Subject: [PATCH 2/2] Document behavior of parallel_tool_calls parameter Added clarification on the behavior of the `parallel_tool_calls` parameter and its impact on function calling workflows. 
--- .../providers/openai_responses_limitations.mdx | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/docs/docs/providers/openai_responses_limitations.mdx b/docs/docs/providers/openai_responses_limitations.mdx index 1b0349123..aeb8dc1e5 100644 --- a/docs/docs/providers/openai_responses_limitations.mdx +++ b/docs/docs/providers/openai_responses_limitations.mdx @@ -289,3 +289,19 @@ The `require_approval` parameter for MCP tools in the Responses API now works co **Fixed in:** [#3003](https://github.com/llamastack/llama-stack/pull/3003) (Agent API), [#3602](https://github.com/llamastack/llama-stack/pull/3602) (Responses API) MCP tools now correctly handle array-type arguments in both the Agent API and Responses API. + +--- + +### Parallel Tool Calls + +**Status:** ✅ Resolved + +The [`parallel_tool_calls` parameter](https://platform.openai.com/docs/api-reference/responses/create#responses_create-parallel_tool_calls) controls turn-based function calling workflows, _not_ parallelism or concurrency. See the [related function calling documentation](https://platform.openai.com/docs/guides/function-calling#parallel-function-calling). + +If `parallel_tool_calls=false`, the intended behavior is that the model generates at most one function call per turn; the client is responsible for executing each call and returning its result, in the expected format, before the conversation can proceed. + +For example, given a tool generation request with a `get_weather` function definition, the input "What is the weather in Tokyo and New York?" will, by default, cause two function calls to be generated: a `get_weather` function call definition for each of `Tokyo` and `New York`. 
With `parallel_tool_calls=false`, however, only one of these will be generated initially; the client is then responsible for executing that function call and appending the result to the message history, after which the conversation will proceed with the second model-generated function call. + +| parallel_tool_calls=true | parallel_tool_calls=false | +|------|-------| +| Image | Image |
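The turn-based loop described above can be sketched as follows. This is an illustrative simulation, not a call to the real Responses API: `stub_model` and `get_weather` are hypothetical stand-ins showing how a client executes exactly one function call per turn, appends the result, and re-invokes the model until a final message is produced.

```python
# Sketch of the client-side loop when parallel_tool_calls=false.
# The "model" below is a stub that emits one get_weather call per turn
# until every requested city has a tool result; a real client would send
# the message history to the Responses API instead.

def get_weather(city):
    # Stand-in tool implementation; a real tool would query a weather service.
    return f"22C and sunny in {city}"

def stub_model(messages, pending=("Tokyo", "New York")):
    """Return the next needed tool call, or a final message when done."""
    answered = {m["city"] for m in messages if m.get("role") == "tool"}
    for city in pending:
        if city not in answered:
            return {"type": "function_call", "name": "get_weather", "city": city}
    summary = "; ".join(m["content"] for m in messages if m.get("role") == "tool")
    return {"type": "message", "content": summary}

def run_conversation():
    messages = [{"role": "user",
                 "content": "What is the weather in Tokyo and New York?"}]
    while True:
        out = stub_model(messages)
        if out["type"] == "message":
            return out["content"]
        # One call per turn: execute it and append the result before the
        # model is asked to continue.
        result = get_weather(out["city"])
        messages.append({"role": "tool", "city": out["city"], "content": result})

print(run_conversation())
```

With `parallel_tool_calls=true`, by contrast, both `get_weather` calls would arrive in a single turn and the client could execute them together before responding.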