Anastas Stoyanovsky 2025-12-02 09:58:57 +01:00 committed by GitHub
commit 71324902fa


@@ -262,14 +262,6 @@ OpenAI provides a [prompt caching](https://platform.openai.com/docs/guides/promp
---
### Parallel Tool Calls
**Status:** Rumored Issue
There are reports that `parallel_tool_calls` may not work correctly. This needs verification and a ticket should be opened if confirmed.
---
## Resolved Issues
The following limitations have been addressed in recent releases:
@@ -297,3 +289,19 @@ The `require_approval` parameter for MCP tools in the Responses API now works co
**Fixed in:** [#3003](https://github.com/llamastack/llama-stack/pull/3003) (Agent API), [#3602](https://github.com/llamastack/llama-stack/pull/3602) (Responses API)
MCP tools now correctly handle array-type arguments in both the Agent API and Responses API.
---
### Parallel tool calls
**Status:** ✅ Resolved
The [`parallel_tool_calls` parameter](https://platform.openai.com/docs/api-reference/responses/create#responses_create-parallel_tool_calls) controls turn-based function calling workflows, _not_ parallelism or concurrency. See the [related function calling documentation](https://platform.openai.com/docs/guides/function-calling#parallel-function-calling).
If `parallel_tool_calls=false`, the intended behavior is that the model emits at most one function call per turn until all calls have been made; the client is responsible for executing each call and returning its result, in the expected format, before the conversation can proceed.
For example, given a tool generation request with a `get_weather` function definition, the input "What is the weather in Tokyo and New York?" will, by default, cause two function calls to be generated: one `get_weather` call for each of `Tokyo` and `New York`. With `parallel_tool_calls=false`, however, only one of these is generated initially; the client must execute that function call and append the result to the message history, after which the model generates the second function call on the next turn.
| parallel_tool_calls=true | parallel_tool_calls=false |
|------|-------|
| <img width="1134" height="1330" alt="Image" src="https://github.com/user-attachments/assets/68b5d6f0-0407-4926-9634-228512aa420d" /> | <img width="1236" height="1868" alt="Image" src="https://github.com/user-attachments/assets/42a1243c-4268-40d0-abcf-ad1bf9abc9c0" /> |
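The turn-based behavior above can be sketched with a small, self-contained simulation. Note that `fake_model`, `get_weather`, and the message shapes here are hypothetical stand-ins, not the real OpenAI or Llama Stack API; the sketch only illustrates how the client loop differs between the two settings.

```python
# Minimal sketch of the client loop for parallel_tool_calls, using a
# hypothetical stub in place of the model. With parallel_tool_calls=False the
# "model" emits one function call per turn; with True it emits all of them at
# once. Either way, the client executes the calls and appends results before
# the next turn.

def get_weather(city):
    # Hypothetical tool implementation; a real client would call a weather API.
    return f"{city}: 18C, clear"

def fake_model(messages, parallel_tool_calls):
    """Stub standing in for the model: emit get_weather calls for any city
    that does not yet have a tool result in the message history."""
    pending = [c for c in ("Tokyo", "New York")
               if not any(m["role"] == "tool" and m.get("city") == c
                          for m in messages)]
    if not pending:
        return [{"role": "assistant", "content": "done"}]
    calls = pending if parallel_tool_calls else pending[:1]
    return [{"role": "assistant",
             "tool_call": {"name": "get_weather", "city": c}}
            for c in calls]

def run(parallel_tool_calls):
    """Run the conversation to completion; return the number of model turns."""
    messages = [{"role": "user",
                 "content": "What is the weather in Tokyo and New York?"}]
    turns = 0
    while True:
        outputs = fake_model(messages, parallel_tool_calls)
        turns += 1
        tool_calls = [o for o in outputs if "tool_call" in o]
        if not tool_calls:
            return turns  # final assistant answer, no more calls to execute
        for call in tool_calls:
            city = call["tool_call"]["city"]
            messages.append({"role": "tool", "city": city,
                             "content": get_weather(city)})

print(run(parallel_tool_calls=True))   # both calls in one turn -> 2 turns
print(run(parallel_tool_calls=False))  # one call per turn -> 3 turns
```

The extra turn under `parallel_tool_calls=False` is exactly the behavior shown in the screenshots: the second function call definition only appears after the first call's result has been appended to the history.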