Anastas Stoyanovsky 2025-12-02 09:58:57 +01:00 committed by GitHub
commit 71324902fa


@@ -262,14 +262,6 @@ OpenAI provides a [prompt caching](https://platform.openai.com/docs/guides/promp
---
### Parallel Tool Calls
**Status:** Rumored Issue
There are reports that `parallel_tool_calls` may not work correctly. This needs verification and a ticket should be opened if confirmed.
---
## Resolved Issues
The following limitations have been addressed in recent releases:
@@ -297,3 +289,19 @@ The `require_approval` parameter for MCP tools in the Responses API now works co
**Fixed in:** [#3003](https://github.com/llamastack/llama-stack/pull/3003) (Agent API), [#3602](https://github.com/llamastack/llama-stack/pull/3602) (Responses API)
MCP tools now correctly handle array-type arguments in both the Agent API and Responses API.
---
### Parallel tool calls
**Status:** ✅ Resolved
The [`parallel_tool_calls` parameter](https://platform.openai.com/docs/api-reference/responses/create#responses_create-parallel_tool_calls) controls turn-based function calling workflows, _not_ parallelism or concurrency. See the [related function calling documentation](https://platform.openai.com/docs/guides/function-calling#parallel-function-calling).
If `parallel_tool_calls=false`, the intended behavior is that the model emits at most one function call per turn until all calls have been made; the client is responsible for executing each call and returning its result, in the expected format, before the conversation can proceed.
For example, given a tool generation request with a `get_weather` function definition, the input "What is the weather in Tokyo and New York?" will, by default, cause two function calls to be generated: one `get_weather` call for each of `Tokyo` and `New York`. With `parallel_tool_calls=false`, however, only one of these is generated initially; the client must execute that function call and append the result to the message history, after which the model generates the second function call on the next turn.
| parallel_tool_calls=true | parallel_tool_calls=false |
|------|-------|
| <img width="1134" height="1330" alt="Image" src="https://github.com/user-attachments/assets/68b5d6f0-0407-4926-9634-228512aa420d" /> | <img width="1236" height="1868" alt="Image" src="https://github.com/user-attachments/assets/42a1243c-4268-40d0-abcf-ad1bf9abc9c0" /> |
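The turn-based behavior above can be sketched with a small, self-contained simulation. Note that `fake_model`, `get_weather`, and the message shapes here are hypothetical stand-ins, not the real OpenAI or Llama Stack API; the sketch only illustrates how the client loop differs between the two settings.

```python
# Minimal sketch of the client loop for parallel_tool_calls, using a
# hypothetical stub in place of the model. With parallel_tool_calls=False the
# "model" emits one function call per turn; with True it emits all of them at
# once. Either way, the client executes the calls and appends results before
# the next turn.

def get_weather(city):
    # Hypothetical tool implementation; a real client would call a weather API.
    return f"{city}: 18C, clear"

def fake_model(messages, parallel_tool_calls):
    """Stub standing in for the model: emit get_weather calls for any city
    that does not yet have a tool result in the message history."""
    pending = [c for c in ("Tokyo", "New York")
               if not any(m["role"] == "tool" and m.get("city") == c
                          for m in messages)]
    if not pending:
        return [{"role": "assistant", "content": "done"}]
    calls = pending if parallel_tool_calls else pending[:1]
    return [{"role": "assistant",
             "tool_call": {"name": "get_weather", "city": c}}
            for c in calls]

def run(parallel_tool_calls):
    """Run the conversation to completion; return the number of model turns."""
    messages = [{"role": "user",
                 "content": "What is the weather in Tokyo and New York?"}]
    turns = 0
    while True:
        outputs = fake_model(messages, parallel_tool_calls)
        turns += 1
        tool_calls = [o for o in outputs if "tool_call" in o]
        if not tool_calls:
            return turns  # final assistant answer, no more calls to execute
        for call in tool_calls:
            city = call["tool_call"]["city"]
            messages.append({"role": "tool", "city": city,
                             "content": get_weather(city)})

print(run(parallel_tool_calls=True))   # both calls in one turn -> 2 turns
print(run(parallel_tool_calls=False))  # one call per turn -> 3 turns
```

The extra turn under `parallel_tool_calls=False` is exactly the behavior shown in the screenshots: the second function call definition only appears after the first call's result has been appended to the history.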