feat: Implement the 'max_tool_calls' parameter for the Responses API (#4062)

# Problem
The Responses API uses the max_tool_calls parameter to limit the number
of tool calls that can be generated in a response. Currently, the LLS
implementation of the Responses API does not support this parameter.

# What does this PR do?
This pull request adds the max_tool_calls field to the response object
definition and updates the inline provider. It also ensures that:

- the total number of calls to built-in and MCP tools does not exceed
max_tool_calls
- an error is thrown if max_tool_calls < 1 (behavior seen with the
OpenAI Responses API, but we can change this if needed)
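
The two guarantees above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the inline provider's code: the names `validate_max_tool_calls` and `ToolCallBudget` are hypothetical, and the use of `ValueError` is an assumption about how the error might be raised.

```python
from dataclasses import dataclass
from typing import Optional


def validate_max_tool_calls(max_tool_calls: Optional[int]) -> None:
    """Reject values below 1; None means no limit was supplied (hypothetical helper)."""
    if max_tool_calls is not None and max_tool_calls < 1:
        raise ValueError(f"Invalid max_tool_calls={max_tool_calls}; must be >= 1")


@dataclass
class ToolCallBudget:
    """Counts built-in and MCP tool calls against an optional limit (hypothetical)."""

    max_tool_calls: Optional[int] = None
    used: int = 0

    def __post_init__(self) -> None:
        # Validate once, when the budget is created for a response.
        validate_max_tool_calls(self.max_tool_calls)

    def try_consume(self) -> bool:
        """Return True and count the call if the budget allows it, else False."""
        if self.max_tool_calls is not None and self.used >= self.max_tool_calls:
            return False
        self.used += 1
        return True
```

With `ToolCallBudget(max_tool_calls=2)`, the first two `try_consume()` calls succeed and the third is refused, while a budget created without a limit never refuses a call.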

Closes #[3563](https://github.com/llamastack/llama-stack/issues/3563)

## Test Plan
- Tested manually that the model response changes with the supplied
max_tool_calls value.
- Added integration tests to test invalid max_tool_calls parameter.
- Added integration tests to check max_tool_calls parameter with
built-in and function tools.
- Added integration tests to check max_tool_calls parameter in the
returned response object.
- Recorded OpenAI Responses API behavior using a sample script:
https://github.com/s-akhtar-baig/llama-stack-examples/blob/main/responses/src/max_tool_calls.py

Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
Shabana Baig 2025-11-10 16:21:27 -05:00 committed by GitHub
parent 209a78b618
commit 433438cfc0
9 changed files with 240 additions and 2 deletions


@ -5910,6 +5910,11 @@ components:
type: string
description: >-
(Optional) System message inserted into the model's context
max_tool_calls:
type: integer
description: >-
(Optional) Max number of total calls to built-in tools that can be processed
in a response
input:
type: array
items:
@ -6268,6 +6273,11 @@ components:
(Optional) Additional fields to include in the response.
max_infer_iters:
type: integer
max_tool_calls:
type: integer
description: >-
(Optional) Max number of total calls to built-in tools that can be processed
in a response.
additionalProperties: false
required:
- input
@ -6349,6 +6359,11 @@ components:
type: string
description: >-
(Optional) System message inserted into the model's context
max_tool_calls:
type: integer
description: >-
(Optional) Max number of total calls to built-in tools that can be processed
in a response
additionalProperties: false
required:
- created_at


@ -6626,6 +6626,11 @@ components:
type: string
description: >-
(Optional) System message inserted into the model's context
max_tool_calls:
type: integer
description: >-
(Optional) Max number of total calls to built-in tools that can be processed
in a response
input:
type: array
items:
@ -6984,6 +6989,11 @@ components:
(Optional) Additional fields to include in the response.
max_infer_iters:
type: integer
max_tool_calls:
type: integer
description: >-
(Optional) Max number of total calls to built-in tools that can be processed
in a response.
additionalProperties: false
required:
- input
@ -7065,6 +7075,11 @@ components:
type: string
description: >-
(Optional) System message inserted into the model's context
max_tool_calls:
type: integer
description: >-
(Optional) Max number of total calls to built-in tools that can be processed
in a response
additionalProperties: false
required:
- created_at