fix: Updating ToolCall.arguments to allow for json strings that can be decoded on client side (#1685)

### What does this PR do? Currently, `ToolCall.arguments` is a `Dict[str, RecursiveType]`. However, on the client SDK side -- the `RecursiveType` gets deserialized into a number ( both int and float get collapsed ) and hence when params are `int` they get converted to float which might break client side tools that might be doing type checking. Closes: https://github.com/meta-llama/llama-stack/issues/1683 ### Test Plan Stainless changes -- https://github.com/meta-llama/llama-stack-client-python/pull/204 ``` pytest -s -v --stack-config=fireworks tests/integration/agents/test_agents.py --text-model meta-llama/Llama-3.1-8B-Instruct ```
2025-03-19 10:36:19 -07:00 · 2025-03-19 10:36:19 -07:00 · 65ca85ba6b
commit 65ca85ba6b
parent 113f3a259c
10 changed files with 137 additions and 110 deletions
--- a/llama_stack/providers/remote/inference/vllm/vllm.py
+++ b/llama_stack/providers/remote/inference/vllm/vllm.py
@ -90,15 +90,12 @@ def _convert_to_vllm_tool_calls_in_response(
    if not tool_calls:
        return []

-    call_function_arguments = None
-    for call in tool_calls:
-        call_function_arguments = json.loads(call.function.arguments)
-
    return [
        ToolCall(
            call_id=call.id,
            tool_name=call.function.name,
-            arguments=call_function_arguments,
+            arguments=json.loads(call.function.arguments),
+            arguments_json=call.function.arguments,
        )
        for call in tool_calls
    ]
@ -183,6 +180,7 @@ async def _process_vllm_chat_completion_stream_response(
                                call_id=tool_call_buf.call_id,
                                tool_name=tool_call_buf.tool_name,
                                arguments=args,
+                                arguments_json=args_str,
                            ),
                            parse_status=ToolCallParseStatus.succeeded,
                        ),