## Tool Calling


## Creating a Custom Tool and Agent Tool Calling


## Step 1: Import Necessary Packages and API Keys

In [2]:
import asyncio
import json
import os
from typing import Dict, List

import nest_asyncio
import requests
from dotenv import load_dotenv
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent
from llama_stack_client.lib.agents.client_tool import ClientTool
from llama_stack_client.lib.agents.event_logger import EventLogger
from llama_stack_client.types import CompletionMessage
from llama_stack_client.types.agent_create_params import AgentConfig
from llama_stack_client.types.shared.tool_response_message import ToolResponseMessage

# Allow asyncio to run in Jupyter Notebook
nest_asyncio.apply()

HOST = "localhost"
PORT = 8321
MODEL_NAME = "meta-llama/Llama-3.2-3B-Instruct"


Create a `.env` file and add your Brave API key:

`BRAVE_SEARCH_API_KEY = "YOUR_BRAVE_API_KEY_HERE"`

Now load the `.env` file into your Jupyter notebook.

In [3]:
load_dotenv()
BRAVE_SEARCH_API_KEY = os.environ["BRAVE_SEARCH_API_KEY"]
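
If the key is missing, `os.environ["BRAVE_SEARCH_API_KEY"]` raises a bare `KeyError`. A minimal sketch of a friendlier check follows. The helper name `require_env` is hypothetical and is not part of `python-dotenv` or Llama Stack.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a readable error."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Hypothetical error message; adjust the wording to your setup.
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file and call load_dotenv() again."
        )
    return value
```

With this helper, the assignment above could be written as `BRAVE_SEARCH_API_KEY = require_env("BRAVE_SEARCH_API_KEY")`.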


## Step 2: Create a class for the Brave Search API integration

Let's create the `BraveSearch` class, which wraps the Brave Search API. Its `search` method sends the query and returns the cleaned results as a JSON string. The `_clean_brave_response` helper keeps only the title, URL, and description of the top-ranked web results, so the output is compact enough to hand to a tool-calling agent.

In [4]:
class BraveSearch:
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key

    async def search(self, query: str) -> str:
        url = "https://api.search.brave.com/res/v1/web/search"
        headers = {
            "X-Subscription-Token": self.api_key,
            "Accept-Encoding": "gzip",
            "Accept": "application/json",
        }
        payload = {"q": query}
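        # requests.get is a blocking call; the timeout keeps a stalled request from hanging the cell.
        response = requests.get(url=url, params=payload, headers=headers, timeout=10)
        # Fail early on HTTP errors (for example, an invalid API key) instead of parsing an error body.
        response.raise_for_status()
        return json.dumps(self._clean_brave_response(response.json()))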

    def _clean_brave_response(self, search_response, top_k=3):
        query = search_response.get("query", {}).get("original", None)
        clean_response = []
        mixed_results = search_response.get("mixed", {}).get("main", [])[:top_k]

        for m in mixed_results:
            r_type = m["type"]
            results = search_response.get(r_type, {}).get("results", [])
            if r_type == "web" and results:
                idx = m["index"]
                selected_keys = ["title", "url", "description"]
                cleaned = {k: v for k, v in results[idx].items() if k in selected_keys}
                clean_response.append(cleaned)

        return {"query": query, "top_k": clean_response}
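
To see what the cleaning step does without calling the API, here is a minimal sketch that runs the same logic on a hand-written payload. The `sample` dictionary is hypothetical and mimics only the fields the code reads; a real Brave response contains many more. Unlike the method above, this copy also checks that the index is in range.

```python
from typing import Any, Dict


def clean_brave_response(search_response: Dict[str, Any], top_k: int = 3) -> Dict[str, Any]:
    """Standalone copy of the cleaning logic, for experimentation."""
    query = search_response.get("query", {}).get("original")
    cleaned = []
    # "mixed.main" lists ranked result slots; each slot names a section and an index.
    for slot in search_response.get("mixed", {}).get("main", [])[:top_k]:
        if slot["type"] != "web":
            continue  # skip non-web slots such as videos
        results = search_response.get("web", {}).get("results", [])
        if slot["index"] < len(results):  # extra bounds check, not in the original
            item = results[slot["index"]]
            cleaned.append({k: item[k] for k in ("title", "url", "description") if k in item})
    return {"query": query, "top_k": cleaned}


# Hypothetical payload: two web results and one video slot ranked between them.
sample = {
    "query": {"original": "llama stack"},
    "mixed": {"main": [
        {"type": "web", "index": 0},
        {"type": "videos", "index": 0},
        {"type": "web", "index": 1},
    ]},
    "web": {"results": [
        {"title": "A", "url": "https://example.com/a", "description": "first", "age": "1d"},
        {"title": "B", "url": "https://example.com/b", "description": "second"},
    ]},
    "videos": {"results": [{"title": "V"}]},
}

result = clean_brave_response(sample)
```

Because the slot list is truncated to `top_k` before non-web slots are filtered out, the result can contain fewer than `top_k` entries. Here it holds two.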


## Step 3: Create a Custom Tool Class

Here, we define the `WebSearchTool` class, which extends `ClientTool` to integrate the Brave Search API with Llama Stack, enabling web search within agent workflows. The class extracts the query from the model's tool call, uses the `BraveSearch` class to retrieve data, and formats the results as a numbered list with URLs so the agent can cite its sources.

In [5]:
class WebSearchTool(ClientTool):
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.engine = BraveSearch(api_key)

    def get_name(self) -> str:
        return "web_search"

    def get_description(self) -> str:
        return "Search the web for a given query"

    async def run_impl(self, query: str):
        return await self.engine.search(query)

    async def run(self, messages):
        query = None
        for message in messages:
            if isinstance(message, CompletionMessage) and message.tool_calls:
                for tool_call in message.tool_calls:
                    if "query" in tool_call.arguments:
                        query = tool_call.arguments["query"]
                        call_id = tool_call.call_id

        if query:
            search_result = await self.run_impl(query)
            return [
                ToolResponseMessage(
                    call_id=call_id,
                    role="ipython",
                    content=self._format_response_for_agent(search_result),
                    tool_name="brave_search",
                )
            ]

        return [
            ToolResponseMessage(
                call_id="no_call_id",
                role="ipython",
                content="No query provided.",
                tool_name="brave_search",
            )
        ]

    def _format_response_for_agent(self, search_result):
        parsed_result = json.loads(search_result)
        formatted_result = "Search Results with Citations:\n\n"
        for i, result in enumerate(parsed_result.get("top_k", []), start=1):
            formatted_result += (
                f"{i}. {result.get('title', 'No Title')}\n"
                f"   URL: {result.get('url', 'No URL')}\n"
                f"   Description: {result.get('description', 'No Description')}\n\n"
            )
        return formatted_result
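
The query-extraction loop in `run` can be tried in isolation. The sketch below uses small stand-in dataclasses (`FakeToolCall` and `FakeCompletionMessage`, both hypothetical and not `llama_stack_client` types), so it runs without a Llama Stack server. Like the loop above, it keeps the last tool call that carries a `query` argument.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class FakeToolCall:  # hypothetical stand-in for a tool call object
    call_id: str
    tool_name: str
    arguments: Dict[str, Any]


@dataclass
class FakeCompletionMessage:  # hypothetical stand-in for CompletionMessage
    content: str
    tool_calls: List[FakeToolCall] = field(default_factory=list)


def find_query(messages: List[Any]) -> Optional[Tuple[str, str]]:
    """Return (call_id, query) for the last tool call with a "query" argument."""
    found = None
    for message in messages:
        # Messages without tool calls contribute nothing.
        for tool_call in getattr(message, "tool_calls", None) or []:
            if "query" in tool_call.arguments:
                found = (tool_call.call_id, tool_call.arguments["query"])
    return found
```

Returning the `call_id` together with the query keeps the two values paired, which makes it harder to answer one tool call with the ID of another.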


## Step 4: Create a function to execute a search query and print the results

Now let's create the `execute_search` function, which initializes the `WebSearchTool`, runs a query asynchronously, and prints the formatted search results for easy viewing.

In [6]:
async def execute_search(query: str):
    web_search_tool = WebSearchTool(api_key=BRAVE_SEARCH_API_KEY)
    result = await web_search_tool.run_impl(query)
    print("Search Results:", result)
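
`BraveSearch.search` is declared `async`, but `requests.get` is a blocking call, so the event loop waits while each request is in flight. If you ever need to run several searches at once, one option is `asyncio.to_thread` (Python 3.9+). The sketch below shows the pattern with a hypothetical `blocking_fetch` function standing in for the HTTP call. It is not how the tutorial's class is implemented.

```python
import asyncio
import time


def blocking_fetch(query: str) -> str:
    """Hypothetical stand-in for a blocking HTTP call such as requests.get."""
    time.sleep(0.1)  # simulate network latency
    return f"results for {query!r}"


async def search_many(queries: list[str]) -> list[str]:
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the event loop stays free and the calls overlap.
    tasks = [asyncio.to_thread(blocking_fetch, q) for q in queries]
    # gather returns the results in the same order as the inputs.
    return await asyncio.gather(*tasks)
```

In a notebook, you could call it with `await search_many(["a", "b"])`.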


## Step 5: Run the search with an example query

In [7]:
query = "Latest developments in quantum computing"
asyncio.run(execute_search(query))


Search Results: {"query": "Latest developments in quantum computing", "top_k": [{"title": "Quantum Computing | Latest News, Photos & Videos | WIRED", "url": "https://www.wired.com/tag/quantum-computing/", "description": "Find the <strong>latest</strong> <strong>Quantum</strong> <strong>Computing</strong> news from WIRED. See related science and technology articles, photos, slideshows and videos."}, {"title": "Quantum Computing News -- ScienceDaily", "url": "https://www.sciencedaily.com/news/matter_energy/quantum_computing/", "description": "<strong>Quantum</strong> <strong>Computing</strong> News. Read the <strong>latest</strong> about the <strong>development</strong> <strong>of</strong> <strong>quantum</strong> <strong>computers</strong>."}]}
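
The descriptions in this output contain `<strong>` highlight tags. If you would rather pass plain text to the model, here is a minimal sketch of one way to strip them using only the standard library. The helper name `strip_markup` is hypothetical. The regular expression is deliberately naive: it would also remove text such as `< b >` in a description that is not actually markup.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <strong> or </strong>.
_TAG_RE = re.compile(r"<[^>]+>")


def strip_markup(text: str) -> str:
    """Remove HTML-style tags and unescape entities such as &amp;."""
    return html.unescape(_TAG_RE.sub("", text))
```

You could apply it to each `description` value before the results are formatted for the agent.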


## Step 6: Run the search tool using an agent

Here, we set up and run the `WebSearchTool` inside a Llama Stack agent so that it can handle user queries and generate responses. This involves initializing the client, configuring the agent with the tool, and processing a user prompt asynchronously to display the results.

In [15]:
async def run_main(disable_safety: bool = False):
    # Initialize the Llama Stack client with the specified base URL
    client = LlamaStackClient(
        base_url=f"http://{HOST}:{PORT}",
    )

    # Configure input and output shields for safety (use "llama_guard" by default)
    input_shields = [] if disable_safety else ["llama_guard"]
    output_shields = [] if disable_safety else ["llama_guard"]

    # Initialize custom tool (ensure `WebSearchTool` is defined earlier in the notebook)
    webSearchTool = WebSearchTool(api_key=BRAVE_SEARCH_API_KEY)

    # Create an agent instance with the client and configuration
    agent = Agent(
        client,
        model=MODEL_NAME,
        instructions="""You are a helpful assistant that responds to user queries with relevant information and cites sources when available.""",
        sampling_params={
            "strategy": {
                "type": "greedy",
            },
        },
        tools=[webSearchTool],
        input_shields=input_shields,
        output_shields=output_shields,
        enable_session_persistence=False,
    )

    # Create a session for interaction and print the session ID
    session_id = agent.create_session("test-session")
    print(f"Created session_id={session_id} for Agent({agent.agent_id})")

    response = agent.create_turn(
        messages=[
            {
                "role": "user",
                "content": """What are the latest developments in quantum computing?""",
            }
        ],
        session_id=session_id,  # Use the created session ID
    )

    # Log and print the response from the agent asynchronously
    async for log in EventLogger().log(response):
        log.print()


# Run the function asynchronously in a Jupyter Notebook cell
await run_main(disable_safety=True)


Created session_id=34d2978d-e299-4a2a-9219-4ffe2fb124a2 for Agent(8a68f2c3-2b2a-4f67-a355-c6d5b2451d6a)
inference> [web_search(query="latest developments in quantum computing")]
CustomTool> Search Results with Citations:

1. Quantum Computing | Latest News, Photos & Videos | WIRED
   URL: https://www.wired.com/tag/quantum-computing/
   Description: Find the <strong>latest</strong> <strong>Quantum</strong> <strong>Computing</strong> news from WIRED. See related science and technology articles, photos, slideshows and videos.

2. Quantum Computing News -- ScienceDaily
   URL: https://www.sciencedaily.com/news/matter_energy/quantum_computing/
   Description: <strong>Quantum</strong> <strong>Computing</strong> News. Read the <strong>latest</strong> about the <strong>development</strong> <strong>of</strong> <strong>quantum</strong> <strong>computers</strong>.

[