mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-07-30 07:39:38 +00:00

Commit ca95afb449: Merge branch 'docs_improvement' of github.com:meta-llama/llama-stack into docs_improvement

15 changed files with 1366 additions and 443 deletions
@ -1,111 +0,0 @@

# Getting Started with Llama Stack

This guide will walk you through the steps to set up an end-to-end workflow with Llama Stack. It focuses on building a Llama Stack distribution and starting up a Llama Stack server. See our [documentation](../README.md) for more on Llama Stack's capabilities, or visit [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main) for example apps.

## Installation

The `llama` CLI tool helps you manage the Llama toolchain & agentic systems. After installing the `llama-stack` package, the `llama` command should be available in your path.

You can install this repository in two ways:

1. **Install as a package**:
   Install directly from [PyPI](https://pypi.org/project/llama-stack/) with:
   ```bash
   pip install llama-stack
   ```

2. **Install from source**:
   Follow these steps to install from the source code:
   ```bash
   mkdir -p ~/local
   cd ~/local
   git clone git@github.com:meta-llama/llama-stack.git

   conda create -n stack python=3.10
   conda activate stack

   cd llama-stack
   $CONDA_PREFIX/bin/pip install -e .
   ```

Refer to the [CLI Reference](./cli_reference.md) for details on Llama CLI commands.

## Starting Up Llama Stack Server

There are two ways to start the Llama Stack server:

1. **Using Docker**:
   We provide a pre-built Docker image of Llama Stack, available in the [distributions](../distributions/) folder.

   > **Note:** For GPU inference, set environment variables to specify the local directory with your model checkpoints and enable GPU inference.
   ```bash
   export LLAMA_CHECKPOINT_DIR=~/.llama
   ```
   Download Llama models with:
   ```bash
   llama download --model-id Llama3.1-8B-Instruct
   ```
   Start a Docker container with:
   ```bash
   cd llama-stack/distributions/meta-reference-gpu
   docker run -it -p 5000:5000 -v ~/.llama:/root/.llama -v ./run.yaml:/root/my-run.yaml --gpus=all distribution-meta-reference-gpu --yaml_config /root/my-run.yaml
   ```

   **Tip:** For remote providers, use `docker compose up` with scripts in the [distributions folder](../distributions/).

2. **Build->Configure->Run via Conda**:
   For development, build a LlamaStack distribution from scratch.

   **`llama stack build`**
   Enter build information interactively:
   ```bash
   llama stack build
   ```

   **`llama stack configure`**
   Run `llama stack configure <name>` using the name from the build step.
   ```bash
   llama stack configure my-local-stack
   ```

   **`llama stack run`**
   Start the server with:
   ```bash
   llama stack run my-local-stack
   ```

## Testing with Client

After setup, test the server with a client:

```bash
cd /path/to/llama-stack
conda activate <env>

python -m llama_stack.apis.inference.client localhost 5000
```

You can also send a POST request:

```bash
curl http://localhost:5000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2-sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}'
```

For testing safety, run:

```bash
python -m llama_stack.apis.safety.client localhost 5000
```

Check our client SDKs for various languages: [Python](https://github.com/meta-llama/llama-stack-client-python), [Node](https://github.com/meta-llama/llama-stack-client-node), [Swift](https://github.com/meta-llama/llama-stack-client-swift), and [Kotlin](https://github.com/meta-llama/llama-stack-client-kotlin).
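
As a quick illustration, the Python SDK mirrors the curl request above. This is a minimal sketch — it assumes you have installed `llama-stack-client` and that the server from this guide is running on `localhost:5000`:

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types import SystemMessage, UserMessage

# Point the client at the server started earlier in this guide.
client = LlamaStackClient(base_url="http://localhost:5000")

response = client.inference.chat_completion(
    messages=[
        SystemMessage(content="You are a helpful assistant.", role="system"),
        UserMessage(content="Write me a 2-sentence poem about the moon", role="user"),
    ],
    model="Llama3.1-8B-Instruct",
)
print(response.completion_message.content)
```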

## Advanced Guides

For more on custom Llama Stack distributions, refer to our [Building a Llama Stack Distribution](./building_distro.md) guide.
247 docs/zero_to_hero_guide/00_Inference101.ipynb Normal file
@ -0,0 +1,247 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c1e7571c",
"metadata": {},
"source": [
"# Llama Stack Inference Guide\n",
"\n",
"This document provides instructions on how to use Llama Stack's `chat_completion` function for generating text using the `Llama3.2-11B-Vision-Instruct` model. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/).\n",
"\n",
"### Table of Contents\n",
"1. [Quickstart](#quickstart)\n",
"2. [Building Effective Prompts](#building-effective-prompts)\n",
"3. [Conversation Loop](#conversation-loop)\n",
"4. [Conversation History](#conversation-history)\n",
"5. [Streaming Responses](#streaming-responses)\n"
]
},
{
"cell_type": "markdown",
"id": "414301dc",
"metadata": {},
"source": [
"## Quickstart\n",
"\n",
"This section walks through each step to set up and make a simple text generation request.\n",
"\n",
"### 1. Set Up the Client\n",
"\n",
"Begin by importing the necessary components from Llama Stack’s client library:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7a573752",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.types import SystemMessage, UserMessage\n",
"\n",
"client = LlamaStackClient(base_url='http://localhost:5000')"
]
},
{
"cell_type": "markdown",
"id": "86366383",
"metadata": {},
"source": [
"### 2. Create a Chat Completion Request\n",
"\n",
"Use the `chat_completion` function to define the conversation context. Each message you include should have a specific role and content:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77c29dba",
"metadata": {},
"outputs": [],
"source": [
"response = client.inference.chat_completion(\n",
"    messages=[\n",
"        SystemMessage(content='You are a friendly assistant.', role='system'),\n",
"        UserMessage(content='Write a two-sentence poem about llama.', role='user')\n",
"    ],\n",
"    model='Llama3.2-11B-Vision-Instruct',\n",
")\n",
"\n",
"print(response.completion_message.content)"
]
},
{
"cell_type": "markdown",
"id": "e5f16949",
"metadata": {},
"source": [
"## Building Effective Prompts\n",
"\n",
"Effective prompt creation (often called 'prompt engineering') is essential for quality responses. Here are best practices for structuring your prompts to get the most out of the Llama Stack model:\n",
"\n",
"1. **System Messages**: Use `SystemMessage` to set the model's behavior. This is similar to providing top-level instructions for tone, format, or specific behavior.\n",
"   - **Example**: `SystemMessage(content='You are a friendly assistant that explains complex topics simply.')`\n",
"2. **User Messages**: Define the task or question you want to ask the model with a `UserMessage`. The clearer and more direct you are, the better the response.\n",
"   - **Example**: `UserMessage(content='Explain recursion in programming in simple terms.')`\n",
"\n",
"### Sample Prompt"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5c6812da",
"metadata": {},
"outputs": [],
"source": [
"response = client.inference.chat_completion(\n",
"    messages=[\n",
"        SystemMessage(content='You are shakespeare.', role='system'),\n",
"        UserMessage(content='Write a two-sentence poem about llama.', role='user')\n",
"    ],\n",
"    model='Llama3.2-11B-Vision-Instruct',\n",
")\n",
"\n",
"print(response.completion_message.content)"
]
},
{
"cell_type": "markdown",
"id": "c8690ef0",
"metadata": {},
"source": [
"## Conversation Loop\n",
"\n",
"To create a continuous conversation loop, where users can input multiple messages in a session, use the following structure. This example runs an asynchronous loop, ending when the user types 'exit,' 'quit,' or 'bye.'"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "02211625",
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.types import UserMessage\n",
"from termcolor import cprint\n",
"\n",
"client = LlamaStackClient(base_url='http://localhost:5000')\n",
"\n",
"async def chat_loop():\n",
"    while True:\n",
"        user_input = input('User> ')\n",
"        if user_input.lower() in ['exit', 'quit', 'bye']:\n",
"            cprint('Ending conversation. Goodbye!', 'yellow')\n",
"            break\n",
"\n",
"        message = UserMessage(content=user_input, role='user')\n",
"        response = client.inference.chat_completion(\n",
"            messages=[message],\n",
"            model='Llama3.2-11B-Vision-Instruct',\n",
"        )\n",
"        cprint(f'> Response: {response.completion_message.content}', 'cyan')\n",
"\n",
"asyncio.run(chat_loop())"
]
},
{
"cell_type": "markdown",
"id": "8cf0d555",
"metadata": {},
"source": [
"## Conversation History\n",
"\n",
"Maintaining a conversation history allows the model to retain context from previous interactions. Use a list to accumulate messages, enabling continuity throughout the chat session."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9496f75c",
"metadata": {},
"outputs": [],
"source": [
"async def chat_loop():\n",
"    conversation_history = []\n",
"    while True:\n",
"        user_input = input('User> ')\n",
"        if user_input.lower() in ['exit', 'quit', 'bye']:\n",
"            cprint('Ending conversation. Goodbye!', 'yellow')\n",
"            break\n",
"\n",
"        user_message = UserMessage(content=user_input, role='user')\n",
"        conversation_history.append(user_message)\n",
"\n",
"        response = client.inference.chat_completion(\n",
"            messages=conversation_history,\n",
"            model='Llama3.2-11B-Vision-Instruct',\n",
"        )\n",
"        cprint(f'> Response: {response.completion_message.content}', 'cyan')\n",
"\n",
"        # NOTE: the assistant's reply is appended as a UserMessage here, a\n",
"        # workaround matching the types imported above; swap in an\n",
"        # assistant-typed message if your client version provides one.\n",
"        assistant_message = UserMessage(content=response.completion_message.content, role='user')\n",
"        conversation_history.append(assistant_message)\n",
"\n",
"asyncio.run(chat_loop())"
]
},
{
"cell_type": "markdown",
"id": "03fcf5e0",
"metadata": {},
"source": [
"## Streaming Responses\n",
"\n",
"Llama Stack offers a `stream` parameter in the `chat_completion` function, which allows partial responses to be returned progressively as they are generated. This can enhance user experience by providing immediate feedback without waiting for the entire response to be processed.\n",
"\n",
"### Example: Streaming Responses"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d119026e",
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.lib.inference.event_logger import EventLogger\n",
"from llama_stack_client.types import UserMessage\n",
"from termcolor import cprint\n",
"\n",
"async def run_main(stream: bool = True):\n",
"    client = LlamaStackClient(base_url='http://localhost:5000')\n",
"\n",
"    message = UserMessage(\n",
"        content='hello world, write me a 2 sentence poem about the moon', role='user'\n",
"    )\n",
"    cprint(f'User> {message.content}', 'green')\n",
"\n",
"    response = client.inference.chat_completion(\n",
"        messages=[message],\n",
"        model='Llama3.2-11B-Vision-Instruct',\n",
"        stream=stream,\n",
"    )\n",
"\n",
"    if not stream:\n",
"        cprint(f'> Response: {response}', 'cyan')\n",
"    else:\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()\n",
"\n",
"    models_response = client.models.list()\n",
"    print(models_response)\n",
"\n",
"if __name__ == '__main__':\n",
"    asyncio.run(run_main())"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
201 docs/zero_to_hero_guide/00_Local_Cloud_Inference101.ipynb Normal file
@ -0,0 +1,201 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a0ed972d",
"metadata": {},
"source": [
"# Switching between Local and Cloud Model with Llama Stack\n",
"\n",
"This guide provides a streamlined setup to switch between local and cloud clients for text generation with Llama Stack’s `chat_completion` API. This setup enables automatic fallback to a cloud instance if the local client is unavailable.\n",
"\n",
"### Prerequisites\n",
"Before you begin, please ensure Llama Stack is installed and the distribution is set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/). You will need to run two distributions, a local and a cloud distribution, for this demo to work.\n",
"\n",
"### Implementation"
]
},
{
"cell_type": "markdown",
"id": "df89cff7",
"metadata": {},
"source": [
"#### 1. Set Up Local and Cloud Clients\n",
"\n",
"Initialize both clients, specifying the `base_url` for each instance. In this case, we have the local distribution running on `http://localhost:5000` and the cloud distribution running on `http://localhost:5001`.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7f868dfe",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"\n",
"# Configure local and cloud clients\n",
"local_client = LlamaStackClient(base_url='http://localhost:5000')\n",
"cloud_client = LlamaStackClient(base_url='http://localhost:5001')"
]
},
{
"cell_type": "markdown",
"id": "894689c1",
"metadata": {},
"source": [
"#### 2. Client Selection with Fallback\n",
"\n",
"The `select_client` function checks if the local client is available using a lightweight `/health` check. If the local client is unavailable, it automatically switches to the cloud client.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ff0c8277",
"metadata": {},
"outputs": [],
"source": [
"import httpx\n",
"from termcolor import cprint\n",
"\n",
"async def select_client() -> LlamaStackClient:\n",
"    \"\"\"Use local client if available; otherwise, switch to cloud client.\"\"\"\n",
"    try:\n",
"        async with httpx.AsyncClient() as http_client:\n",
"            response = await http_client.get(f'{local_client.base_url}/health')\n",
"            if response.status_code == 200:\n",
"                cprint('Using local client.', 'yellow')\n",
"                return local_client\n",
"    except httpx.RequestError:\n",
"        pass\n",
"    cprint('Local client unavailable. Switching to cloud client.', 'yellow')\n",
"    return cloud_client"
]
},
{
"cell_type": "markdown",
"id": "9ccfe66f",
"metadata": {},
"source": [
"#### 3. Generate a Response\n",
"\n",
"After selecting the client, you can generate text using `chat_completion`. This example sends a sample prompt to the model and prints the response.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e19cc20",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client.lib.inference.event_logger import EventLogger  # needed for streaming below\n",
"from llama_stack_client.types import UserMessage\n",
"\n",
"async def get_llama_response(stream: bool = True):\n",
"    client = await select_client()  # Selects the available client\n",
"    message = UserMessage(content='hello world, write me a 2 sentence poem about the moon', role='user')\n",
"    cprint(f'User> {message.content}', 'green')\n",
"\n",
"    response = client.inference.chat_completion(\n",
"        messages=[message],\n",
"        model='Llama3.2-11B-Vision-Instruct',\n",
"        stream=stream,\n",
"    )\n",
"\n",
"    if not stream:\n",
"        cprint(f'> Response: {response}', 'cyan')\n",
"    else:\n",
"        # Stream tokens progressively\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()"
]
},
{
"cell_type": "markdown",
"id": "6edf5e57",
"metadata": {},
"source": [
"#### 4. Run the Asynchronous Response Generation\n",
"\n",
"Use `asyncio.run()` to execute `get_llama_response` in an asynchronous event loop.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c10f487e",
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"\n",
"# Initiate the response generation process\n",
"asyncio.run(get_llama_response())"
]
},
{
"cell_type": "markdown",
"id": "56aa9a09",
"metadata": {},
"source": [
"### Complete code\n",
"Summing it up, here's the complete code for local-cloud model implementation with Llama Stack:\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d9fd74ff",
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"import httpx\n",
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.lib.inference.event_logger import EventLogger\n",
"from llama_stack_client.types import UserMessage\n",
"from termcolor import cprint\n",
"\n",
"local_client = LlamaStackClient(base_url='http://localhost:5000')\n",
"cloud_client = LlamaStackClient(base_url='http://localhost:5001')\n",
"\n",
"async def select_client() -> LlamaStackClient:\n",
"    try:\n",
"        async with httpx.AsyncClient() as http_client:\n",
"            response = await http_client.get(f'{local_client.base_url}/health')\n",
"            if response.status_code == 200:\n",
"                cprint('Using local client.', 'yellow')\n",
"                return local_client\n",
"    except httpx.RequestError:\n",
"        pass\n",
"    cprint('Local client unavailable. Switching to cloud client.', 'yellow')\n",
"    return cloud_client\n",
"\n",
"async def get_llama_response(stream: bool = True):\n",
"    client = await select_client()\n",
"    message = UserMessage(\n",
"        content='hello world, write me a 2 sentence poem about the moon', role='user'\n",
"    )\n",
"    cprint(f'User> {message.content}', 'green')\n",
"\n",
"    response = client.inference.chat_completion(\n",
"        messages=[message],\n",
"        model='Llama3.2-11B-Vision-Instruct',\n",
"        stream=stream,\n",
"    )\n",
"\n",
"    if not stream:\n",
"        cprint(f'> Response: {response}', 'cyan')\n",
"    else:\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()\n",
"\n",
"asyncio.run(get_llama_response())"
]
}
],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 5
}
@ -1,318 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tool Calling"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this section, we'll explore how to enhance your applications with tool calling capabilities. We'll cover:\n",
"1. Setting up and using the Brave Search API\n",
"2. Creating custom tools\n",
"3. Configuring tool prompts and safety settings"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: llama-stack-client in ./.conda/envs/quick/lib/python3.13/site-packages (0.0.48)\n",
"Requirement already satisfied: anyio<5,>=3.5.0 in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (4.6.2.post1)\n",
"Requirement already satisfied: distro<2,>=1.7.0 in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (1.9.0)\n",
"Requirement already satisfied: httpx<1,>=0.23.0 in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (0.27.2)\n",
"Requirement already satisfied: pydantic<3,>=1.9.0 in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (2.9.2)\n",
"Requirement already satisfied: sniffio in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (1.3.1)\n",
"Requirement already satisfied: tabulate>=0.9.0 in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (0.9.0)\n",
"Requirement already satisfied: typing-extensions<5,>=4.7 in ./.conda/envs/quick/lib/python3.13/site-packages (from llama-stack-client) (4.12.2)\n",
"Requirement already satisfied: idna>=2.8 in ./.conda/envs/quick/lib/python3.13/site-packages (from anyio<5,>=3.5.0->llama-stack-client) (3.10)\n",
"Requirement already satisfied: certifi in ./.conda/envs/quick/lib/python3.13/site-packages (from httpx<1,>=0.23.0->llama-stack-client) (2024.8.30)\n",
"Requirement already satisfied: httpcore==1.* in ./.conda/envs/quick/lib/python3.13/site-packages (from httpx<1,>=0.23.0->llama-stack-client) (1.0.6)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in ./.conda/envs/quick/lib/python3.13/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->llama-stack-client) (0.14.0)\n",
"Requirement already satisfied: annotated-types>=0.6.0 in ./.conda/envs/quick/lib/python3.13/site-packages (from pydantic<3,>=1.9.0->llama-stack-client) (0.7.0)\n",
"Requirement already satisfied: pydantic-core==2.23.4 in ./.conda/envs/quick/lib/python3.13/site-packages (from pydantic<3,>=1.9.0->llama-stack-client) (2.23.4)\n"
]
}
],
"source": [
"!pip install llama-stack-client --upgrade"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'Agent' is not defined",
"output_type": "error",
"traceback": [
"---------------------------------------------------------------------------",
"NameError                                 Traceback (most recent call last)",
"Cell In[4], line 23\n     15 load_dotenv()\n     17 # Helper function to create an agent with tools\n     18 async def create_tool_agent(\n     19     client: LlamaStackClient,\n     20     tools: List[Dict],\n     21     instructions: str = \"You are a helpful assistant\",\n     22     model: str = \"Llama3.1-8B-Instruct\",\n---> 23 ) -> Agent:\n     24     \"\"\"Create an agent with specified tools.\"\"\"\n     25     agent_config = AgentConfig(\n     26         model=model,\n     27         instructions=instructions,\n    (...)\n     38         enable_session_persistence=True,\n     39     )\n",
"NameError: name 'Agent' is not defined"
]
}
],
"source": [
"import asyncio\n",
"import os\n",
"from typing import Dict, List, Optional\n",
"from dotenv import load_dotenv\n",
"\n",
"from llama_stack_client import LlamaStackClient\n",
"#from llama_stack_client.lib.agents.agent import Agent\n",
"from llama_stack_client.lib.agents.event_logger import EventLogger\n",
"from llama_stack_client.types.agent_create_params import (\n",
"    AgentConfig,\n",
"    AgentConfigToolSearchToolDefinition,\n",
")\n",
"\n",
"# Load environment variables\n",
"load_dotenv()\n",
"\n",
"# Helper function to create an agent with tools\n",
"async def create_tool_agent(\n",
"    client: LlamaStackClient,\n",
"    tools: List[Dict],\n",
"    instructions: str = \"You are a helpful assistant\",\n",
"    model: str = \"Llama3.1-8B-Instruct\",\n",
") -> Agent:\n",
"    \"\"\"Create an agent with specified tools.\"\"\"\n",
"    agent_config = AgentConfig(\n",
"        model=model,\n",
"        instructions=instructions,\n",
"        sampling_params={\n",
"            \"strategy\": \"greedy\",\n",
"            \"temperature\": 1.0,\n",
"            \"top_p\": 0.9,\n",
"        },\n",
"        tools=tools,\n",
"        tool_choice=\"auto\",\n",
"        tool_prompt_format=\"json\",\n",
"        input_shields=[\"llama_guard\"],\n",
"        output_shields=[\"llama_guard\"],\n",
"        enable_session_persistence=True,\n",
"    )\n",
"\n",
"    return Agent(client, agent_config)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, create a `.env` file in your notebook directory with your Brave Search API key:\n",
"\n",
"```\n",
"BRAVE_SEARCH_API_KEY=your_key_here\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"async def create_search_agent(client: LlamaStackClient) -> Agent:\n",
"    \"\"\"Create an agent with Brave Search capability.\"\"\"\n",
"    search_tool = AgentConfigToolSearchToolDefinition(\n",
"        type=\"brave_search\",\n",
"        engine=\"brave\",\n",
"        api_key=os.getenv(\"BRAVE_SEARCH_API_KEY\"),\n",
"    )\n",
"\n",
"    return await create_tool_agent(\n",
"        client=client,\n",
"        tools=[search_tool],\n",
"        instructions=\"\"\"\n",
"        You are a research assistant that can search the web.\n",
"        Always cite your sources with URLs when providing information.\n",
"        Format your responses as:\n",
"\n",
"        FINDINGS:\n",
"        [Your summary here]\n",
"\n",
"        SOURCES:\n",
"        - [Source title](URL)\n",
"        \"\"\"\n",
"    )\n",
"\n",
"# Example usage\n",
"async def search_example():\n",
"    client = LlamaStackClient(base_url=\"http://localhost:8000\")\n",
"    agent = await create_search_agent(client)\n",
"\n",
"    # Create a session\n",
"    session_id = agent.create_session(\"search-session\")\n",
"\n",
"    # Example queries\n",
"    queries = [\n",
"        \"What are the latest developments in quantum computing?\",\n",
"        \"Who won the most recent Super Bowl?\",\n",
"    ]\n",
"\n",
"    for query in queries:\n",
"        print(f\"\\nQuery: {query}\")\n",
"        print(\"-\" * 50)\n",
"\n",
"        response = agent.create_turn(\n",
"            messages=[{\"role\": \"user\", \"content\": query}],\n",
"            session_id=session_id,\n",
"        )\n",
"\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()\n",
"\n",
"# Run the example (in Jupyter, use asyncio.run())\n",
"await search_example()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Custom Tool Creation\n",
"\n",
"Let's create a custom weather tool:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from typing import TypedDict, Optional\n",
"from datetime import datetime\n",
"\n",
"# Define tool types\n",
"class WeatherInput(TypedDict):\n",
"    location: str\n",
"    date: Optional[str]\n",
"\n",
"class WeatherOutput(TypedDict):\n",
"    temperature: float\n",
"    conditions: str\n",
"    humidity: float\n",
"\n",
"class WeatherTool:\n",
"    \"\"\"Example custom tool for weather information.\"\"\"\n",
"\n",
"    def __init__(self, api_key: Optional[str] = None):\n",
"        self.api_key = api_key\n",
"\n",
"    async def get_weather(self, location: str, date: Optional[str] = None) -> WeatherOutput:\n",
"        \"\"\"Simulate getting weather data (replace with actual API call).\"\"\"\n",
"        # Mock implementation\n",
"        return {\n",
"            \"temperature\": 72.5,\n",
"            \"conditions\": \"partly cloudy\",\n",
"            \"humidity\": 65.0\n",
"        }\n",
"\n",
"    async def __call__(self, input_data: WeatherInput) -> WeatherOutput:\n",
"        \"\"\"Make the tool callable with structured input.\"\"\"\n",
"        return await self.get_weather(\n",
"            location=input_data[\"location\"],\n",
"            date=input_data.get(\"date\")\n",
"        )\n",
"\n",
"async def create_weather_agent(client: LlamaStackClient) -> Agent:\n",
"    \"\"\"Create an agent with weather tool capability.\"\"\"\n",
"    weather_tool = {\n",
"        \"type\": \"function\",\n",
"        \"function\": {\n",
"            \"name\": \"get_weather\",\n",
"            \"description\": \"Get weather information for a location\",\n",
"            \"parameters\": {\n",
"                \"type\": \"object\",\n",
"                \"properties\": {\n",
"                    \"location\": {\n",
"                        \"type\": \"string\",\n",
"                        \"description\": \"City or location name\"\n",
"                    },\n",
"                    \"date\": {\n",
"                        \"type\": \"string\",\n",
"                        \"description\": \"Optional date (YYYY-MM-DD)\",\n",
"                        \"format\": \"date\"\n",
"                    }\n",
"                },\n",
"                \"required\": [\"location\"]\n",
"            }\n",
"        },\n",
"        \"implementation\": WeatherTool()\n",
"    }\n",
"\n",
"    return await create_tool_agent(\n",
"        client=client,\n",
"        tools=[weather_tool],\n",
"        instructions=\"\"\"\n",
"        You are a weather assistant that can provide weather information.\n",
"        Always specify the location clearly in your responses.\n",
"        Include both temperature and conditions in your summaries.\n",
"        \"\"\"\n",
"    )\n",
"\n",
"# Example usage\n",
"async def weather_example():\n",
"    client = LlamaStackClient(base_url=\"http://localhost:8000\")\n",
"    agent = await create_weather_agent(client)\n",
"\n",
"    session_id = agent.create_session(\"weather-session\")\n",
"\n",
"    queries = [\n",
"        \"What's the weather like in San Francisco?\",\n",
"        \"Tell me the weather in Tokyo tomorrow\",\n",
"    ]\n",
"\n",
"    for query in queries:\n",
"        print(f\"\\nQuery: {query}\")\n",
"        print(\"-\" * 50)\n",
"\n",
"        response = agent.create_turn(\n",
"            messages=[{\"role\": \"user\", \"content\": query}],\n",
"            session_id=session_id,\n",
"        )\n",
"\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()\n",
"\n",
"# Run the example\n",
"await weather_example()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
349 docs/zero_to_hero_guide/03_Tool_Calling101.ipynb Normal file
@ -0,0 +1,349 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tool Calling"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this section, we'll explore how to enhance your applications with tool calling capabilities. We'll cover:\n",
"1. Setting up and using the Brave Search API\n",
"2. Creating custom tools\n",
"3. Configuring tool prompts and safety settings"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"import asyncio\n",
"import os\n",
"from typing import Dict, List, Optional\n",
"from dotenv import load_dotenv\n",
"\n",
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.lib.agents.agent import Agent\n",
"from llama_stack_client.lib.agents.event_logger import EventLogger\n",
"from llama_stack_client.types.agent_create_params import (\n",
"    AgentConfig,\n",
"    AgentConfigToolSearchToolDefinition,\n",
")\n",
"\n",
"# Load environment variables\n",
"load_dotenv()\n",
"\n",
"# Helper function to create an agent with tools\n",
"async def create_tool_agent(\n",
"    client: LlamaStackClient,\n",
"    tools: List[Dict],\n",
"    instructions: str = \"You are a helpful assistant\",\n",
"    model: str = \"Llama3.1-8B-Instruct\",\n",
") -> Agent:\n",
"    \"\"\"Create an agent with specified tools.\"\"\"\n",
"    agent_config = AgentConfig(\n",
"        model=model,\n",
"        instructions=instructions,\n",
"        sampling_params={\n",
"            \"strategy\": \"greedy\",\n",
"            \"temperature\": 1.0,\n",
"            \"top_p\": 0.9,\n",
"        },\n",
"        tools=tools,\n",
"        tool_choice=\"auto\",\n",
"        tool_prompt_format=\"json\",\n",
"        enable_session_persistence=True,\n",
"    )\n",
"\n",
"    return Agent(client, agent_config)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, create a `.env` file in your notebook directory with your Brave Search API key:\n",
"\n",
"```\n",
"BRAVE_SEARCH_API_KEY=your_key_here\n",
"```\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Query: What are the latest developments in quantum computing?\n",
"--------------------------------------------------\n",
"inference> FINDINGS:\n",
"The latest developments in quantum computing include advancements in quantum processors, quantum algorithms, and quantum error correction. Researchers have made significant progress in developing more powerful and reliable quantum computers, with some companies already showcasing 100-qubit and 127-qubit quantum processors (IBM, 2022; Google, 2022). These advancements have led to breakthroughs in quantum simulations, machine learning, and optimization problems (Bharti, 2022; Zhang, 2022). Additionally, there have been significant improvements in quantum error correction, which is essential for large-scale quantum computing (Gottesman, 2022).\n",
"\n",
"SOURCES:\n",
"- IBM Quantum: \"Quantum Processors\" (https://www.ibm.com/quantum/computer/)\n",
"- Google Quantum AI Lab: \"Quantum Processors\" (https://quantumai.google/alphabet/sub-page-1/)\n",
"- Bharti, K. (2022). \"Quantum Computing: A Review of Recent Advances.\" Journal of Physics: Conference Series, 2185(1), 012001. (https://iopscience.iop.org/article/10.1088/1742-6596/2185/1/012001)\n",
"- Zhang, Y. (2022). \"Quantum Algorithms for Machine Learning.\" Journal of Machine Learning Research, 23, 1-36. (https://jmlr.org/papers/v23/20-065.html)\n",
"- Gottesman, D. (2022). \"Quantum Error Correction.\" In Encyclopedia of Complexity and Systems Science (pp. 1-13). Springer, New York, NY. (https://link.springer.com/referenceworkentry/10.1007/978-0-387-75888-6_447)\n"
]
}
],
"source": [
"async def create_search_agent(client: LlamaStackClient) -> Agent:\n",
"    \"\"\"Create an agent with Brave Search capability.\"\"\"\n",
"    search_tool = AgentConfigToolSearchToolDefinition(\n",
"        type=\"brave_search\",\n",
"        engine=\"brave\",\n",
"        api_key=\"dummy_value\",  # os.getenv(\"BRAVE_SEARCH_API_KEY\")\n",
"    )\n",
"\n",
"    return await create_tool_agent(\n",
"        client=client,\n",
"        tools=[search_tool],\n",
"        instructions=\"\"\"\n",
"        You are a research assistant that can search the web.\n",
"        Always cite your sources with URLs when providing information.\n",
"        Format your responses as:\n",
"\n",
"        FINDINGS:\n",
"        [Your summary here]\n",
"\n",
"        SOURCES:\n",
"        - [Source title](URL)\n",
"        \"\"\"\n",
"    )\n",
"\n",
"# Example usage\n",
"async def search_example():\n",
"    client = LlamaStackClient(base_url=\"http://localhost:5001\")\n",
"    agent = await create_search_agent(client)\n",
"\n",
"    # Create a session\n",
"    session_id = agent.create_session(\"search-session\")\n",
"\n",
"    # Example queries\n",
"    queries = [\n",
"        \"What are the latest developments in quantum computing?\",\n",
"        #\"Who won the most recent Super Bowl?\",\n",
"    ]\n",
"\n",
"    for query in queries:\n",
"        print(f\"\\nQuery: {query}\")\n",
"        print(\"-\" * 50)\n",
"\n",
"        response = agent.create_turn(\n",
"            messages=[{\"role\": \"user\", \"content\": query}],\n",
"            session_id=session_id,\n",
"        )\n",
"\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()\n",
"\n",
"# Run the example (in Jupyter, use asyncio.run())\n",
"await search_example()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Custom Tool Creation\n",
"\n",
"Let's create a custom weather tool:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Query: What's the weather like in San Francisco?\n",
"--------------------------------------------------\n",
"inference> {\n",
"    \"type\": \"function\",\n",
"    \"name\": \"get_weather\",\n",
"    \"parameters\": {\n",
"        \"location\": \"San Francisco\"\n",
"    }\n",
"}\n"
]
},
{
"ename": "AttributeError",
"evalue": "'WeatherTool' object has no attribute 'run'",
"output_type": "error",
"traceback": [
"---------------------------------------------------------------------------",
"AttributeError                            Traceback (most recent call last)",
"Cell In[27], line 113\n    110 nest_asyncio.apply()\n    112 # Run the example\n--> 113 await weather_example()\n",
"Cell In[27], line 105, in weather_example()\n     98     print(\"-\" * 50)\n    100     response = agent.create_turn(\n    101         messages=[{\"role\": \"user\", \"content\": query}],\n    102         session_id=session_id,\n    103     )\n--> 105     async for log in EventLogger().log(response):\n    106         log.print()\n",
"File ~/new_task/llama-stack-client-python/src/llama_stack_client/lib/agents/event_logger.py:55, in EventLogger.log(self, event_generator)\n     52 previous_event_type = None\n     53 previous_step_type = None\n---> 55 async for chunk in event_generator:\n     56     if not hasattr(chunk, \"event\"):\n     57         # Need to check for custom tool first\n     58         # since it does not produce event but instead\n     59         # a Message\n     60         if isinstance(chunk, ToolResponseMessage):\n",
"File ~/new_task/llama-stack-client-python/src/llama_stack_client/lib/agents/agent.py:76, in Agent.create_turn(self, messages, attachments, session_id)\n     74 else:\n     75     tool = self.custom_tools[tool_call.tool_name]\n---> 76     result_messages = await self.execute_custom_tool(tool, message)\n     77     next_message = result_messages[0]\n     79 yield next_message\n",
"File ~/new_task/llama-stack-client-python/src/llama_stack_client/lib/agents/agent.py:84, in Agent.execute_custom_tool(self, tool, message)\n     81 async def execute_custom_tool(\n     82     self, tool: CustomTool, message: Union[UserMessage, ToolResponseMessage]\n     83 ) -> List[Union[UserMessage, ToolResponseMessage]]:\n---> 84     result_messages = await tool.run([message])\n     85     return result_messages\n",
"AttributeError: 'WeatherTool' object has no attribute 'run'"
]
}
],
"source": [
"from typing import TypedDict, Optional, Dict, Any\n",
"from datetime import datetime\n",
"\n",
"# NB: this cell assumes ToolParamDefinitionParam has already been imported\n",
"# from the client library.\n",
"class WeatherTool:\n",
"    \"\"\"Example custom tool for weather information.\"\"\"\n",
"\n",
"    def get_name(self) -> str:\n",
"        return \"get_weather\"\n",
"\n",
"    def get_description(self) -> str:\n",
"        return \"Get weather information for a location\"\n",
"\n",
"    def get_params_definition(self) -> Dict[str, ToolParamDefinitionParam]:\n",
"        return {\n",
"            \"location\": ToolParamDefinitionParam(\n",
"                param_type=\"str\",\n",
"                description=\"City or location name\",\n",
"                required=True\n",
"            ),\n",
"            \"date\": ToolParamDefinitionParam(\n",
"                param_type=\"str\",\n",
"                description=\"Optional date (YYYY-MM-DD)\",\n",
"                required=False\n",
"            )\n",
"        }\n",
"\n",
"    async def run_impl(self, location: str, date: Optional[str] = None) -> Dict[str, Any]:\n",
"        \"\"\"Simulate getting weather data (replace with actual API call).\"\"\"\n",
"        # Mock implementation\n",
"        return {\n",
"            \"temperature\": 72.5,\n",
"            \"conditions\": \"partly cloudy\",\n",
"            \"humidity\": 65.0\n",
"        }\n",
"\n",
"async def create_weather_agent(client: LlamaStackClient) -> Agent:\n",
"    \"\"\"Create an agent with weather tool capability.\"\"\"\n",
"    agent_config = AgentConfig(\n",
"        model=\"Llama3.1-8B-Instruct\",\n",
"        instructions=\"\"\"\n",
"        You are a weather assistant that can provide weather information.\n",
"        Always specify the location clearly in your responses.\n",
"        Include both temperature and conditions in your summaries.\n",
"        \"\"\",\n",
"        sampling_params={\n",
"            \"strategy\": \"greedy\",\n",
"            \"temperature\": 1.0,\n",
"            \"top_p\": 0.9,\n",
"        },\n",
"        tools=[\n",
"            {\n",
"                \"function_name\": \"get_weather\",\n",
"                \"description\": \"Get weather information for a location\",\n",
"                \"parameters\": {\n",
"                    \"location\": {\n",
"                        \"param_type\": \"str\",\n",
"                        \"description\": \"City or location name\",\n",
"                        \"required\": True,\n",
"                    },\n",
"                    \"date\": {\n",
"                        \"param_type\": \"str\",\n",
"                        \"description\": \"Optional date (YYYY-MM-DD)\",\n",
"                        \"required\": False,\n",
"                    },\n",
"                },\n",
"                \"type\": \"function_call\",\n",
"            }\n",
"        ],\n",
"        tool_choice=\"auto\",\n",
"        tool_prompt_format=\"json\",\n",
"        input_shields=[],\n",
"        output_shields=[],\n",
"        enable_session_persistence=True\n",
"    )\n",
"\n",
"    # Create the agent with the tool\n",
"    weather_tool = WeatherTool()\n",
"    agent = Agent(\n",
"        client=client,\n",
"        agent_config=agent_config,\n",
"        custom_tools=[weather_tool]\n",
"    )\n",
"\n",
"    return agent\n",
"\n",
"# Example usage\n",
"async def weather_example():\n",
"    client = LlamaStackClient(base_url=\"http://localhost:5001\")\n",
"    agent = await create_weather_agent(client)\n",
"    session_id = agent.create_session(\"weather-session\")\n",
"\n",
"    queries = [\n",
"        \"What's the weather like in San Francisco?\",\n",
"        \"Tell me the weather in Tokyo tomorrow\",\n",
"    ]\n",
"\n",
"    for query in queries:\n",
"        print(f\"\\nQuery: {query}\")\n",
"        print(\"-\" * 50)\n",
"\n",
"        response = agent.create_turn(\n",
"            messages=[{\"role\": \"user\", \"content\": query}],\n",
"            session_id=session_id,\n",
"        )\n",
"\n",
"        async for log in EventLogger().log(response):\n",
"            log.print()\n",
"\n",
"# For Jupyter notebooks\n",
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"# Run the example\n",
"await weather_example()"
]
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": []
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3 (ipykernel)",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.13.0"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 4
|
||||
}
|
558 docs/zero_to_hero_guide/Tool_Calling101.ipynb Normal file
@@ -0,0 +1,558 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Getting Started with LlamaStack: Tool Calling Tutorial\n",
"\n",
"Welcome! This notebook will guide you through creating and using custom tools with LlamaStack.\n",
"We'll start with the basics and work our way up to more complex examples.\n",
"\n",
"Table of Contents:\n",
"1. Setup and Installation\n",
"2. Understanding Tool Basics\n",
"3. Creating Your First Tool\n",
"4. Building a Mock Weather Tool\n",
"5. Setting Up the LlamaStack Agent\n",
"6. Running Examples\n",
"7. Next Steps\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Setup\n",
"#### Before we begin, let's import all the required packages:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import asyncio\n",
"import json\n",
"from typing import Dict\n",
"from datetime import datetime"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# LlamaStack specific imports\n",
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.lib.agents.agent import Agent\n",
"from llama_stack_client.lib.agents.event_logger import EventLogger\n",
"from llama_stack_client.types.agent_create_params import AgentConfig\n",
"from llama_stack_client.types.tool_param_definition_param import ToolParamDefinitionParam"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Understanding Tool Basics\n",
"\n",
"In LlamaStack, a tool is like a special function that our AI assistant can use. Think of it as giving the AI a new \n",
"capability, like using a calculator or checking the weather.\n",
"\n",
"Every tool needs:\n",
"- A name: What we call the tool\n",
"- A description: What the tool does\n",
"- Parameters: What information the tool needs to work\n",
"- Implementation: The actual code that does the work\n",
"\n",
"Let's create a base class that all our tools will inherit from:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"class SingleMessageCustomTool:\n",
"    \"\"\"Base class for all our custom tools\"\"\"\n",
"    \n",
"    async def run(self, messages=None):\n",
"        \"\"\"\n",
"        Main entry point for running the tool\n",
"        Args:\n",
"            messages: List of messages (can be None for backward compatibility)\n",
"        \"\"\"\n",
"        if messages and len(messages) > 0:\n",
"            # Extract parameters from the message if it contains function parameters\n",
"            message = messages[0]\n",
"            if hasattr(message, 'function_parameters'):\n",
"                return await self.run_impl(**message.function_parameters)\n",
"            else:\n",
"                return await self.run_impl()\n",
"        return await self.run_impl()\n",
"    \n",
"    async def run_impl(self, **kwargs):\n",
"        \"\"\"Each tool will implement this method with their specific logic\"\"\"\n",
"        raise NotImplementedError()"
]
},
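{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see the base-class contract in action before building anything real, here is a tiny, hypothetical `EchoTool` (an editor's sketch, not part of the tutorial's agent) that just returns whatever text it is given. Calling `run_impl` directly is a quick way to sanity-check a tool without a running server:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class EchoTool(SingleMessageCustomTool):\n",
"    \"\"\"Toy tool used only to demonstrate the base-class contract\"\"\"\n",
"    \n",
"    def get_name(self) -> str:\n",
"        return \"echo\"\n",
"    \n",
"    def get_description(self) -> str:\n",
"        return \"Echo back the provided text\"\n",
"    \n",
"    def get_params_definition(self) -> Dict[str, ToolParamDefinitionParam]:\n",
"        return {\n",
"            \"text\": ToolParamDefinitionParam(\n",
"                param_type=\"str\",\n",
"                description=\"Text to echo back\",\n",
"                required=True\n",
"            )\n",
"        }\n",
"    \n",
"    async def run_impl(self, text: str = \"\"):\n",
"        return json.dumps({\"echo\": text})\n",
"\n",
"# Direct call, bypassing the agent (IPython supports top-level await)\n",
"await EchoTool().run_impl(text=\"hello tools\")"
]
},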
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Creating Your First Tool: Calculator\n",
"\n",
"Let's create a simple calculator tool. This will help us understand the basic structure of a tool.\n",
"Our calculator can:\n",
"- Add\n",
"- Subtract\n",
"- Multiply\n",
"- Divide\n"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# Calculator Tool implementation\n",
"class CalculatorTool(SingleMessageCustomTool):\n",
"    \"\"\"A simple calculator tool that can perform basic math operations\"\"\"\n",
"    \n",
"    def get_name(self) -> str:\n",
"        return \"calculator\"\n",
"    \n",
"    def get_description(self) -> str:\n",
"        return \"Perform basic arithmetic operations (add, subtract, multiply, divide)\"\n",
"    \n",
"    def get_params_definition(self) -> Dict[str, ToolParamDefinitionParam]:\n",
"        return {\n",
"            \"operation\": ToolParamDefinitionParam(\n",
"                param_type=\"str\",\n",
"                description=\"Operation to perform (add, subtract, multiply, divide)\",\n",
"                required=True\n",
"            ),\n",
"            \"x\": ToolParamDefinitionParam(\n",
"                param_type=\"float\",\n",
"                description=\"First number\",\n",
"                required=True\n",
"            ),\n",
"            \"y\": ToolParamDefinitionParam(\n",
"                param_type=\"float\",\n",
"                description=\"Second number\",\n",
"                required=True\n",
"            )\n",
"        }\n",
"    \n",
"    async def run_impl(self, operation: str = None, x: float = None, y: float = None):\n",
"        \"\"\"The actual implementation of our calculator\"\"\"\n",
"        # Compare against None explicitly; all([operation, x, y]) would\n",
"        # wrongly reject a legitimate 0 value for x or y\n",
"        if operation is None or x is None or y is None:\n",
"            return json.dumps({\"error\": \"Missing required parameters\"})\n",
"        \n",
"        # Dictionary of math operations\n",
"        operations = {\n",
"            \"add\": lambda a, b: a + b,\n",
"            \"subtract\": lambda a, b: a - b,\n",
"            \"multiply\": lambda a, b: a * b,\n",
"            \"divide\": lambda a, b: a / b if b != 0 else \"Error: Division by zero\"\n",
"        }\n",
"        \n",
"        # Check if the operation is valid\n",
"        if operation not in operations:\n",
"            return json.dumps({\"error\": f\"Unknown operation '{operation}'\"})\n",
"        \n",
"        try:\n",
"            # Convert string inputs to float if needed\n",
"            x = float(x) if isinstance(x, str) else x\n",
"            y = float(y) if isinstance(y, str) else y\n",
"            \n",
"            # Perform the calculation\n",
"            result = operations[operation](x, y)\n",
"            return json.dumps({\"result\": result})\n",
"        except ValueError:\n",
"            return json.dumps({\"error\": \"Invalid number format\"})\n",
"        except Exception as e:\n",
"            return json.dumps({\"error\": str(e)})"
]
},
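{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before wiring the calculator into an agent, it is worth exercising it directly. This quick check (an editor's addition) calls `run_impl` by hand, so schema or logic mistakes surface immediately:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"calc = CalculatorTool()\n",
"\n",
"# Each call returns a JSON string, exactly what the agent would receive\n",
"print(await calc.run_impl(operation=\"add\", x=25, y=17))    # {\"result\": 42}\n",
"print(await calc.run_impl(operation=\"divide\", x=1, y=0))   # division-by-zero message\n",
"print(await calc.run_impl(operation=\"modulo\", x=5, y=2))   # unknown-operation error"
]
},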
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Building a Mock Weather Tool\n",
"\n",
"Now let's create something a bit more complex: a weather tool! \n",
"While this is just a mock version (it doesn't actually fetch real weather data),\n",
"it shows how you might structure a tool that interfaces with an external API."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"class WeatherTool(SingleMessageCustomTool):\n",
"    \"\"\"A mock weather tool that simulates getting weather data\"\"\"\n",
"    \n",
"    def get_name(self) -> str:\n",
"        return \"get_weather\"\n",
"    \n",
"    def get_description(self) -> str:\n",
"        return \"Get current weather information for major cities\"\n",
"    \n",
"    def get_params_definition(self) -> Dict[str, ToolParamDefinitionParam]:\n",
"        return {\n",
"            \"city\": ToolParamDefinitionParam(\n",
"                param_type=\"str\",\n",
"                description=\"Name of the city (e.g., New York, London, Tokyo)\",\n",
"                required=True\n",
"            ),\n",
"            \"date\": ToolParamDefinitionParam(\n",
"                param_type=\"str\",\n",
"                description=\"Date in YYYY-MM-DD format (optional)\",\n",
"                required=False\n",
"            )\n",
"        }\n",
"    \n",
"    async def run_impl(self, city: str = None, date: str = None):\n",
"        if not city:\n",
"            return json.dumps({\"error\": \"City parameter is required\"})\n",
"        \n",
"        # Mock database of weather information\n",
"        weather_data = {\n",
"            \"New York\": {\"temp\": 20, \"condition\": \"sunny\"},\n",
"            \"London\": {\"temp\": 15, \"condition\": \"rainy\"},\n",
"            \"Tokyo\": {\"temp\": 25, \"condition\": \"cloudy\"}\n",
"        }\n",
"        \n",
"        try:\n",
"            # Check if we have data for the requested city\n",
"            if city not in weather_data:\n",
"                return json.dumps({\n",
"                    \"error\": f\"Sorry! No data available for {city}\",\n",
"                    \"available_cities\": list(weather_data.keys())\n",
"                })\n",
"            \n",
"            # Return the weather information\n",
"            return json.dumps({\n",
"                \"city\": city,\n",
"                \"date\": date or datetime.now().strftime(\"%Y-%m-%d\"),\n",
"                \"data\": weather_data[city]\n",
"            })\n",
"        except Exception as e:\n",
"            return json.dumps({\"error\": str(e)})"
]
},
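{
"cell_type": "markdown",
"metadata": {},
"source": [
"As with the calculator, a direct call (an editor's addition) is the quickest check. Note how the mock tool answers for a known city versus an unknown one:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"weather = WeatherTool()\n",
"\n",
"print(await weather.run_impl(city=\"Tokyo\"))   # known city -> temperature and condition\n",
"print(await weather.run_impl(city=\"Paris\"))   # unknown city -> error plus available_cities"
]
},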
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Setting Up the LlamaStack Agent\n",
"\n",
"Now that we have our tools, we need to create an agent that can use them.\n",
"The agent is like a smart assistant that knows how to use our tools when needed."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"async def setup_agent(host: str = \"localhost\", port: int = 5001):\n",
"    \"\"\"Creates and configures our LlamaStack agent\"\"\"\n",
"    \n",
"    # Create a client to connect to the LlamaStack server\n",
"    client = LlamaStackClient(\n",
"        base_url=f\"http://{host}:{port}\",\n",
"    )\n",
"    \n",
"    # Configure how we want our agent to behave\n",
"    agent_config = AgentConfig(\n",
"        model=\"Llama3.1-8B-Instruct\",\n",
"        instructions=\"\"\"You are a helpful assistant that can:\n",
"        1. Perform mathematical calculations\n",
"        2. Check weather information\n",
"        Always explain your thinking before using a tool.\"\"\",\n",
"        \n",
"        sampling_params={\n",
"            \"strategy\": \"greedy\",\n",
"            \"temperature\": 1.0,\n",
"            \"top_p\": 0.9,\n",
"        },\n",
"        \n",
"        # List of tools available to the agent\n",
"        tools=[\n",
"            {\n",
"                \"function_name\": \"calculator\",\n",
"                \"description\": \"Perform basic arithmetic operations\",\n",
"                \"parameters\": {\n",
"                    \"operation\": {\n",
"                        \"param_type\": \"str\",\n",
"                        \"description\": \"Operation to perform (add, subtract, multiply, divide)\",\n",
"                        \"required\": True,\n",
"                    },\n",
"                    \"x\": {\n",
"                        \"param_type\": \"float\",\n",
"                        \"description\": \"First number\",\n",
"                        \"required\": True,\n",
"                    },\n",
"                    \"y\": {\n",
"                        \"param_type\": \"float\",\n",
"                        \"description\": \"Second number\",\n",
"                        \"required\": True,\n",
"                    },\n",
"                },\n",
"                \"type\": \"function_call\",\n",
"            },\n",
"            {\n",
"                \"function_name\": \"get_weather\",\n",
"                \"description\": \"Get weather information for a given city\",\n",
"                \"parameters\": {\n",
"                    \"city\": {\n",
"                        \"param_type\": \"str\",\n",
"                        \"description\": \"Name of the city\",\n",
"                        \"required\": True,\n",
"                    },\n",
"                    \"date\": {\n",
"                        \"param_type\": \"str\",\n",
"                        \"description\": \"Date in YYYY-MM-DD format\",\n",
"                        \"required\": False,\n",
"                    },\n",
"                },\n",
"                \"type\": \"function_call\",\n",
"            },\n",
"        ],\n",
"        tool_choice=\"auto\",\n",
"        # Using standard JSON format for tools\n",
"        tool_prompt_format=\"json\",\n",
"        input_shields=[],\n",
"        output_shields=[],\n",
"        enable_session_persistence=False,\n",
"    )\n",
"    \n",
"    # Create our tools\n",
"    custom_tools = [CalculatorTool(), WeatherTool()]\n",
"    \n",
"    # Create the agent\n",
"    agent = Agent(client, agent_config, custom_tools)\n",
"    session_id = agent.create_session(\"tutorial-session\")\n",
"    print(f\"🎉 Created session_id={session_id} for Agent({agent.agent_id})\")\n",
"    \n",
"    return agent, session_id"
]
},
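{
"cell_type": "markdown",
"metadata": {},
"source": [
"Notice that the `tools` list above repeats the same name/description/parameter schema each tool already exposes through `get_name()`, `get_description()` and `get_params_definition()`. A small helper (an editor's sketch, assuming the `ToolParamDefinitionParam` values are dict-like, as TypedDicts in llama-stack-client are) can build those entries from the tool objects so the schema lives in one place:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def tool_spec(tool) -> dict:\n",
"    \"\"\"Build an AgentConfig tool entry from a tool object (sketch).\"\"\"\n",
"    return {\n",
"        \"function_name\": tool.get_name(),\n",
"        \"description\": tool.get_description(),\n",
"        \"parameters\": tool.get_params_definition(),\n",
"        \"type\": \"function_call\",\n",
"    }\n",
"\n",
"print(json.dumps([tool_spec(t) for t in (CalculatorTool(), WeatherTool())], indent=2))"
]
},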
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6. Running Examples\n",
"\n",
"Let's try out our agent with some example questions!"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply() # This allows async operations to work in Jupyter\n",
"\n",
"# Initialize the agent\n",
"async def init_agent():\n",
"    \"\"\"Initialize our agent - run this first!\"\"\"\n",
"    agent, session_id = await setup_agent()\n",
"    print(f\"✨ Agent initialized with session {session_id}\")\n",
"    return agent, session_id\n",
"\n",
"# Function to run a single query\n",
"async def run_single_query(agent, session_id, query: str):\n",
"    \"\"\"Run a single query through our agent\"\"\"\n",
"    print(\"\\n\" + \"=\"*50)\n",
"    print(f\"🤔 User asks: {query}\")\n",
"    print(\"=\"*50)\n",
"    \n",
"    response = agent.create_turn(\n",
"        messages=[\n",
"            {\n",
"                \"role\": \"user\",\n",
"                \"content\": query,\n",
"            }\n",
"        ],\n",
"        session_id=session_id,\n",
"    )\n",
"    \n",
"    async for log in EventLogger().log(response):\n",
"        log.print()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's run everything and see it in action!\n",
"\n",
"Create and run our agent"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"🎉 Created session_id=fbe83bb6-bdfd-497c-b920-d7307482d8ba for Agent(3997eeda-4ffd-4b05-9026-28b4da206a11)\n",
"✨ Agent initialized with session fbe83bb6-bdfd-497c-b920-d7307482d8ba\n"
]
}
],
"source": [
"agent, session_id = await init_agent()"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"==================================================\n",
"🤔 User asks: What's 25 plus 17?\n",
"==================================================\n",
"\u001b[30m\u001b[0m\u001b[33minference> \u001b[0m\u001b[36m\u001b[0m\u001b[36m{\"\u001b[0m\u001b[36mtype\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mfunction\u001b[0m\u001b[36m\",\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mname\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mcalculator\u001b[0m\u001b[36m\",\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mparameters\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m {\"\u001b[0m\u001b[36moperation\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36madd\u001b[0m\u001b[36m\",\u001b[0m\u001b[36m \"\u001b[0m\u001b[36my\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36m17\u001b[0m\u001b[36m\",\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mx\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36m25\u001b[0m\u001b[36m\"}}\u001b[0m\u001b[97m\u001b[0m\n"
]
}
],
"source": [
"await run_single_query(agent, session_id, \"What's 25 plus 17?\")"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"==================================================\n",
"🤔 User asks: What's the weather like in Tokyo?\n",
"==================================================\n",
"\u001b[30m\u001b[0m\u001b[33minference> \u001b[0m\u001b[36m\u001b[0m\u001b[36m{\"\u001b[0m\u001b[36mtype\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mfunction\u001b[0m\u001b[36m\",\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mname\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mget\u001b[0m\u001b[36m_weather\u001b[0m\u001b[36m\",\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mparameters\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m {\"\u001b[0m\u001b[36mcity\u001b[0m\u001b[36m\":\u001b[0m\u001b[36m \"\u001b[0m\u001b[36mTok\u001b[0m\u001b[36myo\u001b[0m\u001b[36m\"}}\u001b[0m\u001b[97m\u001b[0m\n"
]
}
],
"source": [
"await run_single_query(agent, session_id, \"What's the weather like in Tokyo?\")"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [],
"source": [
"#fin"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
@@ -1,7 +1,7 @@
-# Llama Stack Text Generation Guide
+# Llama Stack Inference Guide
 
 This document provides instructions on how to use Llama Stack's `chat_completion` function for generating text using the `Llama3.2-11B-Vision-Instruct` model. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/).
 
 ### Table of Contents
 1. [Quickstart](#quickstart)
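
For a quick taste of what the guide covers, here is a minimal sketch of a `chat_completion` call through the Python client SDK. It assumes a Llama Stack server is already running locally on port 5000 and that the model named above has been downloaded; request and response shapes may differ slightly across client versions:

```python
from llama_stack_client import LlamaStackClient

# Connect to a locally running Llama Stack server (assumed on port 5000)
client = LlamaStackClient(base_url="http://localhost:5000")

response = client.inference.chat_completion(
    model="Llama3.2-11B-Vision-Instruct",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Write a haiku about llamas."},
    ],
)

# The generated text lives on the completion message
print(response.completion_message.content)
```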
@@ -157,16 +157,15 @@ With these steps, you should have a functional Llama Stack setup capable of gene
 ## Next Steps
 
 - **Explore Other Guides**: Dive deeper into specific topics by following these guides:
-  - [Understanding Distributions](#)
-  - [Configure your Distro](#)
-  - [Doing Inference API Call and Fetching a Response from Endpoints](#)
-  - [Creating a Conversation Loop](#)
-  - [Sending Image to the Model](#)
-  - [Tool Calling: How to and Details](#)
-  - [Memory API: Show Simple In-Memory Retrieval](#)
-  - [Agents API: Explain Components](#)
-  - [Using Safety API in Conversation](#)
-  - [Prompt Engineering Guide](#)
+  - [Inference 101](00_Inference101.ipynb)
+  - [Simple switch between local and cloud model](00_Local_Cloud_Inference101.ipynb)
+  - [Prompt Engineering](01_Prompt_Engineering101.ipynb)
+  - [Chat with Image - LlamaStack Vision API](02_Image_Chat101.ipynb)
+  - [Tool Calling: How to and Details](03_Tool_Calling101.ipynb)
+  - [Memory API: Show Simple In-Memory Retrieval](04_Memory101.ipynb)
+  - [Using Safety API in Conversation](05_Safety101.ipynb)
+  - [Agents API: Explain Components](06_Agents101.ipynb)
 
+
 - **Explore Client SDKs**: Utilize our client SDKs for various languages to integrate Llama Stack into your applications:
   - [Python SDK](https://github.com/meta-llama/llama-stack-client-python)

@@ -180,5 +179,3 @@ With these steps, you should have a functional Llama Stack setup capable of gene
 
 ---
 