diff --git a/docs/notebooks/Llama_Stack_Agent_Workflows.ipynb b/docs/notebooks/Llama_Stack_Agent_Workflows.ipynb
index a48088501..0b15adaff 100644
--- a/docs/notebooks/Llama_Stack_Agent_Workflows.ipynb
+++ b/docs/notebooks/Llama_Stack_Agent_Workflows.ipynb
@@ -10,13 +10,15 @@
     "\n",
     "This notebook contains Llama Stack implementations of common and popular agent workflows discussed in Anthropic's blog post [Building Effective Agent Workflows](https://www.anthropic.com/research/building-effective-agents). \n",
     "\n",
-    "1. Basic Workflows\n",
-    "  1.1 Prompt Chaining\n",
-    "  1.2 Routing\n",
-    "  1.3 Parallelization\n",
-    "2. Advanced Workflows\n",
-    "  2.1 Evaluator-Optimizer\n",
-    "  2.2 Orchestrator-Workers\n",
+    "**1. Basic Workflows**\n",
+    "- 1.1 Prompt Chaining\n",
+    "- 1.2 Routing\n",
+    "- 1.3 Parallelization\n",
+    "\n",
+    "**2. Advanced Workflows**\n",
+    "- 2.1 Evaluator-Optimizer\n",
+    "- 2.2 Orchestrator-Workers\n",
+    "\n",
     "\n",
     "For each workflow type, we present minimal implementations using Llama Stack using task examples from [anthropic-cookbook](https://github.com/anthropics/anthropic-cookbook/tree/main/patterns/agents), and showcase how to monitor the internals within each workflow execution. "
    ]
   },
@@ -30,24 +32,55 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 98,
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# NBVAL_SKIP\n",
+    "!pip install -U llama-stack\n",
+    "!UV_SYSTEM_PYTHON=1 llama stack build --template fireworks --image-type venv"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
    "metadata": {},
    "outputs": [],
    "source": [
     "from llama_stack_client import LlamaStackClient\n",
+    "from llama_stack.distribution.library_client import LlamaStackAsLibraryClient\n",
     "from llama_stack_client.types.agent_create_params import AgentConfig\n",
-    "from llama_stack_client.lib.agents.react.agent import ReActAgent\n",
     "from llama_stack_client.lib.agents.agent import Agent\n",
     "from rich.pretty import pprint\n",
     "import json\n",
     "import uuid\n",
     "from pydantic import BaseModel\n",
     "import rich\n",
+    "import os\n",
+    "\n",
+    "try:\n",
+    "    from google.colab import userdata\n",
+    "    os.environ['FIREWORKS_API_KEY'] = userdata.get('FIREWORKS_API_KEY')\n",
+    "except ImportError:\n",
+    "    print(\"Not in Google Colab environment\")\n",
+    "\n",
+    "for key in ['FIREWORKS_API_KEY']:\n",
+    "    try:\n",
+    "        api_key = os.environ[key]\n",
+    "        if not api_key:\n",
+    "            raise ValueError(f\"{key} environment variable is empty\")\n",
+    "    except KeyError:\n",
+    "        api_key = input(f\"{key} environment variable is not set. Please enter your API key: \")\n",
+    "        os.environ[key] = api_key\n",
+    "\n",
+    "client = LlamaStackAsLibraryClient(\"fireworks\", provider_data={\"fireworks_api_key\": os.environ['FIREWORKS_API_KEY']})\n",
+    "_ = client.initialize()\n",
+    "\n",
+    "# Uncomment to run on a hosted Llama Stack server\n",
+    "# client = LlamaStackClient(base_url=\"http://localhost:8321\")\n",
     "\n",
     "MODEL_ID = \"meta-llama/Llama-3.3-70B-Instruct\"\n",
     "\n",
-    "client = LlamaStackClient(base_url=\"http://localhost:8321\")\n",
-    "\n",
     "base_agent_config = AgentConfig(\n",
     "    model=MODEL_ID,\n",
     "    instructions=\"You are a helpful assistant.\",\n",
@@ -69,11 +102,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## 1. Basic Workflows\n",
-    "\n",
-    "2. **Routing**: Dynamically selects specialized LLM paths based on input characteristics. \n",
-    "\n",
-    "3. **Parallelization**: Distributes independent subtasks acorss multiple LLMs for concurrent processing. "
+    "## 1. Basic Workflows"
    ]
   },
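+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Before diving into the individual workflows, here is a minimal sketch of the single-agent call pattern that every workflow below builds on. It assumes `create_turn` accepts `stream=False` and returns a turn object exposing `output_message`. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Minimal sketch: one agent, one session, one turn.\n",
+    "# Every workflow in this notebook composes this basic pattern.\n",
+    "basic_agent = Agent(client, base_agent_config)\n",
+    "basic_session_id = basic_agent.create_session(session_name=f\"basic_agent_{uuid.uuid4()}\")\n",
+    "response = basic_agent.create_turn(\n",
+    "    messages=[{\"role\": \"user\", \"content\": \"Say hello in one sentence.\"}],\n",
+    "    session_id=basic_session_id,\n",
+    "    stream=False,  # assumed: returns a completed Turn rather than an event stream\n",
+    ")\n",
+    "print(response.output_message.content)"
+   ]
+  },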
Basic Workflows" ] }, { @@ -92,7 +121,7 @@ }, { "cell_type": "code", - "execution_count": 99, + "execution_count": 109, "metadata": {}, "outputs": [ { @@ -116,12 +145,11 @@ "45%: revenue growth\n", "23%: market share\n", "5%: customer churn\n", - "8%: previous customer churn\n", - "78%: product adoption rate\n", "87%: employee satisfaction\n", + "78%: product adoption rate\n", "34%: operating margin\n", - "43: new user acquisition cost \n", - "(Note: new user acquisition cost is in dollars, not a percentage or points, so it remains as is, but in decimal format it would be 43.00, however the original was not in decimal, it was in whole dollar amount)\n", + "8%: previous customer churn\n", + "0.043: new user acquisition cost (as a decimal, assuming $43 is a dollar value and not a percentage)\n", "\n", "\n", "========= Turn: 2 =========\n", @@ -129,11 +157,11 @@ "87%: employee satisfaction\n", "78%: product adoption rate\n", "45%: revenue growth\n", - "43: new user acquisition cost\n", "34%: operating margin\n", "23%: market share\n", "8%: previous customer churn\n", "5%: customer churn\n", + "0.043: new user acquisition cost\n", "\n", "\n", "========= Turn: 3 =========\n", @@ -147,9 +175,7 @@ "| Market Share | 23% |\n", "| Previous Customer Churn | 8% |\n", "| Customer Churn | 5% |\n", - "| New User Acquisition Cost | $43 | \n", - "\n", - "Note: I kept the New User Acquisition Cost as $43, since it's not a percentage value. If you'd like, I can format it as a decimal (43.00) instead. Let me know!\n", + "| New User Acquisition Cost | 0.043 |\n", "\n", "\n" ] @@ -924,8 +950,8 @@ "##### 1.2.2 Monitor Routing Internals\n", "\n", "We can query the internal details about what happened within each agent (routing agent and specialized agents) by using the session id. \n", - "- The routing agent processed all user's request\n", - "- Specialized agent gets user's request based on the routing agent's decision, we can see that `billing` agent never get any user's request. " + "- **Routing agent** processed all user's request\n", + "- **Specialized agent** gets user's request based on the routing agent's decision, we can see that `billing` agent never get any user's request. " ] }, { @@ -1603,7 +1629,7 @@ "generator_session_id = generator_agent.create_session(session_name=f\"generator_agent_{uuid.uuid4()}\")\n", "evaluator_session_id = evaluator_agent.create_session(session_name=f\"evaluator_agent_{uuid.uuid4()}\")\n", "\n", - "def generate_and_evaluate_loop(user_input):\n", + "def generator_evaluator_workflow(user_input):\n", " # Step 1: Generate a response\n", " generator_response = generator_agent.create_turn(\n", " messages=[\n", @@ -1644,7 +1670,7 @@ }, { "cell_type": "code", - "execution_count": 56, + "execution_count": null, "metadata": {}, "outputs": [ { @@ -1683,19 +1709,21 @@ "All operations should be O(1).\n", "\"\"\"\n", "\n", - "print(generate_and_evaluate_loop(coding_task)[\"response\"])" + "print(generator_evaluator_workflow(coding_task)[\"response\"])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "#### 2.1. Monitor Generator-Evaluator Internals" + "#### 2.1. Monitor Generator-Evaluator Internals\n", + "\n", + "In addition to final output from workflow, we can also look at how the generator and evaluator agents processed the user's request. Note that the `evaluator_agent` PASSED after 1 iteration. 
" ] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 102, "metadata": {}, "outputs": [ { @@ -1905,12 +1933,18 @@ "\n", "In the orchestrator-workers workflow, a central LLM dynamically breaks down tasks, delegates them to worker LLMs, and synthesizes their results.\n", "\n", - "![](https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2F8985fc683fae4780fb34eab1365ab78c7e51bc8e-2401x1000.png&w=3840&q=75)" + "![](https://www.anthropic.com/_next/image?url=https%3A%2F%2Fwww-cdn.anthropic.com%2Fimages%2F4zrzovbb%2Fwebsite%2F8985fc683fae4780fb34eab1365ab78c7e51bc8e-2401x1000.png&w=3840&q=75)\n", + "\n", + "**Example: Content Generation**\n", + "\n", + "We'll showcase how to use the orchestrator-workers workflow to generate a content. \n", + "- **Orchestrator agent** analyzes the user's request and breaks it down into 2-3 distinct approaches\n", + "- **Worker agents** are spawn up by the orchestrator agent to generate content based on each approach" ] }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 103, "metadata": {}, "outputs": [], "source": [ @@ -1946,7 +1980,7 @@ "\n", "worker_agent_config = AgentConfig({\n", " **base_agent_config,\n", - " \"instructions\": \"\"\"You will be given a ,