{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AutoGen + Llama Stack Integration\n",
"\n",
"## Overview\n",
"\n",
"This notebook demonstrates how to use **AutoGen (AG2)** with **Llama Stack** as the backend.\n",
"\n",
"### Use Cases Covered:\n",
"1. **Two-Agent Conversation** - UserProxy + Assistant solving a problem\n",
"2. **Code Generation & Execution** - AutoGen generates and runs code\n",
"3. **Group Chat** - Multiple specialists collaborating\n",
"4. **Human-in-the-Loop** - Interactive problem-solving\n",
"5. **Sequential Task Solving** - Math problem → Code → Execute → Verify\n",
"\n",
"---\n",
"\n",
"## Prerequisites\n",
"\n",
"```bash\n",
"# Install AutoGen (AG2)\n",
"pip install pyautogen\n",
"\n",
"# Llama Stack should already be running\n",
"# Default: http://localhost:8321\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Imports (GroupChat and GroupChatManager are used in Example 3)\n",
"from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager\n",
"\n",
"# Check Llama Stack connectivity\n",
"import httpx\n",
"\n",
"LLAMA_STACK_URL = \"http://localhost:8321\"\n",
"\n",
"try:\n",
"    response = httpx.get(f\"{LLAMA_STACK_URL}/health\")\n",
"    response.raise_for_status()  # Treat HTTP error statuses as failures too\n",
"    print(f\"✅ Llama Stack is running at {LLAMA_STACK_URL}\")\n",
"    print(f\"Status: {response.status_code}\")\n",
"except Exception as e:\n",
"    print(f\"❌ Llama Stack not accessible: {e}\")\n",
"    print(\"Make sure Llama Stack is running on port 8321\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Configuration: AutoGen with Llama Stack\n",
"\n",
"### How It Works\n",
"\n",
"AutoGen speaks the **OpenAI API format**, which Llama Stack exposes through its OpenAI-compatible endpoint.\n",
"\n",
"```python\n",
"config_list = [\n",
"    {\n",
"        \"model\": \"ollama/llama3.3:70b\",          # Your Llama Stack model\n",
"        \"base_url\": \"http://localhost:8321/v1\",  # Llama Stack endpoint\n",
"        \"api_key\": \"not-needed\",                 # Llama Stack doesn't require auth by default\n",
"    }\n",
"]\n",
"```\n",
"\n",
"**Key Points:**\n",
"- Use the `/v1` suffix for the OpenAI-compatible endpoint\n",
"- `api_key` can be any string (Llama Stack ignores it)\n",
"- `model` must match what's available in Llama Stack (a quick way to check is sketched after the next cell)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# AutoGen configuration for Llama Stack\n",
"config_list = [\n",
"    {\n",
"        \"model\": \"ollama/llama3.3:70b\",          # Your Llama Stack model\n",
"        \"base_url\": \"http://localhost:8321/v1\",  # OpenAI-compatible endpoint\n",
"        \"api_key\": \"not-needed\",                 # Llama Stack doesn't require auth\n",
"    }\n",
"]\n",
"\n",
"llm_config = {\n",
"    \"config_list\": config_list,\n",
"    \"temperature\": 0.7,\n",
"    \"timeout\": 120,\n",
"}\n",
"\n",
"print(\"✅ AutoGen configuration ready for Llama Stack\")\n",
"print(f\"Model: {config_list[0]['model']}\")\n",
"print(f\"Base URL: {config_list[0]['base_url']}\")"
]
},
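{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Verifying the Model Name\n",
"\n",
"Since `model` must match an id that Llama Stack actually serves, it can save debugging time to list the served models first. A minimal sketch, assuming the `openai` Python package is installed and your Llama Stack build exposes the OpenAI-compatible `/v1/models` route:\n",
"\n",
"```python\n",
"from openai import OpenAI\n",
"\n",
"# Point the client at the same endpoint AutoGen will use\n",
"client = OpenAI(base_url=\"http://localhost:8321/v1\", api_key=\"not-needed\")\n",
"\n",
"# Print every model id the server reports;\n",
"# config_list[0][\"model\"] should appear in this list\n",
"for model in client.models.list():\n",
"    print(model.id)\n",
"```"
]
},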
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 1: Two-Agent Conversation\n",
"\n",
"### Pattern: User Proxy + Assistant\n",
"\n",
"**UserProxyAgent:**\n",
"- Represents the human user\n",
"- Can execute code\n",
"- Provides feedback to the assistant\n",
"\n",
"**AssistantAgent:**\n",
"- AI assistant powered by Llama Stack\n",
"- Generates responses and code\n",
"- Solves problems conversationally\n",
"\n",
"### Use Case: Solve a Math Problem"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create AssistantAgent (AI assistant)\n",
"assistant = AssistantAgent(\n",
"    name=\"MathAssistant\",\n",
"    system_message=\"You are a helpful AI assistant that solves math problems. Provide clear explanations.\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"# Create UserProxyAgent (represents the human)\n",
"user_proxy = UserProxyAgent(\n",
"    name=\"User\",\n",
"    human_input_mode=\"NEVER\",  # Fully automated (no human input)\n",
"    max_consecutive_auto_reply=5,\n",
"    code_execution_config={\"use_docker\": False},  # Allow local code execution\n",
")\n",
"\n",
"print(\"✅ Agents created\")\n",
"print(f\"Assistant: {assistant.name}\")\n",
"print(f\"User Proxy: {user_proxy.name}\")"
]
},
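{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Optional: stopping the loop explicitly.** With `human_input_mode=\"NEVER\"`, the chat ends when `max_consecutive_auto_reply` is reached. A common pyautogen pattern is to also pass `is_termination_msg` so the proxy stops as soon as the assistant signals it is done. A minimal sketch (the `\"TERMINATE\"` keyword is a convention you would also need to mention in the assistant's system message):\n",
"\n",
"```python\n",
"user_proxy = UserProxyAgent(\n",
"    name=\"User\",\n",
"    human_input_mode=\"NEVER\",\n",
"    max_consecutive_auto_reply=5,\n",
"    code_execution_config={\"use_docker\": False},\n",
"    # Stop when the assistant's last message ends with TERMINATE\n",
"    is_termination_msg=lambda msg: (msg.get(\"content\") or \"\").rstrip().endswith(\"TERMINATE\"),\n",
")\n",
"```"
]
},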
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Start conversation\n",
"user_proxy.initiate_chat(\n",
"    assistant,\n",
"    message=\"What is the sum of the first 100 prime numbers? Please write Python code to calculate it.\"\n",
")\n",
"\n",
"print(\"\\n\" + \"=\"*50)\n",
"print(\"Conversation complete!\")"
]
},
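{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Inspecting the result programmatically.** In recent pyautogen releases, `initiate_chat()` returns a `ChatResult` object, which is handy when the conversation feeds a larger pipeline. A sketch, assuming that API is available in your installed version:\n",
"\n",
"```python\n",
"result = user_proxy.initiate_chat(\n",
"    assistant,\n",
"    message=\"What is the sum of the first 100 prime numbers? Please write Python code to calculate it.\",\n",
")\n",
"\n",
"# Full message history and the final summary of the exchange\n",
"print(len(result.chat_history), \"messages exchanged\")\n",
"print(result.summary)\n",
"```"
]
},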
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 2: Code Generation & Execution\n",
"\n",
"### Pattern: Assistant generates code → UserProxy executes it\n",
"\n",
"This is AutoGen's killer feature: **automatic code execution**!\n",
"\n",
"### Use Case: Data Analysis Task"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a coding assistant\n",
"coding_assistant = AssistantAgent(\n",
"    name=\"DataScientist\",\n",
"    system_message=\"\"\"You are an expert data scientist.\n",
"    Write Python code to solve data analysis problems.\n",
"    Always include visualizations when appropriate.\"\"\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"# User proxy with code execution enabled\n",
"user_proxy_code = UserProxyAgent(\n",
"    name=\"UserProxy\",\n",
"    human_input_mode=\"NEVER\",\n",
"    max_consecutive_auto_reply=3,\n",
"    code_execution_config={\n",
"        \"work_dir\": \"coding\",\n",
"        \"use_docker\": False,\n",
"    },\n",
")\n",
"\n",
"# Start data analysis task\n",
"user_proxy_code.initiate_chat(\n",
"    coding_assistant,\n",
"    message=\"\"\"Generate 100 random numbers from a normal distribution (mean=50, std=10).\n",
"    Calculate the mean, median, and standard deviation.\n",
"    Create a histogram to visualize the distribution.\"\"\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 3: Group Chat (Multi-Agent Collaboration)\n",
"\n",
"### Pattern: Multiple Specialists Collaborating\n",
"\n",
"**Scenario:** Write a technical blog post about AI\n",
"\n",
"**Agents:**\n",
"1. **Researcher** - Finds information\n",
"2. **Writer** - Writes content\n",
"3. **Critic** - Reviews and suggests improvements\n",
"4. **UserProxy** - Orchestrates and provides final approval\n",
"\n",
"This is similar to llamacrew's workflow but **conversational** instead of DAG-based!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create specialist agents\n",
"researcher = AssistantAgent(\n",
"    name=\"Researcher\",\n",
"    system_message=\"\"\"You are a researcher. Your job is to find accurate information\n",
"    about topics and provide facts, statistics, and recent developments.\"\"\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"writer = AssistantAgent(\n",
"    name=\"Writer\",\n",
"    system_message=\"\"\"You are a technical writer. Your job is to write clear,\n",
"    engaging content based on research provided. Use simple language and examples.\"\"\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"critic = AssistantAgent(\n",
"    name=\"Critic\",\n",
"    system_message=\"\"\"You are an editor. Review content for clarity, accuracy,\n",
"    and engagement. Suggest specific improvements.\"\"\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"# User proxy to orchestrate\n",
"user_proxy_group = UserProxyAgent(\n",
"    name=\"UserProxy\",\n",
"    human_input_mode=\"NEVER\",\n",
"    max_consecutive_auto_reply=10,\n",
"    code_execution_config=False,\n",
")\n",
"\n",
"print(\"✅ Group chat agents created\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create group chat\n",
"groupchat = GroupChat(\n",
"    agents=[user_proxy_group, researcher, writer, critic],\n",
"    messages=[],\n",
"    max_round=12,  # Maximum conversation rounds\n",
")\n",
"\n",
"# Create manager to orchestrate\n",
"manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)\n",
"\n",
"# Start group chat\n",
"user_proxy_group.initiate_chat(\n",
"    manager,\n",
"    message=\"\"\"Write a 300-word blog post about the benefits of using\n",
"    Llama Stack for LLM applications. Include:\n",
"    1. What Llama Stack is\n",
"    2. Key benefits\n",
"    3. A simple use case example\n",
"\n",
"    Researcher: gather information\n",
"    Writer: create the blog post\n",
"    Critic: review and suggest improvements\n",
"    \"\"\"\n",
")"
]
},
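{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Controlling speaker order.** By default the `GroupChatManager` asks the LLM to pick the next speaker, which can be unpredictable with smaller models. pyautogen's `GroupChat` accepts a `speaker_selection_method` parameter; a sketch using round-robin turn-taking for a deterministic Researcher → Writer → Critic order (assuming this option is available in your installed version):\n",
"\n",
"```python\n",
"groupchat = GroupChat(\n",
"    agents=[user_proxy_group, researcher, writer, critic],\n",
"    messages=[],\n",
"    max_round=12,\n",
"    speaker_selection_method=\"round_robin\",  # Cycle through agents in list order\n",
")\n",
"```"
]
},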
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 4: Human-in-the-Loop\n",
"\n",
"### Pattern: Interactive Problem Solving\n",
"\n",
"AutoGen excels at **human-in-the-loop** workflows where you can:\n",
"- Provide feedback mid-conversation\n",
"- Approve/reject suggestions\n",
"- Guide the agent's direction\n",
"\n",
"**Note:** In notebooks, this requires `human_input_mode=\"ALWAYS\"` or `\"TERMINATE\"` and manual input."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Interactive assistant (uncomment to try)\n",
"# WARNING: This will prompt for user input!\n",
"\n",
"# assistant_interactive = AssistantAgent(\n",
"#     name=\"InteractiveAssistant\",\n",
"#     system_message=\"You are a helpful assistant. Ask clarifying questions when needed.\",\n",
"#     llm_config=llm_config,\n",
"# )\n",
"\n",
"# user_proxy_interactive = UserProxyAgent(\n",
"#     name=\"Human\",\n",
"#     human_input_mode=\"TERMINATE\",  # Ask for human input only when a termination condition is hit\n",
"#     max_consecutive_auto_reply=5,\n",
"# )\n",
"\n",
"# user_proxy_interactive.initiate_chat(\n",
"#     assistant_interactive,\n",
"#     message=\"Help me plan a machine learning project for customer churn prediction.\"\n",
"# )\n",
"\n",
"print(\"💡 Human-in-the-loop example (commented out to avoid blocking notebook execution)\")\n",
"print(\"Uncomment the code above to try interactive mode!\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example 5: Sequential Task Solving\n",
"\n",
"### Pattern: Chain of Thought Problem Solving\n",
"\n",
"**Scenario:** Solve a complex problem requiring multiple steps\n",
"\n",
"1. **Understand** the problem\n",
"2. **Plan** the solution approach\n",
"3. **Implement** the solution (code)\n",
"4. **Execute** and verify\n",
"5. **Explain** the results\n",
"\n",
"### Use Case: Fibonacci Sequence Analysis"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a reasoning assistant\n",
"reasoning_assistant = AssistantAgent(\n",
"    name=\"ReasoningAssistant\",\n",
"    system_message=\"\"\"You are a problem-solving assistant.\n",
"    For complex problems:\n",
"    1. Break down the problem\n",
"    2. Plan the solution step-by-step\n",
"    3. Write clean, well-commented code\n",
"    4. Explain results clearly\n",
"    \"\"\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"user_proxy_reasoning = UserProxyAgent(\n",
"    name=\"User\",\n",
"    human_input_mode=\"NEVER\",\n",
"    max_consecutive_auto_reply=5,\n",
"    code_execution_config={\"work_dir\": \"reasoning\", \"use_docker\": False},\n",
")\n",
"\n",
"# Complex problem requiring sequential reasoning\n",
"user_proxy_reasoning.initiate_chat(\n",
"    reasoning_assistant,\n",
"    message=\"\"\"Find all of the prime numbers among the first 100 Fibonacci numbers.\n",
"\n",
"    Requirements:\n",
"    1. Explain the approach\n",
"    2. Write efficient Python code\n",
"    3. Display the results in a table\n",
"    4. Calculate what percentage of the first 100 Fibonacci numbers are prime\n",
"    \"\"\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Comparison: AutoGen vs llamacrew\n",
"\n",
"### When to Use AutoGen + Llama Stack\n",
"\n",
"✅ **Use AutoGen when you need:**\n",
"- **Conversational** interactions between agents\n",
"- **Human-in-the-loop** workflows (interactive approval, feedback)\n",
"- **Code generation & execution** (data analysis, scripting)\n",
"- **Group discussions** (multiple agents debating, collaborating)\n",
"- **Dynamic problem-solving** (unknown number of back-and-forth exchanges)\n",
"- **Research/prototyping** (exploratory work)\n",
"\n",
"**Example Use Cases:**\n",
"- Interactive coding assistant\n",
"- Research assistant with human feedback\n",
"- Multi-agent debate/discussion\n",
"- Tutoring/educational applications\n",
"- Dynamic customer support\n",
"\n",
"---\n",
"\n",
"### When to Use llamacrew\n",
"\n",
"✅ **Use llamacrew when you need:**\n",
"- **Production workflows** (blog writing, data pipelines)\n",
"- **Declarative DAGs** (predefined task dependencies)\n",
"- **Automatic parallelization** (the framework parallelizes independent tasks)\n",
"- **Non-interactive automation** (scheduled jobs)\n",
"- **Minimal dependencies** (lightweight deployment)\n",
"- **Predictable workflows** (known steps)\n",
"\n",
"**Example Use Cases:**\n",
"- Automated blog post generation\n",
"- Data ETL pipelines\n",
"- Report generation\n",
"- Batch processing\n",
"- Production automation\n",
"\n",
"---\n",
"\n",
"### They're Complementary!\n",
"\n",
"- **AutoGen**: Conversational, interactive, exploratory\n",
"- **llamacrew**: Workflow-driven, automated, production-oriented\n",
"\n",
"You might use **AutoGen for prototyping**, then move to **llamacrew for production**!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Advanced: Custom Agent Behaviors\n",
"\n",
"### Pattern: Specialized Agent with Custom Logic\n",
"\n",
"You can create agents with custom behavior beyond just prompts."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from typing import Dict, List, Union\n",
"import re\n",
"\n",
"class CodeReviewAgent(AssistantAgent):\n",
"    \"\"\"Custom agent that reviews code for specific patterns.\"\"\"\n",
"\n",
"    def __init__(self, *args, **kwargs):\n",
"        super().__init__(*args, **kwargs)\n",
"        self.issues_found = []\n",
"\n",
"    def review_code(self, code: str) -> Dict[str, Union[List[str], int]]:\n",
"        \"\"\"Custom method to review code for common issues.\"\"\"\n",
"        issues = []\n",
"\n",
"        # Check for common issues\n",
"        if \"TODO\" in code or \"FIXME\" in code:\n",
"            issues.append(\"Code contains TODO/FIXME comments\")\n",
"\n",
"        if not re.search(r'def \\w+\\(.*\\):', code):\n",
"            issues.append(\"No function definitions found\")\n",
"\n",
"        if \"print(\" in code:\n",
"            issues.append(\"Contains print statements (consider logging)\")\n",
"\n",
"        self.issues_found.extend(issues)\n",
"        return {\"issues\": issues, \"total\": len(issues)}\n",
"\n",
"# Create custom reviewer\n",
"code_reviewer = CodeReviewAgent(\n",
"    name=\"CodeReviewer\",\n",
"    system_message=\"\"\"You are a code reviewer. Analyze code for:\n",
"    - Code quality\n",
"    - Best practices\n",
"    - Potential bugs\n",
"    - Performance issues\n",
"    Provide specific, actionable feedback.\n",
"    \"\"\",\n",
"    llm_config=llm_config,\n",
")\n",
"\n",
"print(\"✅ Custom CodeReviewAgent created with specialized review logic\")"
]
},
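{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Quick sanity check of the custom review_code() helper defined above,\n",
"# using a small sample snippet (purely illustrative)\n",
"sample_code = '''\n",
"def add(a, b):\n",
"    # TODO: validate inputs\n",
"    print(a + b)\n",
"    return a + b\n",
"'''\n",
"\n",
"report = code_reviewer.review_code(sample_code)\n",
"print(f\"Issues found: {report['total']}\")\n",
"for issue in report['issues']:\n",
"    print(f\"  - {issue}\")"
]
},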
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Performance Tips\n",
"\n",
"### 1. Model Selection\n",
"\n",
"```python\n",
"# Fast models for simple tasks\n",
"config_fast = {\n",
"    \"model\": \"ollama/llama3.2:3b\",  # Smaller, faster\n",
"    \"temperature\": 0.5,\n",
"}\n",
"\n",
"# Powerful models for complex reasoning\n",
"config_powerful = {\n",
"    \"model\": \"ollama/llama3.3:70b\",  # Larger, better quality\n",
"    \"temperature\": 0.7,\n",
"}\n",
"```\n",
"\n",
"### 2. Limit Conversation Rounds\n",
"\n",
"```python\n",
"user_proxy = UserProxyAgent(\n",
"    name=\"User\",\n",
"    max_consecutive_auto_reply=3,  # Prevent infinite loops\n",
")\n",
"```\n",
"\n",
"### 3. Set Timeouts\n",
"\n",
"```python\n",
"llm_config = {\n",
"    \"timeout\": 60,  # 60 second timeout per request\n",
"    \"config_list\": config_list,\n",
"}\n",
"```\n",
"\n",
"### 4. Use Work Directories\n",
"\n",
"```python\n",
"code_execution_config = {\n",
"    \"work_dir\": \"autogen_workspace\",  # Isolate generated files\n",
"    \"use_docker\": False,\n",
"}\n",
"```"
]
},
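{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Cache Responses (optional)\n",
"\n",
"pyautogen can cache LLM responses so repeated runs of identical prompts don't hit the model again. A sketch, assuming your installed version supports the `cache_seed` option:\n",
"\n",
"```python\n",
"llm_config = {\n",
"    \"config_list\": config_list,\n",
"    \"cache_seed\": 42,  # Reuse cached responses across identical runs; None disables caching\n",
"}\n",
"```"
]
},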
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Troubleshooting\n",
"\n",
"### Common Issues\n",
"\n",
"#### 1. \"Could not connect to Llama Stack\"\n",
"\n",
"```bash\n",
"# Check if Llama Stack is running\n",
"curl http://localhost:8321/health\n",
"\n",
"# Start Llama Stack if needed\n",
"llama stack run\n",
"```\n",
"\n",
"#### 2. \"Model not found\"\n",
"\n",
"```bash\n",
"# List available models\n",
"curl http://localhost:8321/models\n",
"\n",
"# Make sure the model name matches exactly:\n",
"# ✅ \"ollama/llama3.3:70b\"\n",
"# ❌ \"llama3.3:70b\"\n",
"```\n",
"\n",
"#### 3. \"Agent not responding\"\n",
"\n",
"- Check that `max_consecutive_auto_reply` isn't set too low\n",
"- Increase `timeout` in `llm_config`\n",
"- Verify the Llama Stack model is loaded and warm\n",
"\n",
"#### 4. \"Code execution failed\"\n",
"\n",
"- Make sure `code_execution_config` is set correctly\n",
"- Check file permissions on `work_dir`\n",
"- Install any Python packages the generated code requires\n",
"\n",
"---\n",
"\n",
"### Debug Mode\n",
"\n",
"```python\n",
"import logging\n",
"\n",
"# Enable DEBUG-level logging (verbose; includes AutoGen's internals)\n",
"logging.basicConfig(level=logging.DEBUG)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"### What We Covered\n",
"\n",
"1. ✅ **Two-Agent Conversations** - UserProxy + Assistant pattern\n",
"2. ✅ **Code Generation & Execution** - AutoGen's killer feature\n",
"3. ✅ **Group Chat** - Multiple agents collaborating\n",
"4. ✅ **Human-in-the-Loop** - Interactive workflows\n",
"5. ✅ **Sequential Reasoning** - Complex problem solving\n",
"6. ✅ **Custom Agents** - Specialized behaviors\n",
"\n",
"### Key Takeaways\n",
"\n",
"**AutoGen + Llama Stack is powerful for:**\n",
"- 🗣️ **Conversational** multi-agent systems\n",
"- 👤 **Interactive** problem-solving with humans\n",
"- 💻 **Code generation** and execution\n",
"- 🤝 **Collaborative** agent discussions\n",
"\n",
"**vs llamacrew, which is better for:**\n",
"- 🔄 **Production workflows** and pipelines\n",
"- 📊 **Declarative** task orchestration\n",
"- ⚡ **Automatic parallelization**\n",
"- 🤖 **Non-interactive** automation\n",
"\n",
"---\n",
"\n",
"### Next Steps\n",
"\n",
"1. Experiment with different agent combinations\n",
"2. Try human-in-the-loop workflows\n",
"3. Build custom agents for your use case\n",
"4. Compare AutoGen vs llamacrew for your specific needs\n",
"\n",
"### Resources\n",
"\n",
"- **AutoGen Docs**: https://microsoft.github.io/autogen/\n",
"- **Llama Stack Docs**: https://llama-stack.readthedocs.io/\n",
"- **llamacrew Docs**: `/home/omara/Desktop/llamacrew/README.md`\n",
"\n",
"---\n",
"\n",
"**Happy multi-agent building! 🚀**"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.7"
}
},
"nbformat": 4,
"nbformat_minor": 4
}