diff --git a/docs/notebooks/autogen/autogen_llama_stack_integration.ipynb b/docs/notebooks/autogen/autogen_llama_stack_integration.ipynb
new file mode 100644
index 000000000..def4f04ac
--- /dev/null
+++ b/docs/notebooks/autogen/autogen_llama_stack_integration.ipynb
@@ -0,0 +1,698 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# AutoGen + Llama Stack Integration\n",
+    "\n",
+    "## Overview\n",
+    "\n",
+    "This notebook demonstrates how to use **AutoGen (AG2)** with **Llama Stack** as the backend.\n",
+    "\n",
+    "### What is AutoGen?\n",
+    "- Microsoft's framework for **conversational multi-agent** systems (AG2 is the community-maintained fork)\n",
+    "- Emphasizes **chat-based** interactions between agents\n",
+    "- Built-in **code execution** and **human-in-the-loop** support\n",
+    "- Great for **interactive problem-solving**\n",
+    "\n",
+    "### Why Llama Stack?\n",
+    "- **Unified backend** for any LLM (Ollama, Together, vLLM, etc.)\n",
+    "- **One integration point** instead of many\n",
+    "- **Production-ready** infrastructure\n",
+    "- **Open-source** and flexible\n",
+    "\n",
+    "### Use Cases Covered:\n",
+    "1. **Two-Agent Conversation** - UserProxy + Assistant solving a problem\n",
+    "2. **Code Generation & Execution** - AutoGen generates and runs code\n",
+    "3. **Group Chat** - Multiple specialists collaborating\n",
+    "4. **Human-in-the-Loop** - Interactive problem-solving\n",
+    "5. **Sequential Task Solving** - Math problem → Code → Execute → Verify\n",
+    "\n",
+    "---\n",
+    "\n",
+    "## Prerequisites\n",
+    "\n",
+    "```bash\n",
+    "# Install AutoGen (AG2); also published on PyPI as \"ag2\"\n",
+    "pip install pyautogen\n",
+    "\n",
+    "# Llama Stack should already be running\n",
+    "# Default: http://localhost:8321\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Imports\n",
+    "from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager\n",
+    "\n",
+    "# Check Llama Stack connectivity\n",
+    "import httpx\n",
+    "\n",
+    "LLAMA_STACK_URL = \"http://localhost:8321\"\n",
+    "\n",
+    "try:\n",
+    "    # Recent Llama Stack releases serve their routes under /v1; use /health if yours is older\n",
+    "    response = httpx.get(f\"{LLAMA_STACK_URL}/v1/health\")\n",
+    "    print(f\"✅ Llama Stack is running at {LLAMA_STACK_URL}\")\n",
+    "    print(f\"Status: {response.status_code}\")\n",
+    "except Exception as e:\n",
+    "    print(f\"❌ Llama Stack not accessible: {e}\")\n",
+    "    print(\"Make sure Llama Stack is running on port 8321\")"
+   ]
+  },
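+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As an extra sanity check, you can list the models Llama Stack exposes. This is a minimal sketch that assumes the OpenAI-compatible `/v1/models` route and an OpenAI-style `{\"data\": [...]}` response shape; adjust the path if your deployment differs."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Optional: list available models (assumes the OpenAI-compatible /v1/models route)\n",
+    "try:\n",
+    "    models_response = httpx.get(f\"{LLAMA_STACK_URL}/v1/models\")\n",
+    "    models_response.raise_for_status()\n",
+    "    for model in models_response.json().get(\"data\", []):\n",
+    "        # Each entry should carry an OpenAI-style \"id\" field\n",
+    "        print(f\"- {model.get('id', model)}\")\n",
+    "except Exception as e:\n",
+    "    print(f\"Could not list models: {e}\")"
+   ]
+  },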
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Configuration: AutoGen with Llama Stack\n",
+    "\n",
+    "### How It Works\n",
+    "\n",
+    "AutoGen speaks the **OpenAI API format**, and Llama Stack exposes an OpenAI-compatible endpoint, so the two connect with nothing more than a config entry:\n",
+    "\n",
+    "```python\n",
+    "config_list = [\n",
+    "    {\n",
+    "        \"model\": \"ollama/llama3.3:70b\",          # Your Llama Stack model\n",
+    "        \"base_url\": \"http://localhost:8321/v1\",  # Llama Stack endpoint\n",
+    "        \"api_key\": \"not-needed\",                 # Llama Stack doesn't need auth\n",
+    "    }\n",
+    "]\n",
+    "```\n",
+    "\n",
+    "**Key Points:**\n",
+    "- Use the `/v1` suffix for the OpenAI-compatible endpoint\n",
+    "- `api_key` can be any string (Llama Stack ignores it)\n",
+    "- `model` must match a model that is actually available in Llama Stack"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# AutoGen configuration for Llama Stack\n",
+    "config_list = [\n",
+    "    {\n",
+    "        \"model\": \"ollama/llama3.3:70b\",          # Your Llama Stack model\n",
+    "        \"base_url\": \"http://localhost:8321/v1\",  # OpenAI-compatible endpoint\n",
+    "        \"api_key\": \"not-needed\",                 # Llama Stack doesn't require auth\n",
+    "    }\n",
+    "]\n",
+    "\n",
+    "llm_config = {\n",
+    "    \"config_list\": config_list,\n",
+    "    \"temperature\": 0.7,\n",
+    "    \"timeout\": 120,\n",
+    "}\n",
+    "\n",
+    "print(\"✅ AutoGen configuration ready for Llama Stack\")\n",
+    "print(f\"Model: {config_list[0]['model']}\")\n",
+    "print(f\"Base URL: {config_list[0]['base_url']}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 1: Two-Agent Conversation\n",
+    "\n",
+    "### Pattern: User Proxy + Assistant\n",
+    "\n",
+    "**UserProxyAgent:**\n",
+    "- Represents the human user\n",
+    "- Can execute code\n",
+    "- Provides feedback to the assistant\n",
+    "\n",
+    "**AssistantAgent:**\n",
+    "- AI assistant powered by Llama Stack\n",
+    "- Generates responses and code\n",
+    "- Solves problems conversationally\n",
+    "\n",
+    "### Use Case: Solve a Math Problem"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create AssistantAgent (AI assistant)\n",
+    "assistant = AssistantAgent(\n",
+    "    name=\"MathAssistant\",\n",
+    "    system_message=\"You are a helpful AI assistant that solves math problems. Provide clear explanations.\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "# Create UserProxyAgent (represents the human)\n",
+    "user_proxy = UserProxyAgent(\n",
+    "    name=\"User\",\n",
+    "    human_input_mode=\"NEVER\",  # Fully automated (no human input)\n",
+    "    max_consecutive_auto_reply=5,\n",
+    "    code_execution_config={\"use_docker\": False},  # Allow local code execution\n",
+    ")\n",
+    "\n",
+    "print(\"✅ Agents created\")\n",
+    "print(f\"Assistant: {assistant.name}\")\n",
+    "print(f\"User Proxy: {user_proxy.name}\")"
+   ]
+  },
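+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Before kicking off the chat, it helps to know how conversations end: by default the `UserProxyAgent` simply stops after `max_consecutive_auto_reply` automatic turns. You can also pass an `is_termination_msg` callback so the chat ends as soon as the assistant signals completion. A minimal sketch (the `TERMINATE` token matches the convention in AutoGen's default assistant prompt):"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A minimal sketch: end the chat as soon as the assistant's reply ends with \"TERMINATE\".\n",
+    "# AutoGen's default AssistantAgent system prompt asks the model to emit that token when done.\n",
+    "user_proxy_terminating = UserProxyAgent(\n",
+    "    name=\"User\",\n",
+    "    human_input_mode=\"NEVER\",\n",
+    "    max_consecutive_auto_reply=5,\n",
+    "    code_execution_config={\"use_docker\": False},\n",
+    "    is_termination_msg=lambda msg: (msg.get(\"content\") or \"\").rstrip().endswith(\"TERMINATE\"),\n",
+    ")"
+   ]
+  },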
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Start the conversation\n",
+    "user_proxy.initiate_chat(\n",
+    "    assistant,\n",
+    "    message=\"What is the sum of the first 100 prime numbers? Please write Python code to calculate it.\"\n",
+    ")\n",
+    "\n",
+    "print(\"\\n\" + \"=\"*50)\n",
+    "print(\"Conversation complete!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 2: Code Generation & Execution\n",
+    "\n",
+    "### Pattern: Assistant generates code → UserProxy executes it\n",
+    "\n",
+    "This is AutoGen's killer feature: **automatic code execution**!\n",
+    "\n",
+    "### Use Case: Data Analysis Task"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create a coding assistant\n",
+    "coding_assistant = AssistantAgent(\n",
+    "    name=\"DataScientist\",\n",
+    "    system_message=\"\"\"You are an expert data scientist.\n",
+    "    Write Python code to solve data analysis problems.\n",
+    "    Always include visualizations when appropriate.\"\"\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "# User proxy with code execution enabled\n",
+    "user_proxy_code = UserProxyAgent(\n",
+    "    name=\"UserProxy\",\n",
+    "    human_input_mode=\"NEVER\",\n",
+    "    max_consecutive_auto_reply=3,\n",
+    "    code_execution_config={\n",
+    "        \"work_dir\": \"coding\",\n",
+    "        \"use_docker\": False,\n",
+    "    },\n",
+    ")\n",
+    "\n",
+    "# Start the data analysis task\n",
+    "user_proxy_code.initiate_chat(\n",
+    "    coding_assistant,\n",
+    "    message=\"\"\"Generate 100 random numbers from a normal distribution (mean=50, std=10).\n",
+    "    Calculate the mean, median, and standard deviation.\n",
+    "    Create a histogram to visualize the distribution.\"\"\"\n",
+    ")"
+   ]
+  },
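+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "`initiate_chat` also returns a `ChatResult` object (in the 0.2-era `pyautogen` API) that you can inspect programmatically; this is handy when the conversation feeds a larger pipeline. A short sketch, assuming that API; attribute names may differ in other versions:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Capture the result of a chat instead of only reading the printed transcript.\n",
+    "# ChatResult and max_turns are from the 0.2-era pyautogen API.\n",
+    "chat_result = user_proxy_code.initiate_chat(\n",
+    "    coding_assistant,\n",
+    "    message=\"In one sentence, what does a histogram show?\",\n",
+    "    max_turns=1,  # keep this demo to a single exchange\n",
+    ")\n",
+    "\n",
+    "print(\"Summary:\", chat_result.summary)             # last message by default\n",
+    "print(\"Turns exchanged:\", len(chat_result.chat_history))\n",
+    "print(\"Cost info:\", chat_result.cost)              # per-model token usage, if tracked"
+   ]
+  },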
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 3: Group Chat (Multi-Agent Collaboration)\n",
+    "\n",
+    "### Pattern: Multiple Specialists Collaborating\n",
+    "\n",
+    "**Scenario:** Write a technical blog post about AI\n",
+    "\n",
+    "**Agents:**\n",
+    "1. **Researcher** - Finds information\n",
+    "2. **Writer** - Writes content\n",
+    "3. **Critic** - Reviews and suggests improvements\n",
+    "4. **UserProxy** - Orchestrates and provides final approval\n",
+    "\n",
+    "This is similar to llamacrew's workflow but **conversational** instead of DAG-based!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create specialist agents\n",
+    "researcher = AssistantAgent(\n",
+    "    name=\"Researcher\",\n",
+    "    system_message=\"\"\"You are a researcher. Your job is to find accurate information\n",
+    "    about topics and provide facts, statistics, and recent developments.\"\"\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "writer = AssistantAgent(\n",
+    "    name=\"Writer\",\n",
+    "    system_message=\"\"\"You are a technical writer. Your job is to write clear,\n",
+    "    engaging content based on research provided. Use simple language and examples.\"\"\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "critic = AssistantAgent(\n",
+    "    name=\"Critic\",\n",
+    "    system_message=\"\"\"You are an editor. Review content for clarity, accuracy,\n",
+    "    and engagement. Suggest specific improvements.\"\"\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "# User proxy to orchestrate\n",
+    "user_proxy_group = UserProxyAgent(\n",
+    "    name=\"UserProxy\",\n",
+    "    human_input_mode=\"NEVER\",\n",
+    "    max_consecutive_auto_reply=10,\n",
+    "    code_execution_config=False,\n",
+    ")\n",
+    "\n",
+    "print(\"✅ Group chat agents created\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create the group chat\n",
+    "groupchat = GroupChat(\n",
+    "    agents=[user_proxy_group, researcher, writer, critic],\n",
+    "    messages=[],\n",
+    "    max_round=12,  # Maximum conversation rounds\n",
+    ")\n",
+    "\n",
+    "# Create a manager to orchestrate\n",
+    "manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)\n",
+    "\n",
+    "# Start the group chat\n",
+    "user_proxy_group.initiate_chat(\n",
+    "    manager,\n",
+    "    message=\"\"\"Write a 300-word blog post about the benefits of using\n",
+    "    Llama Stack for LLM applications. Include:\n",
+    "    1. What Llama Stack is\n",
+    "    2. Key benefits\n",
+    "    3. A simple use case example\n",
+    "\n",
+    "    Researcher: gather information\n",
+    "    Writer: create the blog post\n",
+    "    Critic: review and suggest improvements\n",
+    "    \"\"\"\n",
+    ")"
+   ]
+  },
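+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "By default the `GroupChatManager` asks the LLM to pick the next speaker (`speaker_selection_method=\"auto\"`). Smaller local models can mis-route turns this way, so forcing a fixed order is often more reliable. A sketch using the built-in round-robin policy:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# A deterministic alternative: rotate through the agents in listed order\n",
+    "# instead of letting the LLM choose the next speaker (\"auto\", the default).\n",
+    "groupchat_rr = GroupChat(\n",
+    "    agents=[user_proxy_group, researcher, writer, critic],\n",
+    "    messages=[],\n",
+    "    max_round=8,\n",
+    "    speaker_selection_method=\"round_robin\",\n",
+    ")\n",
+    "manager_rr = GroupChatManager(groupchat=groupchat_rr, llm_config=llm_config)\n",
+    "\n",
+    "# Run it the same way as above, e.g.:\n",
+    "# user_proxy_group.initiate_chat(manager_rr, message=\"Summarize Llama Stack in two sentences.\")"
+   ]
+  },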
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 4: Human-in-the-Loop\n",
+    "\n",
+    "### Pattern: Interactive Problem Solving\n",
+    "\n",
+    "AutoGen excels at **human-in-the-loop** workflows where you can:\n",
+    "- Provide feedback mid-conversation\n",
+    "- Approve/reject suggestions\n",
+    "- Guide the agent's direction\n",
+    "\n",
+    "**Note:** In notebooks, this requires `human_input_mode=\"ALWAYS\"` or `\"TERMINATE\"` and manual input."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Interactive assistant (uncomment to try)\n",
+    "# WARNING: This will prompt for user input!\n",
+    "\n",
+    "# assistant_interactive = AssistantAgent(\n",
+    "#     name=\"InteractiveAssistant\",\n",
+    "#     system_message=\"You are a helpful assistant. Ask clarifying questions when needed.\",\n",
+    "#     llm_config=llm_config,\n",
+    "# )\n",
+    "\n",
+    "# user_proxy_interactive = UserProxyAgent(\n",
+    "#     name=\"Human\",\n",
+    "#     human_input_mode=\"TERMINATE\",  # Ask for human input only when the chat is about to terminate\n",
+    "#     max_consecutive_auto_reply=5,\n",
+    "# )\n",
+    "\n",
+    "# user_proxy_interactive.initiate_chat(\n",
+    "#     assistant_interactive,\n",
+    "#     message=\"Help me plan a machine learning project for customer churn prediction.\"\n",
+    "# )\n",
+    "\n",
+    "print(\"💡 Human-in-the-loop example (commented out to avoid blocking notebook execution)\")\n",
+    "print(\"Uncomment the code above to try interactive mode!\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example 5: Sequential Task Solving\n",
+    "\n",
+    "### Pattern: Chain-of-Thought Problem Solving\n",
+    "\n",
+    "**Scenario:** Solve a complex problem requiring multiple steps\n",
+    "\n",
+    "1. **Understand** the problem\n",
+    "2. **Plan** the solution approach\n",
+    "3. **Implement** the solution (code)\n",
+    "4. **Execute** and verify\n",
+    "5. **Explain** the results\n",
+    "\n",
+    "### Use Case: Fibonacci Sequence Analysis"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Create a reasoning assistant\n",
+    "reasoning_assistant = AssistantAgent(\n",
+    "    name=\"ReasoningAssistant\",\n",
+    "    system_message=\"\"\"You are a problem-solving assistant.\n",
+    "    For complex problems:\n",
+    "    1. Break down the problem\n",
+    "    2. Plan the solution step-by-step\n",
+    "    3. Write clean, well-commented code\n",
+    "    4. Explain results clearly\n",
+    "    \"\"\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "user_proxy_reasoning = UserProxyAgent(\n",
+    "    name=\"User\",\n",
+    "    human_input_mode=\"NEVER\",\n",
+    "    max_consecutive_auto_reply=5,\n",
+    "    code_execution_config={\"work_dir\": \"reasoning\", \"use_docker\": False},\n",
+    ")\n",
+    "\n",
+    "# Complex problem requiring sequential reasoning\n",
+    "user_proxy_reasoning.initiate_chat(\n",
+    "    reasoning_assistant,\n",
+    "    message=\"\"\"Find the first 20 Fibonacci numbers where the number is also a prime number.\n",
+    "\n",
+    "    Requirements:\n",
+    "    1. Explain the approach\n",
+    "    2. Write efficient Python code\n",
+    "    3. Display the results in a table\n",
+    "    4. Calculate what percentage of the first 100 Fibonacci numbers are prime\n",
+    "    \"\"\"\n",
+    ")"
+   ]
+  },
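+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "When a multi-step chat like this feeds another system, you usually want a condensed answer rather than the full transcript. The 0.2-era `pyautogen` API can produce one via the `summary_method` argument to `initiate_chat`: `\"last_msg\"` (the default) or `\"reflection_with_llm\"`, which asks the model to summarize the whole exchange. A sketch, assuming that API:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Ask the LLM to distill the whole exchange into a short summary when the chat ends.\n",
+    "# summary_method=\"reflection_with_llm\" is from the 0.2-era pyautogen API.\n",
+    "result = user_proxy_reasoning.initiate_chat(\n",
+    "    reasoning_assistant,\n",
+    "    message=\"Is 514229 a Fibonacci number? Is it prime? Answer briefly.\",\n",
+    "    max_turns=2,\n",
+    "    summary_method=\"reflection_with_llm\",  # reuses llm_config to write the summary\n",
+    ")\n",
+    "print(result.summary)"
+   ]
+  },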
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Comparison: AutoGen vs llamacrew\n",
+    "\n",
+    "### When to Use AutoGen + Llama Stack\n",
+    "\n",
+    "✅ **Use AutoGen when you need:**\n",
+    "- **Conversational** interactions between agents\n",
+    "- **Human-in-the-loop** workflows (interactive approval, feedback)\n",
+    "- **Code generation & execution** (data analysis, scripting)\n",
+    "- **Group discussions** (multiple agents debating, collaborating)\n",
+    "- **Dynamic problem-solving** (unknown number of back-and-forth exchanges)\n",
+    "- **Research/prototyping** (exploratory work)\n",
+    "\n",
+    "**Example Use Cases:**\n",
+    "- Interactive coding assistant\n",
+    "- Research assistant with human feedback\n",
+    "- Multi-agent debate/discussion\n",
+    "- Tutoring/educational applications\n",
+    "- Dynamic customer support\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### When to Use llamacrew\n",
+    "\n",
+    "✅ **Use llamacrew when you need:**\n",
+    "- **Production workflows** (blog writing, data pipelines)\n",
+    "- **Declarative DAGs** (predefined task dependencies)\n",
+    "- **Automatic parallelization** (the framework optimizes)\n",
+    "- **Non-interactive automation** (scheduled jobs)\n",
+    "- **Minimal dependencies** (lightweight deployment)\n",
+    "- **Predictable workflows** (known steps)\n",
+    "\n",
+    "**Example Use Cases:**\n",
+    "- Automated blog post generation\n",
+    "- Data ETL pipelines\n",
+    "- Report generation\n",
+    "- Batch processing\n",
+    "- Production automation\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### They're Complementary!\n",
+    "\n",
+    "- **AutoGen**: conversational, interactive, exploratory\n",
+    "- **llamacrew**: workflow-driven, automated, production-oriented\n",
+    "\n",
+    "You might use **AutoGen for prototyping**, then move to **llamacrew for production**!"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Advanced: Custom Agent Behaviors\n",
+    "\n",
+    "### Pattern: Specialized Agent with Custom Logic\n",
+    "\n",
+    "You can create agents with custom behavior beyond just prompts."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import Dict, List, Union\n",
+    "import re\n",
+    "\n",
+    "class CodeReviewAgent(AssistantAgent):\n",
+    "    \"\"\"Custom agent that reviews code for specific patterns.\"\"\"\n",
+    "\n",
+    "    def __init__(self, *args, **kwargs):\n",
+    "        super().__init__(*args, **kwargs)\n",
+    "        self.issues_found = []\n",
+    "\n",
+    "    def review_code(self, code: str) -> Dict[str, Union[List[str], int]]:\n",
+    "        \"\"\"Custom method to review code for common issues.\"\"\"\n",
+    "        issues = []\n",
+    "\n",
+    "        # Check for common issues\n",
+    "        if \"TODO\" in code or \"FIXME\" in code:\n",
+    "            issues.append(\"Code contains TODO/FIXME comments\")\n",
+    "\n",
+    "        if not re.search(r'def \\w+\\(.*\\):', code):\n",
+    "            issues.append(\"No function definitions found\")\n",
+    "\n",
+    "        if \"print(\" in code:\n",
+    "            issues.append(\"Contains print statements (consider logging)\")\n",
+    "\n",
+    "        self.issues_found.extend(issues)\n",
+    "        return {\"issues\": issues, \"total\": len(issues)}\n",
+    "\n",
+    "# Create the custom reviewer\n",
+    "code_reviewer = CodeReviewAgent(\n",
+    "    name=\"CodeReviewer\",\n",
+    "    system_message=\"\"\"You are a code reviewer. Analyze code for:\n",
+    "    - Code quality\n",
+    "    - Best practices\n",
+    "    - Potential bugs\n",
+    "    - Performance issues\n",
+    "    Provide specific, actionable feedback.\n",
+    "    \"\"\",\n",
+    "    llm_config=llm_config,\n",
+    ")\n",
+    "\n",
+    "print(\"✅ Custom CodeReviewAgent created with specialized review logic\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Performance Tips\n",
+    "\n",
+    "### 1. Model Selection\n",
+    "\n",
+    "```python\n",
+    "# Fast models for simple tasks\n",
+    "config_fast = {\n",
+    "    \"model\": \"ollama/llama3.2:3b\",  # Smaller, faster\n",
+    "    \"temperature\": 0.5,\n",
+    "}\n",
+    "\n",
+    "# Powerful models for complex reasoning\n",
+    "config_powerful = {\n",
+    "    \"model\": \"ollama/llama3.3:70b\",  # Larger, better quality\n",
+    "    \"temperature\": 0.7,\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "### 2. Limit Conversation Rounds\n",
+    "\n",
+    "```python\n",
+    "user_proxy = UserProxyAgent(\n",
+    "    name=\"User\",\n",
+    "    max_consecutive_auto_reply=3,  # Prevent infinite loops\n",
+    ")\n",
+    "```\n",
+    "\n",
+    "### 3. Set Timeouts\n",
+    "\n",
+    "```python\n",
+    "llm_config = {\n",
+    "    \"timeout\": 60,  # 60-second timeout per request\n",
+    "    \"config_list\": config_list,\n",
+    "}\n",
+    "```\n",
+    "\n",
+    "### 4. Use Work Directories\n",
+    "\n",
+    "```python\n",
+    "code_execution_config = {\n",
+    "    \"work_dir\": \"autogen_workspace\",  # Isolate generated files\n",
+    "    \"use_docker\": False,\n",
+    "}\n",
+    "```"
+   ]
+  },
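+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### 5. Cache Repeated Responses\n",
+    "\n",
+    "The 0.2-era `pyautogen` can cache LLM responses by seed, so re-running a notebook doesn't redo identical calls (check your version; this knob has moved between releases):\n",
+    "\n",
+    "```python\n",
+    "llm_config = {\n",
+    "    \"config_list\": config_list,\n",
+    "    \"cache_seed\": 42,  # same seed -> cached replies on re-runs; None disables caching\n",
+    "}\n",
+    "```"
+   ]
+  },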
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Troubleshooting\n",
+    "\n",
+    "### Common Issues\n",
+    "\n",
+    "#### 1. \"Could not connect to Llama Stack\"\n",
+    "\n",
+    "```bash\n",
+    "# Check if Llama Stack is running (recent releases serve routes under /v1)\n",
+    "curl http://localhost:8321/v1/health\n",
+    "\n",
+    "# Start Llama Stack if needed (pass your distribution template or run config)\n",
+    "llama stack run <config>\n",
+    "```\n",
+    "\n",
+    "#### 2. \"Model not found\"\n",
+    "\n",
+    "```bash\n",
+    "# List available models\n",
+    "curl http://localhost:8321/v1/models\n",
+    "\n",
+    "# Make sure the model name matches exactly:\n",
+    "# ✅ \"ollama/llama3.3:70b\"\n",
+    "# ❌ \"llama3.3:70b\"\n",
+    "```\n",
+    "\n",
+    "#### 3. \"Agent not responding\"\n",
+    "\n",
+    "- Check that `max_consecutive_auto_reply` isn't set too low\n",
+    "- Increase `timeout` in `llm_config`\n",
+    "- Verify the Llama Stack model is loaded and warm\n",
+    "\n",
+    "#### 4. \"Code execution failed\"\n",
+    "\n",
+    "- Make sure `code_execution_config` is set correctly\n",
+    "- Check file permissions on `work_dir`\n",
+    "- Install required Python packages\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### Debug Mode\n",
+    "\n",
+    "```python\n",
+    "import logging\n",
+    "\n",
+    "# Enable AutoGen debug logging\n",
+    "logging.basicConfig(level=logging.DEBUG)\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Summary\n",
+    "\n",
+    "### What We Covered\n",
+    "\n",
+    "1. ✅ **Two-Agent Conversations** - UserProxy + Assistant pattern\n",
+    "2. ✅ **Code Generation & Execution** - AutoGen's killer feature\n",
+    "3. ✅ **Group Chat** - Multiple agents collaborating\n",
+    "4. ✅ **Human-in-the-Loop** - Interactive workflows\n",
+    "5. ✅ **Sequential Reasoning** - Complex problem solving\n",
+    "6. ✅ **Custom Agents** - Specialized behaviors\n",
+    "\n",
+    "### Key Takeaways\n",
+    "\n",
+    "**AutoGen + Llama Stack is powerful for:**\n",
+    "- 🗣️ **Conversational** multi-agent systems\n",
+    "- 👤 **Interactive** problem-solving with humans\n",
+    "- 💻 **Code generation** and execution\n",
+    "- 🤝 **Collaborative** agent discussions\n",
+    "\n",
+    "**llamacrew, by contrast, is better for:**\n",
+    "- 🔄 **Production workflows** and pipelines\n",
+    "- 📊 **Declarative** task orchestration\n",
+    "- ⚡ **Automatic parallelization**\n",
+    "- 🤖 **Non-interactive** automation\n",
+    "\n",
+    "---\n",
+    "\n",
+    "### Next Steps\n",
+    "\n",
+    "1. Experiment with different agent combinations\n",
+    "2. Try human-in-the-loop workflows\n",
+    "3. Build custom agents for your use case\n",
+    "4. Compare AutoGen vs llamacrew for your specific needs\n",
+    "\n",
+    "### Resources\n",
+    "\n",
+    "- **AutoGen Docs**: https://microsoft.github.io/autogen/\n",
+    "- **Llama Stack Docs**: https://llama-stack.readthedocs.io/\n",
+    "- **llamacrew Docs**: `/home/omara/Desktop/llamacrew/README.md`\n",
+    "\n",
+    "---\n",
+    "\n",
+    "**Happy multi-agent building! 🚀**"
+   ]
+  }
+ ],
+ "metadata": {
+  "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}