mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-10-04 20:14:13 +00:00
1341 lines
76 KiB
Text
{
"cells": [
{
"cell_type": "markdown",
"id": "2ktr5ls2cas",
"metadata": {
"id": "2ktr5ls2cas"
},
"source": [
"## LlamaStack + CrewAI Integration Tutorial\n",
"\n",
"This notebook guides you through integrating **LlamaStack** with **CrewAI** to build a complete Retrieval-Augmented Generation (RAG) system.\n",
"\n",
"### Overview\n",
"\n",
"- **LlamaStack**: Provides the infrastructure for running LLMs and vector databases.\n",
"- **CrewAI**: Offers a framework for orchestrating agents and tasks.\n",
"- **Integration**: Leverages LlamaStack's OpenAI-compatible API with CrewAI.\n",
"\n",
"### What You Will Learn\n",
"\n",
"1. How to set up and start the LlamaStack server using the Together AI provider.\n",
"2. How to create and manage vector databases within LlamaStack.\n",
"3. How to build a RAG tool with CrewAI backed by the LlamaStack server.\n",
"4. How to query the RAG tool for effective information retrieval and generation.\n",
"\n",
"### Prerequisites\n",
"\n",
"A Together AI API key is required to run the examples in this notebook.\n",
"\n",
"---\n",
"\n",
"### 1. Installation and Setup\n",
"#### Install Required Dependencies\n",
"\n",
"Begin by installing all necessary packages for CrewAI integration. Ensure your `TOGETHER_API_KEY` is set as an environment variable."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "5b6a6a17-b931-4bea-8273-0d6e5563637a",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "5b6a6a17-b931-4bea-8273-0d6e5563637a",
"outputId": "a6427234-b75d-40ea-a471-8c7e9acb7d88",
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: uv in /Users/kaiwu/miniconda3/lib/python3.12/site-packages (0.8.11)\n",
"`\u001b[36mcrewai\u001b[39m` is already installed\n",
"Not in Google Colab environment\n"
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"TOGETHER_API_KEY environment variable is not set. Please enter your API key: ········\n"
]
}
],
"source": [
"!pip install uv\n",
"!uv tool install crewai\n",
"import os\n",
"import getpass\n",
"\n",
"try:\n",
"    from google.colab import userdata\n",
"    os.environ['TOGETHER_API_KEY'] = userdata.get('TOGETHER_API_KEY')\n",
"except ImportError:\n",
"    print(\"Not in Google Colab environment\")\n",
"\n",
"for key in ['TOGETHER_API_KEY']:\n",
"    try:\n",
"        api_key = os.environ[key]\n",
"        if not api_key:\n",
"            raise ValueError(f\"{key} environment variable is empty\")\n",
"    except KeyError:\n",
"        api_key = getpass.getpass(f\"{key} environment variable is not set. Please enter your API key: \")\n",
"        os.environ[key] = api_key"
]
},
{
"cell_type": "markdown",
"id": "wmt9jvqzh7n",
"metadata": {
"id": "wmt9jvqzh7n"
},
"source": [
"### 2. LlamaStack Server Setup\n",
"\n",
"#### Build and Start LlamaStack Server\n",
"\n",
"This section sets up the LlamaStack server with:\n",
"- **Together AI** as the inference provider\n",
"- **FAISS** as the vector database\n",
"- **Sentence Transformers** for embeddings\n",
"\n",
"The server runs on `localhost:8321` and provides OpenAI-compatible endpoints."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dd2dacf3-ec8b-4cc7-8ff4-b5b6ea4a6e9e",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 773
},
"id": "dd2dacf3-ec8b-4cc7-8ff4-b5b6ea4a6e9e",
"outputId": "aa53f96a-6826-4bfb-d1aa-2c0ec2dd4893",
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Environment '/Users/kaiwu/work/kaiwu/llama-stack/.venv' already exists, re-using it.\n",
"Virtual environment /Users/kaiwu/work/kaiwu/llama-stack/.venv is already active\n",
"\u001b[2mUsing Python 3.12.9 environment at: /Users/kaiwu/work/kaiwu/llama-stack/.venv\u001b[0m\n",
"\u001b[2mAudited \u001b[1m1 package\u001b[0m \u001b[2min 206ms\u001b[0m\u001b[0m\n",
"Installing pip dependencies\n",
"\u001b[2mUsing Python 3.12.9 environment at: /Users/kaiwu/work/kaiwu/llama-stack/.venv\u001b[0m\n",
"\u001b[2K\u001b[2mResolved \u001b[1m185 packages\u001b[0m \u001b[2min 2.02s\u001b[0m\u001b[0m \u001b[0m\n",
"\u001b[2mUninstalled \u001b[1m3 packages\u001b[0m \u001b[2min 247ms\u001b[0m\u001b[0m\n",
"\u001b[2K\u001b[2mInstalled \u001b[1m3 packages\u001b[0m \u001b[2min 97ms\u001b[0m\u001b[0m \u001b[0m\n",
" \u001b[31m-\u001b[39m \u001b[1mprotobuf\u001b[0m\u001b[2m==5.29.5\u001b[0m\n",
" \u001b[32m+\u001b[39m \u001b[1mprotobuf\u001b[0m\u001b[2m==5.29.4\u001b[0m\n",
" \u001b[31m-\u001b[39m \u001b[1mruamel-yaml\u001b[0m\u001b[2m==0.18.14\u001b[0m\n",
" \u001b[32m+\u001b[39m \u001b[1mruamel-yaml\u001b[0m\u001b[2m==0.17.40\u001b[0m\n",
" \u001b[31m-\u001b[39m \u001b[1mruff\u001b[0m\u001b[2m==0.12.5\u001b[0m\n",
" \u001b[32m+\u001b[39m \u001b[1mruff\u001b[0m\u001b[2m==0.9.10\u001b[0m\n",
"Installing special provider module: torch torchtune>=0.5.0 torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu\n",
"\u001b[2mUsing Python 3.12.9 environment at: /Users/kaiwu/work/kaiwu/llama-stack/.venv\u001b[0m\n",
"\u001b[2mAudited \u001b[1m3 packages\u001b[0m \u001b[2min 169ms\u001b[0m\u001b[0m\n",
"Installing special provider module: sentence-transformers --no-deps\n",
"\u001b[2mUsing Python 3.12.9 environment at: /Users/kaiwu/work/kaiwu/llama-stack/.venv\u001b[0m\n",
"\u001b[2mAudited \u001b[1m1 package\u001b[0m \u001b[2min 43ms\u001b[0m\u001b[0m\n",
"Installing special provider module: torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu\n",
"\u001b[2mUsing Python 3.12.9 environment at: /Users/kaiwu/work/kaiwu/llama-stack/.venv\u001b[0m\n",
"\u001b[2mAudited \u001b[1m3 packages\u001b[0m \u001b[2min 29ms\u001b[0m\u001b[0m\n",
"\u001b[32mBuild Successful!\u001b[0m\n",
"\u001b[34mYou can find the newly-built distribution here: /Users/kaiwu/.llama/distributions/starter/starter-run.yaml\u001b[0m\n",
"\u001b[32mYou can run the new Llama Stack distro via: \u001b[34mllama stack run /Users/kaiwu/.llama/distributions/starter/starter-run.yaml --image-type venv\u001b[0m\u001b[0m\n",
"Starting Llama Stack server with PID: 97967\n",
"Waiting for server to start.................\n",
"Server is ready!\n"
]
}
],
"source": [
"import os\n",
"import subprocess\n",
"import time\n",
"\n",
"\n",
"if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
"    del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
"\n",
"# This command installs all the dependencies needed for the llama stack server with the together inference provider\n",
"!uv run --with llama-stack llama stack build --distro starter --image-type venv\n",
"\n",
"\n",
"def run_llama_stack_server_background():\n",
"    log_file = open(\"llama_stack_server.log\", \"w\")\n",
"    process = subprocess.Popen(\n",
"        \"uv run --with llama-stack llama stack run starter --image-type venv\",\n",
"        shell=True,\n",
"        stdout=log_file,\n",
"        stderr=log_file,\n",
"        text=True,\n",
"    )\n",
"\n",
"    print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
"    return process\n",
"\n",
"\n",
"def wait_for_server_to_start():\n",
"    import requests\n",
"    from requests.exceptions import ConnectionError\n",
"\n",
"    url = \"http://0.0.0.0:8321/v1/health\"\n",
"    max_retries = 30\n",
"    retry_interval = 2\n",
"\n",
"    print(\"Waiting for server to start\", end=\"\")\n",
"    for _ in range(max_retries):\n",
"        try:\n",
"            response = requests.get(url)\n",
"            if response.status_code == 200:\n",
"                print(\"\\nServer is ready!\")\n",
"                return True\n",
"        except ConnectionError:\n",
"            print(\".\", end=\"\", flush=True)\n",
"        time.sleep(retry_interval)\n",
"\n",
"    print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
"    return False\n",
"\n",
"\n",
"# Use this helper if needed to kill the server\n",
"def kill_llama_stack_server():\n",
"    # Kill any existing llama stack server processes\n",
"    os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n",
"\n",
"\n",
"server_process = run_llama_stack_server_background()\n",
"assert wait_for_server_to_start()"
]
},
{
"cell_type": "markdown",
"id": "0j5hag7l9x89",
"metadata": {
"id": "0j5hag7l9x89"
},
"source": [
"### 3. Initialize LlamaStack Client\n",
"\n",
"Create a client connection to the LlamaStack server, passing the Together AI API key as provider data.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "ab4eff97-4565-4c73-b1b3-0020a4c7e2a5",
"metadata": {
"id": "ab4eff97-4565-4c73-b1b3-0020a4c7e2a5"
},
"outputs": [],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"\n",
"client = LlamaStackClient(\n",
"    base_url=\"http://0.0.0.0:8321\",\n",
"    provider_data={\"together_api_key\": os.environ[\"TOGETHER_API_KEY\"]},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "vwhexjy1e8o",
"metadata": {
"id": "vwhexjy1e8o"
},
"source": [
"#### Explore Available Models and Safety Features\n",
"\n",
"Check what models and safety shields are available through your LlamaStack instance."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "880443ef-ac3c-48b1-a80a-7dab5b25ac61",
"metadata": {
"id": "880443ef-ac3c-48b1-a80a-7dab5b25ac61",
"outputId": "0604e931-e280-44db-bce5-38373c0cbea8",
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: GET http://0.0.0.0:8321/v1/models \"HTTP/1.1 200 OK\"\n",
"INFO:httpx:HTTP Request: GET http://0.0.0.0:8321/v1/shields \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Available models:\n",
"- fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct\n",
"- fireworks/accounts/fireworks/models/llama-v3p1-70b-instruct\n",
"- fireworks/accounts/fireworks/models/llama-v3p1-405b-instruct\n",
"- fireworks/accounts/fireworks/models/llama-v3p2-3b-instruct\n",
"- fireworks/accounts/fireworks/models/llama-v3p2-11b-vision-instruct\n",
"- fireworks/accounts/fireworks/models/llama-v3p2-90b-vision-instruct\n",
"- fireworks/accounts/fireworks/models/llama-v3p3-70b-instruct\n",
"- fireworks/accounts/fireworks/models/llama4-scout-instruct-basic\n",
"- fireworks/accounts/fireworks/models/llama4-maverick-instruct-basic\n",
"- fireworks/nomic-ai/nomic-embed-text-v1.5\n",
"- fireworks/accounts/fireworks/models/llama-guard-3-8b\n",
"- fireworks/accounts/fireworks/models/llama-guard-3-11b-vision\n",
"- together/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo\n",
"- together/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo\n",
"- together/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo\n",
"- together/meta-llama/Llama-3.2-3B-Instruct-Turbo\n",
"- together/meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo\n",
"- together/meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo\n",
"- together/meta-llama/Llama-3.3-70B-Instruct-Turbo\n",
"- together/togethercomputer/m2-bert-80M-8k-retrieval\n",
"- together/togethercomputer/m2-bert-80M-32k-retrieval\n",
"- together/meta-llama/Llama-4-Scout-17B-16E-Instruct\n",
"- together/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8\n",
"- together/meta-llama/Llama-Guard-3-8B\n",
"- together/meta-llama/Llama-Guard-3-11B-Vision-Turbo\n",
"- bedrock/meta.llama3-1-8b-instruct-v1:0\n",
"- bedrock/meta.llama3-1-70b-instruct-v1:0\n",
"- bedrock/meta.llama3-1-405b-instruct-v1:0\n",
"- openai/gpt-3.5-turbo-0125\n",
"- openai/gpt-3.5-turbo\n",
"- openai/gpt-3.5-turbo-instruct\n",
"- openai/gpt-4\n",
"- openai/gpt-4-turbo\n",
"- openai/gpt-4o\n",
"- openai/gpt-4o-2024-08-06\n",
"- openai/gpt-4o-mini\n",
"- openai/gpt-4o-audio-preview\n",
"- openai/chatgpt-4o-latest\n",
"- openai/o1\n",
"- openai/o1-mini\n",
"- openai/o3-mini\n",
"- openai/o4-mini\n",
"- openai/text-embedding-3-small\n",
"- openai/text-embedding-3-large\n",
"- anthropic/claude-3-5-sonnet-latest\n",
"- anthropic/claude-3-7-sonnet-latest\n",
"- anthropic/claude-3-5-haiku-latest\n",
"- anthropic/voyage-3\n",
"- anthropic/voyage-3-lite\n",
"- anthropic/voyage-code-3\n",
"- gemini/gemini-1.5-flash\n",
"- gemini/gemini-1.5-pro\n",
"- gemini/gemini-2.0-flash\n",
"- gemini/gemini-2.0-flash-lite\n",
"- gemini/gemini-2.5-flash\n",
"- gemini/gemini-2.5-flash-lite\n",
"- gemini/gemini-2.5-pro\n",
"- gemini/text-embedding-004\n",
"- groq/llama3-8b-8192\n",
"- groq/llama-3.1-8b-instant\n",
"- groq/llama3-70b-8192\n",
"- groq/llama-3.3-70b-versatile\n",
"- groq/llama-3.2-3b-preview\n",
"- groq/meta-llama/llama-4-scout-17b-16e-instruct\n",
"- groq/meta-llama/llama-4-maverick-17b-128e-instruct\n",
"- sambanova/Meta-Llama-3.1-8B-Instruct\n",
"- sambanova/Meta-Llama-3.3-70B-Instruct\n",
"- sambanova/Llama-4-Maverick-17B-128E-Instruct\n",
"- sentence-transformers/all-MiniLM-L6-v2\n",
"----\n",
"Available shields (safety models):\n",
"----\n"
]
}
],
"source": [
"print(\"Available models:\")\n",
"for m in client.models.list():\n",
"    print(f\"- {m.identifier}\")\n",
"\n",
"print(\"----\")\n",
"print(\"Available shields (safety models):\")\n",
"for s in client.shields.list():\n",
"    print(s.identifier)\n",
"print(\"----\")"
]
},
{
"cell_type": "markdown",
"id": "b0f28603-3207-4157-b731-638d93cd82b5",
"metadata": {
"id": "b0f28603-3207-4157-b731-638d93cd82b5"
},
"source": [
"### 4. Vector Database Setup\n",
"\n",
"#### Register a Vector Database\n",
"\n",
"Create a FAISS vector database for storing document embeddings:\n",
"\n",
"- **Vector DB ID**: Unique identifier for the database\n",
"- **Provider**: FAISS (Facebook AI Similarity Search)\n",
"- **Embedding Model**: Sentence Transformers model for text embeddings\n",
"- **Dimensions**: 384-dimensional embeddings"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0f241d81-19a7-451f-ac4e-2869a29300d1",
"metadata": {
"id": "0f241d81-19a7-451f-ac4e-2869a29300d1",
"outputId": "b2512715-a9e1-431e-88d4-378165a8ff8b"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n",
"INFO:httpx:HTTP Request: GET http://0.0.0.0:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Registered new vector DB: VectorDBRegisterResponse(embedding_dimension=384, embedding_model='sentence-transformers/all-MiniLM-L6-v2', identifier='acme_docs', provider_id='faiss', type='vector_db', provider_resource_id='acme_docs_v2', vector_db_name=None, owner=None, source='via_register_api')\n",
"Existing vector DBs: [VectorDBListResponseItem(embedding_dimension=384, embedding_model='sentence-transformers/all-MiniLM-L6-v2', identifier='acme_docs', provider_id='faiss', type='vector_db', provider_resource_id='acme_docs_v2', vector_db_name=None)]\n"
]
}
],
"source": [
"# Register a new clean vector database\n",
"vector_db = client.vector_dbs.register(\n",
"    vector_db_id=\"acme_docs\",  # Use a new unique name\n",
"    provider_id=\"faiss\",\n",
"    provider_vector_db_id=\"acme_docs_v2\",\n",
"    embedding_model=\"sentence-transformers/all-MiniLM-L6-v2\",\n",
"    embedding_dimension=384,\n",
")\n",
"print(\"Registered new vector DB:\", vector_db)\n",
"\n",
"# List all registered vector databases\n",
"dbs = client.vector_dbs.list()\n",
"print(\"Existing vector DBs:\", dbs)"
]
},
{
"cell_type": "markdown",
"id": "pcgjqzfr3eo",
"metadata": {
"id": "pcgjqzfr3eo"
},
"source": [
"#### Prepare Sample Documents\n",
"\n",
"Create Llama Stack `Chunk` objects for the FAISS vector store."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "5a0a6619-c9fb-4938-8ff3-f84304eed91e",
"metadata": {
"id": "5a0a6619-c9fb-4938-8ff3-f84304eed91e"
},
"outputs": [],
"source": [
"from llama_stack_client.types.vector_io_insert_params import Chunk\n",
"\n",
"docs = [\n",
"    (\"Acme ships globally in 3-5 business days.\", {\"title\": \"Shipping Policy\"}),\n",
"    (\"Returns are accepted within 30 days of purchase.\", {\"title\": \"Returns Policy\"}),\n",
"    (\"Support is available 24/7 via chat and email.\", {\"title\": \"Support\"}),\n",
"]\n",
"\n",
"# Convert to Chunk objects\n",
"chunks = []\n",
"for content, metadata in docs:\n",
"    # Transform metadata to required format with document_id from title\n",
"    metadata = {\"document_id\": metadata[\"title\"]}\n",
"    chunk = Chunk(\n",
"        content=content,  # Required[InterleavedContent]\n",
"        metadata=metadata,  # Required[Dict]\n",
"    )\n",
"    chunks.append(chunk)"
]
},
{
"cell_type": "markdown",
"id": "6bg3sm2ko5g",
"metadata": {
"id": "6bg3sm2ko5g"
},
"source": [
"#### Insert Documents into Vector Database\n",
"\n",
"Store the prepared documents in the FAISS vector database. This process:\n",
"1. Generates embeddings for each document\n",
"2. Stores embeddings with metadata\n",
"3. Enables semantic search capabilities"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "0e8740d8-b809-44b9-915f-1e0200e3c3f1",
"metadata": {
"id": "0e8740d8-b809-44b9-915f-1e0200e3c3f1",
"outputId": "ebc275ae-54dd-4566-edb0-f3a067bfcf03"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/insert \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Documents inserted: None\n"
]
}
],
"source": [
"# Insert chunks into FAISS vector store\n",
"\n",
"response = client.vector_io.insert(vector_db_id=\"acme_docs\", chunks=chunks)\n",
"print(\"Documents inserted:\", response)"
]
},
{
"cell_type": "markdown",
"id": "9061tmi1zpq",
"metadata": {
"id": "9061tmi1zpq"
},
"source": [
"#### Test Vector Search\n",
"\n",
"Query the vector database to verify it's working correctly. This performs semantic search to find relevant documents based on the query."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "4a5e010c-eeeb-4020-a957-74d6d1cba342",
"metadata": {
"id": "4a5e010c-eeeb-4020-a957-74d6d1cba342",
"outputId": "14e1fde5-38ae-4532-b53b-4a2970c09352"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"metadata : {'document_id': 'Shipping Policy'}\n",
"content : Acme ships globally in 3-5 business days.\n",
"metadata : {'document_id': 'Returns Policy'}\n",
"content : Returns are accepted within 30 days of purchase.\n",
"metadata : {'document_id': 'Support'}\n",
"content : Support is available 24/7 via chat and email.\n"
]
}
],
"source": [
"# Query chunks from FAISS vector store\n",
"\n",
"query_chunk_response = client.vector_io.query(\n",
"    vector_db_id=\"acme_docs\",\n",
"    query=\"How long does Acme take to ship orders?\",\n",
")\n",
"for chunk in query_chunk_response.chunks:\n",
"    print(\"metadata\", \":\", chunk.metadata)\n",
"    print(\"content\", \":\", chunk.content)"
]
},
{
"cell_type": "markdown",
"id": "usne6mbspms",
"metadata": {
"id": "usne6mbspms"
},
"source": [
"### 5. CrewAI Integration\n",
"\n",
"#### Configure CrewAI with LlamaStack\n",
"\n",
"Set up CrewAI to use LlamaStack's OpenAI-compatible API:\n",
"\n",
"- **Base URL**: Points to LlamaStack's OpenAI endpoint\n",
"- **Headers**: Include Together AI API key for model access\n",
"- **Model**: Use the Meta Llama 3.3 70B model via Together AI"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c378bd10-09c2-417c-bdfc-1e0a2dd19084",
"metadata": {
"id": "c378bd10-09c2-417c-bdfc-1e0a2dd19084",
"outputId": "f7db1a39-097e-46db-ddef-e309930a4564"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: GET https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json \"HTTP/1.1 200 OK\"\n"
]
}
],
"source": [
"import os\n",
"from crewai.llm import LLM\n",
"\n",
"# Point the CrewAI LLM class at the LlamaStack server\n",
"\n",
"llamastack_llm = LLM(\n",
"    model=\"openai/together/meta-llama/Llama-3.3-70B-Instruct-Turbo\",  # an OpenAI-API-compatible model\n",
"    base_url=\"http://localhost:8321/v1/openai/v1\",\n",
"    api_key=os.getenv(\"OPENAI_API_KEY\", \"dummy\"),\n",
")"
]
},
{
"cell_type": "markdown",
"id": "5a4ddpcuk3l",
"metadata": {
"id": "5a4ddpcuk3l"
},
"source": [
"#### Test LLM Connection\n",
"\n",
"Verify that the CrewAI LLM can successfully communicate with the LlamaStack server."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "f88ffb5a-657b-4916-9375-c6ddc156c25e",
"metadata": {
"id": "f88ffb5a-657b-4916-9375-c6ddc156c25e",
"outputId": "f48443dc-19d2-440e-a24a-4a8fb8ab4725"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[92m10:30:35 - LiteLLM:INFO\u001b[0m: utils.py:3258 - \n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:LiteLLM:\n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:30:41 - LiteLLM:INFO\u001b[0m: utils.py:1260 - Wrapper: Completed Call, calling success_handler\n",
"INFO:LiteLLM:Wrapper: Completed Call, calling success_handler\n"
]
},
{
"data": {
"text/plain": [
"\"In the Andes' gentle breeze, a llama's soft eyes gaze with peaceful ease, its fur a warm and fuzzy tease. With steps both gentle and serene, the llama roams, a symbol of calm, its beauty truly unseen.\""
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Test the LLM with a simple message\n",
"messages = [\n",
"    {\"role\": \"system\", \"content\": \"You are a friendly assistant.\"},\n",
"    {\"role\": \"user\", \"content\": \"Write a two-sentence poem about llama.\"},\n",
"]\n",
"llamastack_llm.call(messages)"
]
},
{
"cell_type": "markdown",
"id": "5f478686-aa7b-4631-a737-c2ea3c65a7c8",
"metadata": {
"id": "5f478686-aa7b-4631-a737-c2ea3c65a7c8"
},
"source": [
"#### Create CrewAI Custom Tool\n",
"\n",
"Define a custom CrewAI tool, `LlamaStackRAGTool`, to encapsulate the logic for querying the LlamaStack vector database. This tool will be used by the CrewAI agent to perform retrieval during the RAG process.\n",
"\n",
"- **Input Schema**: Defines the expected input parameters for the tool, such as the user query, the vector database ID, and optional parameters like `top_k` and `score_threshold`.\n",
"- **Tool Logic**: Implements the `_run` method, which takes the user query and vector database ID, calls the LlamaStack client's `vector_io.query` method, and formats the retrieved documents into a human-readable string for the LLM to use as context."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "08de540f-ed47-405a-a9c5-16505f4c88c8",
"metadata": {
"id": "08de540f-ed47-405a-a9c5-16505f4c88c8"
},
"outputs": [],
"source": [
"from crewai.tools import BaseTool\n",
"from typing import Any, List, Optional, Type\n",
"from pydantic import BaseModel, Field\n",
"\n",
"# ---------- 1. Input schema ----------\n",
"class RAGToolInput(BaseModel):\n",
"    \"\"\"Input schema for LlamaStackRAGTool.\"\"\"\n",
"    query: str = Field(..., description=\"The user query for RAG search\")\n",
"    vector_db_id: str = Field(\n",
"        ...,\n",
"        description=\"ID of the vector database to search inside the Llama-Stack server\",\n",
"    )\n",
"    top_k: Optional[int] = Field(\n",
"        default=5,\n",
"        description=\"How many documents to return\",\n",
"    )\n",
"    score_threshold: Optional[float] = Field(\n",
"        default=None,\n",
"        description=\"Optional similarity score cut-off (0-1).\",\n",
"    )\n",
"\n",
"# ---------- 2. The tool ----------\n",
"class LlamaStackRAGTool(BaseTool):\n",
"    name: str = \"Llama Stack RAG tool\"\n",
"    description: str = (\n",
"        \"This tool calls a Llama-Stack endpoint for retrieval-augmented generation. \"\n",
"        \"It takes a natural-language query and returns the most relevant documents.\"\n",
"    )\n",
"    args_schema: Type[BaseModel] = RAGToolInput\n",
"    client: Any\n",
"    vector_db_id: str = \"\"\n",
"    top_k: int = 5\n",
"    score_threshold: Optional[float] = None\n",
"\n",
"    def _run(self, **kwargs: Any) -> str:\n",
"        # 1. Resolve parameters (use instance defaults when not supplied)\n",
"        print(kwargs)\n",
"        query: str = kwargs.get(\"query\")  # Required – schema enforces presence\n",
"        vector_db_id: str = kwargs.get(\"vector_db_id\", self.vector_db_id)\n",
"        top_k: int = kwargs.get(\"top_k\", self.top_k)\n",
"        score_threshold: float | None = kwargs.get(\n",
"            \"score_threshold\", self.score_threshold\n",
"        )\n",
"        if vector_db_id == \"\":\n",
"            print(\"vector_db_id is empty, please specify which vector_db to search\")\n",
"            return \"No documents found.\"\n",
"        # 2. Issue request to Llama-Stack\n",
"        response = self.client.vector_io.query(\n",
"            vector_db_id=vector_db_id,\n",
"            query=query,\n",
"            params={\"max_chunks\": top_k, \"score_threshold\": score_threshold},\n",
"        )\n",
"\n",
"        # 3. Massage results into a single human-readable string\n",
"        if not response.chunks:\n",
"            return \"No documents found.\"\n",
"\n",
"        docs: List[str] = []\n",
"        for chunk in response.chunks:\n",
"            docs.append(f\"metadata: {chunk.metadata}, content: {chunk.content}\")\n",
"        return \"\\n\".join(docs)\n"
]
},
{
"cell_type": "markdown",
"id": "0xh0jg6a0l4a",
"metadata": {
"id": "0xh0jg6a0l4a"
},
"source": [
"### 6. Building the RAG tool\n",
"\n",
"#### Create a Complete RAG Pipeline\n",
"\n",
"Construct a CrewAI pipeline that orchestrates the RAG process. This pipeline includes:\n",
"\n",
"1. **Agent Definition**: Defining a CrewAI agent with a specific role (`RAG assistant`), goal, and backstory, equipped with the LlamaStack LLM and the custom RAG tool.\n",
"2. **Task Definition**: Defining a CrewAI task for the agent to perform. The task description includes placeholders for the user query and vector database ID, which will be provided during execution. The task's expected output is an answer to the question based on the retrieved context.\n",
"3. **Crew Definition**: Creating a CrewAI `Crew` object with the defined task and agent. This crew represents the complete RAG pipeline.\n",
"\n",
"**CrewAI workflow**:\n",
"`User Query → CrewAI Task → Agent invokes LlamaStackRAGTool → LlamaStack Vector Search → Retrieved Context → Agent uses Context + Question → LLM Generation → Final Response`"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "9684427d-dcc7-4544-9af5-8b110d014c42",
"metadata": {
"id": "9684427d-dcc7-4544-9af5-8b110d014c42"
},
"outputs": [],
"source": [
"from crewai import Agent, Crew, Task, Process\n",
"\n",
"# ---- 3. Define the agent -----------------------------------------\n",
"agent = Agent(\n",
"    role=\"RAG assistant\",\n",
"    goal=\"Answer user's question with provided context\",\n",
"    backstory=\"You are an experienced search assistant specializing in finding relevant information from documentation and vector_db to answer user questions accurately.\",\n",
"    allow_delegation=False,\n",
"    llm=llamastack_llm,\n",
"    tools=[LlamaStackRAGTool(client=client)],\n",
")\n",
"\n",
"# ---- 4. Wrap everything in a Crew task ---------------------------\n",
"task = Task(\n",
"    description=\"Answer the following questions: {query}, using the RAG_tool to search the provided vector_db_id {vector_db_id} if needed\",\n",
"    expected_output=\"An answer to the question with provided context\",\n",
"    agent=agent,\n",
")\n",
"crew = Crew(tasks=[task], verbose=True)\n"
]
},
{
"cell_type": "markdown",
"id": "0onu6rhphlra",
"metadata": {
"id": "0onu6rhphlra"
},
"source": [
"### 7. Testing the RAG System\n",
"\n",
"#### Example 1: Shipping Query"
]
},
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"id": "03322188-9509-446a-a4a8-ce3bb83ec87c",
|
||
"metadata": {
|
||
"colab": {
|
||
"referenced_widgets": [
|
||
"39eb50b3c96244cf9c82043c0a359d8a"
|
||
]
|
||
},
|
||
"id": "03322188-9509-446a-a4a8-ce3bb83ec87c",
|
||
"outputId": "ddc3a70d-c0f3-484f-8469-9362e44d8831"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080\">╭──────────────────────────────────────────── Crew Execution Started ─────────────────────────────────────────────╮</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">Crew Execution Started</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Name: </span><span style=\"color: #008080; text-decoration-color: #008080\">crew</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">ID: </span><span style=\"color: #008080; text-decoration-color: #008080\">146be674-7521-4db6-bb70-287b22aa21aa</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Tool Args: </span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[36m╭─\u001b[0m\u001b[36m───────────────────────────────────────────\u001b[0m\u001b[36m Crew Execution Started \u001b[0m\u001b[36m────────────────────────────────────────────\u001b[0m\u001b[36m─╮\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[1;36mCrew Execution Started\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[37mName: \u001b[0m\u001b[36mcrew\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[37mID: \u001b[0m\u001b[36m146be674-7521-4db6-bb70-287b22aa21aa\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[37mTool Args: \u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0fd1807480b443948be37ef922f4ad0f",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[92m10:31:29 - LiteLLM:INFO\u001b[0m: utils.py:3258 - \n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:LiteLLM:\n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:31:30 - LiteLLM:INFO\u001b[0m: utils.py:1260 - Wrapper: Completed Call, calling success_handler\n",
"INFO:LiteLLM:Wrapper: Completed Call, calling success_handler\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">{'query': 'How long does shipping take?', 'vector_db_id': 'acme_docs', 'top_k': 5, 'score_threshold': 0.5}\n",
"</pre>\n"
],
"text/plain": [
"{'query': 'How long does shipping take?', 'vector_db_id': 'acme_docs', 'top_k': 5, 'score_threshold': 0.5}\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:31:31 - LiteLLM:INFO\u001b[0m: utils.py:3258 - \n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:LiteLLM:\n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:31:33 - LiteLLM:INFO\u001b[0m: utils.py:1260 - Wrapper: Completed Call, calling success_handler\n",
"INFO:LiteLLM:Wrapper: Completed Call, calling success_handler\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
],
"text/plain": []
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008000; text-decoration-color: #008000\">╭──────────────────────────────────────────────── Task Completion ────────────────────────────────────────────────╮</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\">Task Completed</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Name: </span><span style=\"color: #008000; text-decoration-color: #008000\">b1362fb4-fe9b-479d-a01d-18bf38406780</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Agent: </span><span style=\"color: #008000; text-decoration-color: #008000\">RAG assistant</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Tool Args: </span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[32m╭─\u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m Task Completion \u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m─╮\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[1;32mTask Completed\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mName: \u001b[0m\u001b[32mb1362fb4-fe9b-479d-a01d-18bf38406780\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mAgent: \u001b[0m\u001b[32mRAG assistant\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mTool Args: \u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008000; text-decoration-color: #008000\">╭──────────────────────────────────────────────── Crew Completion ────────────────────────────────────────────────╮</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\">Crew Execution Completed</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Name: </span><span style=\"color: #008000; text-decoration-color: #008000\">crew</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">ID: </span><span style=\"color: #008000; text-decoration-color: #008000\">146be674-7521-4db6-bb70-287b22aa21aa</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Tool Args: </span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Final Output: Acme ships globally in 3-5 business days. Returns are accepted within 30 days of purchase. </span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Support is available 24/7 via chat and email.</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[32m╭─\u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m Crew Completion \u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m─╮\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[1;32mCrew Execution Completed\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mName: \u001b[0m\u001b[32mcrew\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mID: \u001b[0m\u001b[32m146be674-7521-4db6-bb70-287b22aa21aa\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mTool Args: \u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mFinal Output: Acme ships globally in 3-5 business days. Returns are accepted within 30 days of purchase. \u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mSupport is available 24/7 via chat and email.\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"❓ How long does shipping take?\n",
"💡 Acme ships globally in 3-5 business days. Returns are accepted within 30 days of purchase. Support is available 24/7 via chat and email.\n"
]
}
],
"source": [
"query = \"How long does shipping take?\"\n",
"response = crew.kickoff(inputs={\"query\": query,\"vector_db_id\": \"acme_docs\"})\n",
"print(\"❓\", query)\n",
"print(\"💡\", response)"
]
},
{
"cell_type": "markdown",
"id": "b7krhqj88ku",
"metadata": {
"id": "b7krhqj88ku"
},
"source": [
"#### Example 2: Returns Policy Query"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "61995550-bb0b-46a8-a5d0-023207475d60",
"metadata": {
"colab": {
"referenced_widgets": [
"1d575307e41d46f7943746d4380d08bb"
]
},
"id": "61995550-bb0b-46a8-a5d0-023207475d60",
"outputId": "a039ab06-a541-48f9-a66d-6cef17911814"
},
"outputs": [
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080\">╭──────────────────────────────────────────── Crew Execution Started ─────────────────────────────────────────────╮</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">Crew Execution Started</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Name: </span><span style=\"color: #008080; text-decoration-color: #008080\">crew</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">ID: </span><span style=\"color: #008080; text-decoration-color: #008080\">146be674-7521-4db6-bb70-287b22aa21aa</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Tool Args: </span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">│</span> <span style=\"color: #008080; text-decoration-color: #008080\">│</span>\n",
"<span style=\"color: #008080; text-decoration-color: #008080\">╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[36m╭─\u001b[0m\u001b[36m───────────────────────────────────────────\u001b[0m\u001b[36m Crew Execution Started \u001b[0m\u001b[36m────────────────────────────────────────────\u001b[0m\u001b[36m─╮\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[1;36mCrew Execution Started\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[37mName: \u001b[0m\u001b[36mcrew\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[37mID: \u001b[0m\u001b[36m146be674-7521-4db6-bb70-287b22aa21aa\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[37mTool Args: \u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m│\u001b[0m \u001b[36m│\u001b[0m\n",
"\u001b[36m╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "e1e2dc3abcd745fa80b32ef32b264396",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[92m10:31:38 - LiteLLM:INFO\u001b[0m: utils.py:3258 - \n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:LiteLLM:\n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:31:39 - LiteLLM:INFO\u001b[0m: utils.py:1260 - Wrapper: Completed Call, calling success_handler\n",
"INFO:LiteLLM:Wrapper: Completed Call, calling success_handler\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">{'query': 'return policy after 40 days', 'vector_db_id': 'acme_docs', 'top_k': 1, 'score_threshold': 0.5}\n",
"</pre>\n"
],
"text/plain": [
"{'query': 'return policy after 40 days', 'vector_db_id': 'acme_docs', 'top_k': 1, 'score_threshold': 0.5}\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:31:39 - LiteLLM:INFO\u001b[0m: utils.py:3258 - \n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:LiteLLM:\n",
"LiteLLM completion() model= together/meta-llama/Llama-3.3-70B-Instruct-Turbo; provider = openai\n",
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
"\u001b[92m10:31:41 - LiteLLM:INFO\u001b[0m: utils.py:1260 - Wrapper: Completed Call, calling success_handler\n",
"INFO:LiteLLM:Wrapper: Completed Call, calling success_handler\n"
]
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
],
"text/plain": []
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008000; text-decoration-color: #008000\">╭──────────────────────────────────────────────── Task Completion ────────────────────────────────────────────────╮</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\">Task Completed</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Name: </span><span style=\"color: #008000; text-decoration-color: #008000\">b1362fb4-fe9b-479d-a01d-18bf38406780</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Agent: </span><span style=\"color: #008000; text-decoration-color: #008000\">RAG assistant</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Tool Args: </span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[32m╭─\u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m Task Completion \u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m─╮\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[1;32mTask Completed\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mName: \u001b[0m\u001b[32mb1362fb4-fe9b-479d-a01d-18bf38406780\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mAgent: \u001b[0m\u001b[32mRAG assistant\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mTool Args: \u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008000; text-decoration-color: #008000\">╭──────────────────────────────────────────────── Crew Completion ────────────────────────────────────────────────╮</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000; font-weight: bold\">Crew Execution Completed</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Name: </span><span style=\"color: #008000; text-decoration-color: #008000\">crew</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">ID: </span><span style=\"color: #008000; text-decoration-color: #008000\">146be674-7521-4db6-bb70-287b22aa21aa</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Tool Args: </span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">Final Output: Returns are accepted within 30 days of purchase. Therefore, you cannot return a product after </span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #c0c0c0; text-decoration-color: #c0c0c0\">40 days.</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">│</span> <span style=\"color: #008000; text-decoration-color: #008000\">│</span>\n",
"<span style=\"color: #008000; text-decoration-color: #008000\">╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯</span>\n",
"</pre>\n"
],
"text/plain": [
"\u001b[32m╭─\u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m Crew Completion \u001b[0m\u001b[32m───────────────────────────────────────────────\u001b[0m\u001b[32m─╮\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[1;32mCrew Execution Completed\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mName: \u001b[0m\u001b[32mcrew\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mID: \u001b[0m\u001b[32m146be674-7521-4db6-bb70-287b22aa21aa\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mTool Args: \u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37mFinal Output: Returns are accepted within 30 days of purchase. Therefore, you cannot return a product after \u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[37m40 days.\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m│\u001b[0m \u001b[32m│\u001b[0m\n",
"\u001b[32m╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯\u001b[0m\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
"</pre>\n"
],
"text/plain": [
"\n"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"❓ Can I return a product after 40 days?\n",
"💡 Returns are accepted within 30 days of purchase. Therefore, you cannot return a product after 40 days.\n"
]
}
],
"source": [
"query = \"Can I return a product after 40 days?\"\n",
"response = crew.kickoff(inputs={\"query\": query,\"vector_db_id\": \"acme_docs\"})\n",
"print(\"❓\", query)\n",
"print(\"💡\", response)"
]
},
{
"cell_type": "markdown",
"id": "h4w24fadvjs",
"metadata": {
"id": "h4w24fadvjs"
},
"source": [
"---\n",
"\n",
"We have successfully built a RAG system that combines:\n",
"\n",
"- **LlamaStack** for infrastructure (LLM serving + vector database)\n",
"- **CrewAI** for orchestration (agents, tasks, and tools)\n",
"- **Together AI** for high-quality language models\n",
"\n",
"### Key Benefits\n",
"\n",
"1. **Unified Infrastructure**: A single server for LLMs and vector databases simplifies deployment and management.\n",
"2. **OpenAI Compatibility**: Enables easy integration with existing libraries and frameworks that support the OpenAI API standard, such as CrewAI.\n",
"3. **Multi-Provider Support**: Offers the flexibility to switch between different LLM and embedding providers without altering the core application logic.\n",
"4. **Production Ready**: LlamaStack includes features designed for production environments, such as built-in safety shields and monitoring capabilities.\n",
"\n",
"### Next Steps\n",
"\n",
"- Implement more sophisticated document processing and chunking techniques.\n",
"- Add conversation memory within the CrewAI pipeline for multi-turn interactions.\n",
"- Integrate advanced safety filtering and monitoring using LlamaStack's features.\n",
"- Scale the system to handle larger document collections and higher query volumes.\n",
"- Integrate the RAG system with web frameworks like FastAPI or Streamlit to build interactive applications.\n",
"\n",
"---\n",
"\n",
"##### 🔧 Cleanup\n",
"\n",
"Remember to stop the LlamaStack server process when you are finished to free up resources. You can use the `kill_llama_stack_server()` helper function defined earlier in the notebook."
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}