{
"cells": [
{
"cell_type": "markdown",
"id": "d2bf5275",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/02_Prompt_Engineering101.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"id": "cd96f85a",
"metadata": {},
"source": [
"# Prompt Engineering with Llama Stack\n",
"\n",
"Prompt engineering is the practice of crafting natural-language prompts to elicit a desired response from a large language model (LLM).\n",
"\n",
"This interactive guide covers prompt engineering and best practices with Llama 3.2 and Llama Stack.\n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html)."
]
},
{
"cell_type": "markdown",
"id": "3e1ef1c9",
"metadata": {},
"source": [
"## Few-Shot Inference for LLMs\n",
"\n",
"This guide shows how to use Llama Stack’s `chat_completion` API with a few-shot learning approach to guide text generation. By providing a handful of labeled example prompts and responses, you let the model recognize the pattern and complete new tasks with minimal additional instruction.\n",
"\n",
"### Overview\n",
"\n",
"Few-shot learning provides the model with multiple examples of input-output pairs. This is particularly useful for guiding the model's behavior on specific tasks, helping it understand the desired completion format and content from a few sample interactions.\n",
"\n",
"### Implementation"
]
},
{
"cell_type": "markdown",
"id": "e065af43",
"metadata": {},
"source": [
"#### 0. Configuration\n",
"Set up your connection parameters:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "df35d1e2",
"metadata": {},
"outputs": [],
"source": [
"HOST = \"localhost\"  # Replace with your host\n",
"PORT = 5000  # Replace with your port"
]
},
{
"cell_type": "markdown",
"id": "a7a25a7e",
"metadata": {},
"source": [
"#### 1. Initialize the Client\n",
"\n",
"Begin by setting up the `LlamaStackClient` to connect to the inference endpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c2a0e359",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"\n",
"client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')"
]
},
{
"cell_type": "markdown",
"id": "02cdf3f6",
"metadata": {},
"source": [
"#### 2. Define Few-Shot Examples\n",
"\n",
"Construct a series of labeled user and assistant messages (dictionaries with `role` and `content` keys, corresponding to the `UserMessage` and `CompletionMessage` types) to demonstrate the task to the model. Each user message is an input prompt, and each assistant message is the desired output. The model uses these examples to infer the appropriate response pattern.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "da140b33",
"metadata": {},
"outputs": [],
"source": [
"few_shot_examples = [\n",
"    {\"role\": \"user\", \"content\": 'Have shorter, spear-shaped ears.'},\n",
"    {\n",
"        \"role\": \"assistant\",\n",
"        \"content\": \"That's Alpaca!\",\n",
"        \"stop_reason\": 'end_of_message',\n",
"        \"tool_calls\": []\n",
"    },\n",
"    {\n",
"        \"role\": \"user\",\n",
"        \"content\": 'Known for their calm nature and used as pack animals in mountainous regions.'\n",
"    },\n",
"    {\n",
"        \"role\": \"assistant\",\n",
"        \"content\": \"That's Llama!\",\n",
"        \"stop_reason\": 'end_of_message',\n",
"        \"tool_calls\": []\n",
"    },\n",
"    {\n",
"        \"role\": \"user\",\n",
"        \"content\": 'Has a straight, slender neck and is smaller in size compared to its relative.'\n",
"    },\n",
"    {\n",
"        \"role\": \"assistant\",\n",
"        \"content\": \"That's Alpaca!\",\n",
"        \"stop_reason\": 'end_of_message',\n",
"        \"tool_calls\": []\n",
"    },\n",
"    {\n",
"        \"role\": \"user\",\n",
"        \"content\": 'Generally taller and more robust, commonly seen as guard animals.'\n",
"    }\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "6eece9cc",
"metadata": {},
"source": [
"#### Note\n",
"- **Few-shot examples**: The labeled user/assistant pairs show the model the correct response for each kind of prompt.\n",
"- **Assistant messages** (`CompletionMessage`): each one defines the model's expected completion for the preceding prompt.\n"
]
},
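{
"cell_type": "markdown",
"id": "b7f3a1c0",
"metadata": {},
"source": [
"The examples above use plain dictionaries. If you prefer typed objects, the client also exposes `UserMessage` and `CompletionMessage` helpers (they are imported in the complete example at the end of this notebook). The cell below is an optional sketch of what the first user/assistant pair might look like in that form, assuming the helpers accept the same fields as the dictionaries:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c4d2e9f1",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client.types import CompletionMessage, UserMessage\n",
"\n",
"# A hypothetical typed equivalent of the first user/assistant pair above;\n",
"# the field names mirror the dictionary form used in this notebook.\n",
"typed_examples = [\n",
"    UserMessage(role=\"user\", content=\"Have shorter, spear-shaped ears.\"),\n",
"    CompletionMessage(\n",
"        role=\"assistant\",\n",
"        content=\"That's Alpaca!\",\n",
"        stop_reason=\"end_of_message\",\n",
"        tool_calls=[],\n",
"    ),\n",
"]"
]
},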
{
"cell_type": "markdown",
"id": "5a0de6c7",
"metadata": {},
"source": [
"#### 3. Invoke `chat_completion` with Few-Shot Examples\n",
"\n",
"Pass the few-shot examples as the `messages` input to `chat_completion`. The model infers the pattern from the labeled pairs and completes the final, unanswered query in the same format.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8b321089",
"metadata": {},
"outputs": [],
"source": [
"response = client.inference.chat_completion(\n",
|
||
" messages=few_shot_examples, model='Llama3.1-8B-Instruct'\n",
|
||
")"
|
||
]
|
||
},
|
||
{
"cell_type": "markdown",
"id": "063265d2",
"metadata": {},
"source": [
"#### 4. Display the Model’s Response\n",
"\n",
"The `completion_message` contains the assistant’s generated content based on the few-shot examples provided. Print it to see the model's response directly in the notebook.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4ac1ac3e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[36m> Response: That's Llama!\u001b[0m\n"
]
}
],
"source": [
"from termcolor import cprint\n",
"\n",
"cprint(f'> Response: {response.completion_message.content}', 'cyan')"
]
},
{
"cell_type": "markdown",
"id": "d936ab59",
"metadata": {},
"source": [
"### Complete code\n",
"Putting it all together, here's the complete code for few-shot inference with Llama Stack:\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "524189bd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[36m> Response: That's Llama!\u001b[0m\n"
]
}
],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.types import CompletionMessage, UserMessage\n",
"from termcolor import cprint\n",
"\n",
"HOST = \"localhost\"  # Replace with your host\n",
"PORT = 5000  # Replace with your port\n",
"\n",
"client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')\n",
"\n",
"response = client.inference.chat_completion(\n",
"    messages=[\n",
"        {\"role\": \"user\", \"content\": 'Have shorter, spear-shaped ears.'},\n",
"        {\n",
"            \"role\": \"assistant\",\n",
"            \"content\": \"That's Alpaca!\",\n",
"            \"stop_reason\": 'end_of_message',\n",
"            \"tool_calls\": []\n",
"        },\n",
"        {\n",
"            \"role\": \"user\",\n",
"            \"content\": 'Known for their calm nature and used as pack animals in mountainous regions.'\n",
"        },\n",
"        {\n",
"            \"role\": \"assistant\",\n",
"            \"content\": \"That's Llama!\",\n",
"            \"stop_reason\": 'end_of_message',\n",
"            \"tool_calls\": []\n",
"        },\n",
"        {\n",
"            \"role\": \"user\",\n",
"            \"content\": 'Has a straight, slender neck and is smaller in size compared to its relative.'\n",
"        },\n",
"        {\n",
"            \"role\": \"assistant\",\n",
"            \"content\": \"That's Alpaca!\",\n",
"            \"stop_reason\": 'end_of_message',\n",
"            \"tool_calls\": []\n",
"        },\n",
"        {\n",
"            \"role\": \"user\",\n",
"            \"content\": 'Generally taller and more robust, commonly seen as guard animals.'\n",
"        }\n",
"    ],\n",
"    model='Llama3.2-11B-Vision-Instruct',\n",
")\n",
"\n",
"cprint(f'> Response: {response.completion_message.content}', 'cyan')"
]
},
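{
"cell_type": "markdown",
"id": "f2e8c7a9",
"metadata": {},
"source": [
"As an optional next step, the few-shot pattern above can be wrapped in a small helper so any new description can be classified with one call. The sketch below reuses the `client`, `few_shot_examples`, and `cprint` defined earlier in this notebook; the `classify_animal` name and the sample description are illustrative choices, not part of the Llama Stack API:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a5d0b3e",
"metadata": {},
"outputs": [],
"source": [
"def classify_animal(description: str) -> str:\n",
"    \"\"\"Classify a new description by appending it to the answered few-shot pairs.\"\"\"\n",
"    # Drop the final, unanswered prompt from few_shot_examples and substitute the new query.\n",
"    messages = few_shot_examples[:-1] + [{\"role\": \"user\", \"content\": description}]\n",
"    response = client.inference.chat_completion(\n",
"        messages=messages,\n",
"        model='Llama3.2-11B-Vision-Instruct',\n",
"    )\n",
"    return response.completion_message.content\n",
"\n",
"# Example usage with a made-up description:\n",
"cprint(f\"> Response: {classify_animal('Covered in soft fleece and often bred for its fine wool.')}\", 'cyan')"
]
},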
{
"cell_type": "markdown",
"id": "76d053b8",
"metadata": {},
"source": [
"Thanks for checking out this notebook!\n",
"\n",
"The next one is a guide on chatting with images; continue to that notebook [here](./02_Image_Chat101.ipynb). Happy learning!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.15"
}
},
"nbformat": 4,
"nbformat_minor": 5
}