{
"cells": [
{
"cell_type": "markdown",
"id": "d2bf5275",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/zero_to_hero_guide/02_Prompt_Engineering101.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"id": "cd96f85a",
"metadata": {},
"source": [
"# Prompt Engineering with Llama Stack\n",
"\n",
"Prompt engineering is using natural language to produce a desired response from a large language model (LLM).\n",
"\n",
"This interactive guide covers prompt engineering & best practices with Llama 3.2 and Llama Stack.\n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html)."
]
},
{
"cell_type": "markdown",
"id": "3e1ef1c9",
"metadata": {},
"source": [
"## Few-Shot Inference for LLMs\n",
"\n",
"This guide provides instructions on how to use Llama Stacks `chat_completion` API with a few-shot learning approach to enhance text generation. Few-shot examples enable the model to recognize patterns by providing labeled prompts, allowing it to complete tasks based on minimal prior examples.\n",
"\n",
"### Overview\n",
"\n",
"Few-shot learning provides the model with multiple examples of input-output pairs. This is particularly useful for guiding the model's behavior in specific tasks, helping it understand the desired completion format and content based on a few sample interactions.\n",
"\n",
"### Implementation"
]
},
{
"cell_type": "markdown",
"id": "e065af43",
"metadata": {},
"source": [
"### 0. Configuration\n",
"Set up your connection parameters:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "df35d1e2",
"metadata": {},
"outputs": [],
"source": [
"HOST = \"localhost\" # Replace with your host\n",
"PORT = 5000 # Replace with your port"
]
},
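{
"cell_type": "markdown",
"id": "f3a9c1d0",
"metadata": {},
"source": [
"If you prefer not to hard-code connection details, you can read them from environment variables instead. This is a minimal sketch; the variable names `LLAMA_STACK_HOST` and `LLAMA_STACK_PORT` are illustrative, not anything Llama Stack requires:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e7b2d4c1",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"# Fall back to the defaults above when the variables are not set.\n",
"HOST = os.environ.get('LLAMA_STACK_HOST', 'localhost')\n",
"PORT = int(os.environ.get('LLAMA_STACK_PORT', '5000'))"
]
},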
{
"cell_type": "markdown",
"id": "a7a25a7e",
"metadata": {},
"source": [
"#### 1. Initialize the Client\n",
"\n",
"Begin by setting up the `LlamaStackClient` to connect to the inference endpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c2a0e359",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"\n",
"client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')"
]
},
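{
"cell_type": "markdown",
"id": "b8c3e5f2",
"metadata": {},
"source": [
"As an optional sanity check, you can ask the server which models it is serving; this assumes your distribution exposes the models API, and the printed `identifier` values are what the examples below pass as `model`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9d4f6a3",
"metadata": {},
"outputs": [],
"source": [
"# List the models registered with this Llama Stack distribution.\n",
"for model in client.models.list():\n",
"    print(model.identifier)"
]
},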
{
"cell_type": "markdown",
"id": "02cdf3f6",
"metadata": {},
"source": [
"#### 2. Define Few-Shot Examples\n",
"\n",
"Construct a series of labeled `UserMessage` and `CompletionMessage` instances to demonstrate the task to the model. Each `UserMessage` represents an input prompt, and each `CompletionMessage` is the desired output. The model uses these examples to infer the appropriate response patterns.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "da140b33",
"metadata": {},
"outputs": [],
"source": [
"few_shot_examples = [\n",
" {\"role\": \"user\", \"content\": 'Have shorter, spear-shaped ears.'},\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Known for their calm nature and used as pack animals in mountainous regions.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Llama!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Has a straight, slender neck and is smaller in size compared to its relative.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Generally taller and more robust, commonly seen as guard animals.'\n",
" }\n",
"]"
]
},
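{
"cell_type": "markdown",
"id": "d5e6f7a8",
"metadata": {},
"source": [
"Plain dictionaries are the most compact way to write these messages. Equivalently, you can use the typed classes from `llama_stack_client.types`; this sketch assumes they accept the same fields as the dictionaries above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a6b7c8d9",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client.types import CompletionMessage, UserMessage\n",
"\n",
"# The first labeled pair from above, expressed with typed message objects.\n",
"typed_examples = [\n",
"    UserMessage(content='Have shorter, spear-shaped ears.', role='user'),\n",
"    CompletionMessage(\n",
"        content=\"That's Alpaca!\",\n",
"        role='assistant',\n",
"        stop_reason='end_of_message',\n",
"        tool_calls=[],\n",
"    ),\n",
"]"
]
},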
{
"cell_type": "markdown",
"id": "6eece9cc",
"metadata": {},
"source": [
"#### Note\n",
"- **Few-Shot Examples**: These examples show the model the correct responses for specific prompts.\n",
"- **CompletionMessage**: This defines the model's expected completion for each prompt.\n"
]
},
{
"cell_type": "markdown",
"id": "5a0de6c7",
"metadata": {},
"source": [
"#### 3. Invoke `chat_completion` with Few-Shot Examples\n",
"\n",
"Use the few-shot examples as the message input for `chat_completion`. The model will use the examples to generate contextually appropriate responses, allowing it to infer and complete new queries in a similar format.\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "8b321089",
"metadata": {},
"outputs": [],
"source": [
"response = client.inference.chat_completion(\n",
" messages=few_shot_examples, model='Llama3.1-8B-Instruct'\n",
")"
]
},
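{
"cell_type": "markdown",
"id": "e2f3a4b5",
"metadata": {},
"source": [
"By default the call returns one completed response. `chat_completion` also accepts `stream=True` to emit tokens as they are generated; the sketch below assumes the `EventLogger` helper is available at this import path for printing the streamed chunks:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4a5b6c7",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client.lib.inference.event_logger import EventLogger\n",
"\n",
"# The same request, streamed chunk by chunk.\n",
"stream = client.inference.chat_completion(\n",
"    messages=few_shot_examples,\n",
"    model='Llama3.2-11B-Vision-Instruct',\n",
"    stream=True,\n",
")\n",
"for log in EventLogger().log(stream):\n",
"    log.print()"
]
},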
{
"cell_type": "markdown",
"id": "063265d2",
"metadata": {},
"source": [
"#### 4. Display the Models Response\n",
"\n",
"The `completion_message` contains the assistants generated content based on the few-shot examples provided. Output this content to see the model's response directly in the console.\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "4ac1ac3e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[36m> Response: That's Llama!\u001b[0m\n"
]
}
],
"source": [
"from termcolor import cprint\n",
"\n",
"cprint(f'> Response: {response.completion_message.content}', 'cyan')"
]
},
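{
"cell_type": "markdown",
"id": "a7b8c9d0",
"metadata": {},
"source": [
"Beyond `content`, the `completion_message` also carries metadata about how generation ended, which can help when debugging prompts:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9c0d1e2",
"metadata": {},
"outputs": [],
"source": [
"# Inspect the other fields on the returned message.\n",
"print(response.completion_message.role)         # 'assistant'\n",
"print(response.completion_message.stop_reason)  # e.g. 'end_of_turn'"
]
},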
{
"cell_type": "markdown",
"id": "d936ab59",
"metadata": {},
"source": [
"### Complete code\n",
"Summing it up, here's the code for few-shot implementation with llama-stack:\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "524189bd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[36m> Response: That's Llama!\u001b[0m\n"
]
}
],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.types import CompletionMessage, UserMessage\n",
"from termcolor import cprint\n",
"\n",
"client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')\n",
"\n",
"response = client.inference.chat_completion(\n",
" messages=[\n",
" {\"role\": \"user\", \"content\": 'Have shorter, spear-shaped ears.'},\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Known for their calm nature and used as pack animals in mountainous regions.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Llama!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Has a straight, slender neck and is smaller in size compared to its relative.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Generally taller and more robust, commonly seen as guard animals.'\n",
" }\n",
"],\n",
" model='Llama3.2-11B-Vision-Instruct',\n",
")\n",
"\n",
"cprint(f'> Response: {response.completion_message.content}', 'cyan')"
]
},
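{
"cell_type": "markdown",
"id": "c1d2e3f4",
"metadata": {},
"source": [
"To reuse the pattern for new inputs, you can wrap the few-shot prefix in a small helper. `classify_animal` below is an illustrative name, not part of the Llama Stack API; it drops the final unanswered query from `few_shot_examples` and appends your own:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d3e4f5a6",
"metadata": {},
"outputs": [],
"source": [
"def classify_animal(description: str) -> str:\n",
"    \"\"\"Append a new description to the labeled pairs and return the model's guess.\"\"\"\n",
"    response = client.inference.chat_completion(\n",
"        # few_shot_examples ends with an unanswered user query, so drop it.\n",
"        messages=few_shot_examples[:-1] + [{'role': 'user', 'content': description}],\n",
"        model='Llama3.2-11B-Vision-Instruct',\n",
"    )\n",
"    return response.completion_message.content\n",
"\n",
"cprint(f\"> {classify_animal('Prized for its soft, fine fleece.')}\", 'cyan')"
]
},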
{
"cell_type": "markdown",
"id": "76d053b8",
"metadata": {},
"source": [
"Thanks for checking out this notebook! \n",
"\n",
"The next one will be a guide on how to chat with images, continue to the notebook [here](./02_Image_Chat101.ipynb). Happy learning!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.15"
}
},
"nbformat": 4,
"nbformat_minor": 5
}