{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "cd96f85a",
   "metadata": {},
   "source": [
    "<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/quickstart/Prompt_Engineering_with_Llama_3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\n",
    "# Prompt Engineering with Llama Stack\n",
    "\n",
    "Prompt engineering is the practice of crafting natural-language prompts to elicit a desired response from a large language model (LLM).\n",
    "\n",
    "This interactive guide covers prompt engineering and best practices with Llama 3.1 and Llama Stack."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e1ef1c9",
   "metadata": {},
   "source": [
    "## Few-Shot Inference for LLMs\n",
    "\n",
    "This guide shows how to use Llama Stack’s `chat_completion` API with a few-shot learning approach to steer text generation. Few-shot examples give the model a handful of labeled prompts and responses, letting it recognize the pattern and complete new queries in the same format.\n",
    "\n",
    "### Overview\n",
    "\n",
    "Few-shot learning provides the model with multiple examples of input-output pairs. This is particularly useful for guiding the model's behavior in specific tasks, helping it understand the desired completion format and content based on a few sample interactions.\n",
    "\n",
    "### Implementation"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a7a25a7e",
   "metadata": {},
   "source": [
    "#### 1. Initialize the Client\n",
    "\n",
    "Begin by setting up the `LlamaStackClient` to connect to the inference endpoint.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2a0e359",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_stack_client import LlamaStackClient\n",
    "\n",
    "client = LlamaStackClient(base_url='http://localhost:5000')"
   ]
  },
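  {
   "cell_type": "markdown",
   "id": "f0a1b2c3",
   "metadata": {},
   "source": [
    "Optionally, you can avoid hard-coding the endpoint. The sketch below (an addition to this guide, not part of the original) reads the URL from an environment variable; `LLAMA_STACK_BASE_URL` is an assumed variable name chosen for this example, with the local endpoint above as the fallback.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a9e4d5f6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# 'LLAMA_STACK_BASE_URL' is a hypothetical variable name used for this sketch;\n",
    "# when it is unset, fall back to the local endpoint used in this guide.\n",
    "base_url = os.getenv('LLAMA_STACK_BASE_URL', 'http://localhost:5000')\n",
    "client = LlamaStackClient(base_url=base_url)"
   ]
  },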
  {
   "cell_type": "markdown",
   "id": "02cdf3f6",
   "metadata": {},
   "source": [
    "#### 2. Define Few-Shot Examples\n",
    "\n",
    "Construct a series of labeled `UserMessage` and `CompletionMessage` instances to demonstrate the task to the model. Each `UserMessage` represents an input prompt, and each `CompletionMessage` is the desired output. The model uses these examples to infer the appropriate response patterns.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "da140b33",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_stack_client.types import CompletionMessage, UserMessage\n",
    "\n",
    "few_shot_examples = [\n",
    "    UserMessage(content='Have shorter, spear-shaped ears.', role='user'),\n",
    "    CompletionMessage(\n",
    "        content=\"That's Alpaca!\",\n",
    "        role='assistant',\n",
    "        stop_reason='end_of_message',\n",
    "        tool_calls=[],\n",
    "    ),\n",
    "    UserMessage(\n",
    "        content='Known for their calm nature and used as pack animals in mountainous regions.',\n",
    "        role='user',\n",
    "    ),\n",
    "    CompletionMessage(\n",
    "        content=\"That's Llama!\",\n",
    "        role='assistant',\n",
    "        stop_reason='end_of_message',\n",
    "        tool_calls=[],\n",
    "    ),\n",
    "    UserMessage(\n",
    "        content='Has a straight, slender neck and is smaller in size compared to its relative.',\n",
    "        role='user',\n",
    "    ),\n",
    "    CompletionMessage(\n",
    "        content=\"That's Alpaca!\",\n",
    "        role='assistant',\n",
    "        stop_reason='end_of_message',\n",
    "        tool_calls=[],\n",
    "    ),\n",
    "    UserMessage(\n",
    "        content='Generally taller and more robust, commonly seen as guard animals.',\n",
    "        role='user',\n",
    "    ),\n",
    "]"
   ]
  },
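  {
   "cell_type": "markdown",
   "id": "b7c8d9e0",
   "metadata": {},
   "source": [
    "Before sending the examples, a quick sanity check can catch ordering mistakes. The optional sketch below (an addition to this guide) prints each message and asserts that the list ends with the new user query; it relies only on the `role` and `content` attributes already used above.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c1d2e3f4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Optional sanity check (not part of the original guide): the examples\n",
    "# should alternate user/assistant turns and end with the new user query\n",
    "# that the model is expected to answer.\n",
    "for message in few_shot_examples:\n",
    "    print(f'{message.role}: {message.content}')\n",
    "\n",
    "assert few_shot_examples[-1].role == 'user', 'The last message should be the new query.'"
   ]
  },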
  {
   "cell_type": "markdown",
   "id": "6eece9cc",
   "metadata": {},
   "source": [
    "#### Note\n",
    "- **Few-Shot Examples**: These examples show the model the correct responses for specific prompts.\n",
    "- **CompletionMessage**: This defines the model's expected completion for each prompt.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a0de6c7",
   "metadata": {},
   "source": [
    "#### 3. Invoke `chat_completion` with Few-Shot Examples\n",
    "\n",
    "Use the few-shot examples as the message input for `chat_completion`. The model will use the examples to generate contextually appropriate responses, allowing it to infer and complete new queries in a similar format.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8b321089",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = client.inference.chat_completion(\n",
    "    messages=few_shot_examples, model='Llama3.2-11B-Vision-Instruct'\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "063265d2",
   "metadata": {},
   "source": [
    "#### 4. Display the Model’s Response\n",
    "\n",
    "The `completion_message` contains the assistant’s generated content based on the few-shot examples provided. Output this content to see the model's response directly in the console.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ac1ac3e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from termcolor import cprint\n",
    "\n",
    "cprint(f'> Response: {response.completion_message.content}', 'cyan')"
   ]
  },
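  {
   "cell_type": "markdown",
   "id": "d5e6f7a8",
   "metadata": {},
   "source": [
    "To reuse this pattern for arbitrary inputs, you can wrap the few-shot prefix in a small helper. The sketch below is an addition to this guide: `classify_animal` is a hypothetical helper name, and it simply swaps the final user query in `few_shot_examples` for a new description before calling the same `chat_completion` API shown above.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e9f0a1b2",
   "metadata": {},
   "outputs": [],
   "source": [
    "def classify_animal(description: str) -> str:\n",
    "    \"\"\"Hypothetical helper: answer a new description with the few-shot prefix.\n",
    "\n",
    "    Replaces the final user query in `few_shot_examples` with `description`\n",
    "    and calls the same `chat_completion` API used above.\n",
    "    \"\"\"\n",
    "    messages = few_shot_examples[:-1] + [UserMessage(content=description, role='user')]\n",
    "    result = client.inference.chat_completion(\n",
    "        messages=messages, model='Llama3.2-11B-Vision-Instruct'\n",
    "    )\n",
    "    return result.completion_message.content\n",
    "\n",
    "print(classify_animal('Produces soft, fine fleece that is prized for knitwear.'))"
   ]
  },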
  {
   "cell_type": "markdown",
   "id": "d936ab59",
   "metadata": {},
   "source": [
    "### Complete code\n",
    "Putting it all together, here's the complete code for few-shot inference with llama-stack:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "524189bd",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_stack_client import LlamaStackClient\n",
    "from llama_stack_client.types import CompletionMessage, UserMessage\n",
    "from termcolor import cprint\n",
    "\n",
    "client = LlamaStackClient(base_url='http://localhost:5000')\n",
    "\n",
    "response = client.inference.chat_completion(\n",
    "    messages=[\n",
    "        UserMessage(content='Have shorter, spear-shaped ears.', role='user'),\n",
    "        CompletionMessage(\n",
    "            content=\"That's Alpaca!\",\n",
    "            role='assistant',\n",
    "            stop_reason='end_of_message',\n",
    "            tool_calls=[],\n",
    "        ),\n",
    "        UserMessage(\n",
    "            content='Known for their calm nature and used as pack animals in mountainous regions.',\n",
    "            role='user',\n",
    "        ),\n",
    "        CompletionMessage(\n",
    "            content=\"That's Llama!\",\n",
    "            role='assistant',\n",
    "            stop_reason='end_of_message',\n",
    "            tool_calls=[],\n",
    "        ),\n",
    "        UserMessage(\n",
    "            content='Has a straight, slender neck and is smaller in size compared to its relative.',\n",
    "            role='user',\n",
    "        ),\n",
    "        CompletionMessage(\n",
    "            content=\"That's Alpaca!\",\n",
    "            role='assistant',\n",
    "            stop_reason='end_of_message',\n",
    "            tool_calls=[],\n",
    "        ),\n",
    "        UserMessage(\n",
    "            content='Generally taller and more robust, commonly seen as guard animals.',\n",
    "            role='user',\n",
    "        ),\n",
    "    ],\n",
    "    model='Llama3.2-11B-Vision-Instruct',\n",
    ")\n",
    "\n",
    "cprint(f'> Response: {response.completion_message.content}', 'cyan')"
   ]
  },
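  {
   "cell_type": "markdown",
   "id": "f3a4b5c6",
   "metadata": {},
   "source": [
    "As a final variation (an addition to this guide), `chat_completion` also accepts `stream=True` to receive the response incrementally. The sketch below assumes your installed `llama-stack-client` version ships the `EventLogger` helper for printing stream events; if yours does not, iterate over the returned chunks directly.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a7b8c9d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_stack_client.lib.inference.event_logger import EventLogger\n",
    "\n",
    "# Assumption: this client version supports stream=True and ships EventLogger.\n",
    "response = client.inference.chat_completion(\n",
    "    messages=few_shot_examples,\n",
    "    model='Llama3.2-11B-Vision-Instruct',\n",
    "    stream=True,\n",
    ")\n",
    "for log in EventLogger().log(response):\n",
    "    log.print()"
   ]
  }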
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}