Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-07-30 07:39:38 +00:00
Delete _archive_01_Prompt_Engineering101.ipynb
This commit is contained in:
parent bfb04cdc0f
commit 67ae3d5d1c
1 changed file with 0 additions and 312 deletions
@@ -1,312 +0,0 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/meta-llama/llama-recipes/blob/main/recipes/quickstart/Prompt_Engineering_with_Llama_3.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"# Prompt Engineering with Llama 3.1\n",
"\n",
"Prompt engineering is the practice of using natural language to produce a desired response from a large language model (LLM).\n",
"\n",
"This interactive guide covers prompt engineering & best practices with Llama 3.1."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Why now?\n",
"\n",
"[Vaswani et al. (2017)](https://arxiv.org/abs/1706.03762) introduced the world to transformer neural networks (originally for machine translation). Transformers ushered in an era of generative AI, with diffusion models for image creation and large language models (`LLMs`) as **programmable deep learning networks**.\n",
"\n",
"Programming foundational LLMs is done with natural language – it doesn't require training/tuning like ML models of the past. This has opened the door to a massive amount of innovation and a paradigm shift in how technology can be deployed. The science/art of using natural language to program language models to accomplish a task is referred to as **Prompt Engineering**."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prompting Techniques"
]
},
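{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The cells below call helper functions such as `complete_and_print`, `completion`, `chat_completion`, `user`, and `assistant`, whose setup cell is not shown in this archived notebook. The following cell is a minimal sketch of what those helpers might look like, assuming an OpenAI-compatible chat endpoint serving a Llama 3.1 model; the `LLAMA_BASE_URL`, `LLAMA_API_KEY`, and `LLAMA_MODEL` environment variables are illustrative placeholders, so adapt them to your own provider."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal helper sketch (assumes an OpenAI-compatible endpoint; adjust for your provider)\n",
"import os\n",
"from openai import OpenAI\n",
"\n",
"client = OpenAI(\n",
"    base_url=os.environ.get(\"LLAMA_BASE_URL\", \"http://localhost:8000/v1\"),  # illustrative default\n",
"    api_key=os.environ.get(\"LLAMA_API_KEY\", \"none\"),\n",
")\n",
"MODEL = os.environ.get(\"LLAMA_MODEL\", \"meta-llama/Llama-3.1-8B-Instruct\")\n",
"\n",
"def user(content):\n",
"    # Build a user message in the chat format\n",
"    return {\"role\": \"user\", \"content\": content}\n",
"\n",
"def assistant(content):\n",
"    # Build an assistant message (used for few-shot examples)\n",
"    return {\"role\": \"assistant\", \"content\": content}\n",
"\n",
"def chat_completion(messages, temperature=0.6, top_p=0.9):\n",
"    # Send a list of messages and return the model's reply text\n",
"    response = client.chat.completions.create(\n",
"        model=MODEL,\n",
"        messages=messages,\n",
"        temperature=temperature,\n",
"        top_p=top_p,\n",
"    )\n",
"    return response.choices[0].message.content\n",
"\n",
"def completion(prompt, temperature=0.6, top_p=0.9):\n",
"    # Single-turn convenience wrapper\n",
"    return chat_completion([user(prompt)], temperature=temperature, top_p=top_p)\n",
"\n",
"def complete_and_print(prompt, temperature=0.6, top_p=0.9):\n",
"    # Print the prompt followed by the model's response\n",
"    print(f'==============\\n{prompt}\\n==============')\n",
"    print(completion(prompt, temperature=temperature, top_p=top_p), end='\\n\\n')"
]
},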
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Explicit Instructions\n",
"\n",
"Detailed, explicit instructions produce better results than open-ended prompts:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"complete_and_print(prompt=\"Describe quantum physics in one short sentence of no more than 12 words\")\n",
"# Returns a succinct explanation of quantum physics that mentions particles and states existing simultaneously."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"You can think of explicit instructions as rules and restrictions on how Llama 3 responds to your prompt.\n",
"\n",
"- Stylization\n",
"    - `Explain this to me like a topic on a children's educational network show teaching elementary students.`\n",
"    - `I'm a software engineer using large language models for summarization. Summarize the following text in under 250 words:`\n",
"    - `Give your answer like an old timey private investigator hunting down a case step by step.`\n",
"- Formatting\n",
"    - `Use bullet points.`\n",
"    - `Return as a JSON object.`\n",
"    - `Use less technical terms and help me apply it in my work in communications.`\n",
"- Restrictions\n",
"    - `Only use academic papers.`\n",
"    - `Never give sources older than 2020.`\n",
"    - `If you don't know the answer, say that you don't know.`\n",
"\n",
"Here's an example of using explicit instructions to get more specific results by limiting responses to recently created sources."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"complete_and_print(\"Explain the latest advances in large language models to me.\")\n",
"# More likely to cite sources from 2017\n",
"\n",
"complete_and_print(\"Explain the latest advances in large language models to me. Always cite your sources. Never cite sources older than 2020.\")\n",
"# Gives more specific advances and only cites sources from 2020"
]
},
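{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The same pattern works for formatting and stylization constraints. The next cell is an extra illustrative sketch (prompts of our own choosing, not from the sections above) showing how an explicit output format changes the shape of the response."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"complete_and_print(\"Summarize how transformers process text. Use bullet points and no more than five of them.\")\n",
"# More likely to return a short bulleted list instead of long paragraphs\n",
"\n",
"complete_and_print(\"List three common uses of large language models. Return the answer as a JSON object with a single key 'uses' whose value is an array of strings.\")\n",
"# More likely to return machine-parseable JSON rather than free-form prose"
]
},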
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example Prompting using Zero- and Few-Shot Learning\n",
"\n",
"A shot is an example or demonstration of what type of prompt and response you expect from a large language model. This term originates from training computer vision models on photographs, where one shot was one example or instance that the model used to classify an image ([Fei-Fei et al. (2006)](http://vision.stanford.edu/documents/Fei-FeiFergusPerona2006.pdf)).\n",
"\n",
"#### Zero-Shot Prompting\n",
"\n",
"Large language models like Llama 3 are unique because they are capable of following instructions and producing responses without having previously seen an example of a task. Prompting without examples is called \"zero-shot prompting\".\n",
"\n",
"Let's try using Llama 3 as a sentiment detector. You may notice that output format varies - we can improve this with better prompting."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"complete_and_print(\"Text: This was the best movie I've ever seen! \\n The sentiment of the text is: \")\n",
"# Returns positive sentiment\n",
"\n",
"complete_and_print(\"Text: The director was trying too hard. \\n The sentiment of the text is: \")\n",
"# Returns negative sentiment"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"#### Few-Shot Prompting\n",
"\n",
"Adding specific examples of your desired output generally results in more accurate, consistent output. This technique is called \"few-shot prompting\".\n",
"\n",
"In this example, the generated response follows our desired format: a more nuanced sentiment classifier that reports confidence percentages for positive, neutral, and negative.\n",
"\n",
"See also: [Zhao et al. (2021)](https://arxiv.org/abs/2102.09690), [Liu et al. (2021)](https://arxiv.org/abs/2101.06804), [Su et al. (2022)](https://arxiv.org/abs/2209.01975), [Rubin et al. (2022)](https://arxiv.org/abs/2112.08633).\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def sentiment(text):\n",
"    response = chat_completion(messages=[\n",
"        user(\"You are a sentiment classifier. For each message, give the percentage of positive/neutral/negative.\"),\n",
"        user(\"I liked it\"),\n",
"        assistant(\"70% positive 30% neutral 0% negative\"),\n",
"        user(\"It could be better\"),\n",
"        assistant(\"0% positive 50% neutral 50% negative\"),\n",
"        user(\"It's fine\"),\n",
"        assistant(\"25% positive 50% neutral 25% negative\"),\n",
"        user(text),\n",
"    ])\n",
"    return response\n",
"\n",
"def print_sentiment(text):\n",
"    print(f'INPUT: {text}')\n",
"    print(sentiment(text))\n",
"\n",
"print_sentiment(\"I thought it was okay\")\n",
"# More likely to return a balanced mix of positive, neutral, and negative\n",
"print_sentiment(\"I loved it!\")\n",
"# More likely to return 100% positive\n",
"print_sentiment(\"Terrible service 0/10\")\n",
"# More likely to return 100% negative"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Role Prompting\n",
"\n",
"Llama will often give more consistent responses when given a role ([Kong et al. (2023)](https://browse.arxiv.org/pdf/2308.07702.pdf)). Roles give context to the LLM on what type of answers are desired.\n",
"\n",
"Let's use Llama 3 to create a more focused, technical response for a question around the pros and cons of using PyTorch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"complete_and_print(\"Explain the pros and cons of using PyTorch.\")\n",
"# More likely to cover general areas like documentation and the PyTorch community, and to mention a steep learning curve\n",
"\n",
"complete_and_print(\"Your role is a machine learning expert who gives highly technical advice to senior engineers who work with complicated datasets. Explain the pros and cons of using PyTorch.\")\n",
"# Often results in more technical benefits and drawbacks, with greater detail on areas like model layers"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Chain-of-Thought\n",
"\n",
"Simply adding a phrase encouraging step-by-step thinking \"significantly improves the ability of large language models to perform complex reasoning\" ([Wei et al. (2022)](https://arxiv.org/abs/2201.11903)). This technique is called \"CoT\" or \"Chain-of-Thought\" prompting.\n",
"\n",
"Llama 3.1 now reasons step-by-step naturally without the addition of the phrase. This section remains for completeness."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompt = \"Who lived longer, Mozart or Elvis?\"\n",
"\n",
"complete_and_print(prompt)\n",
"# Llama 2 would often give the incorrect answer of \"Mozart\"\n",
"\n",
"complete_and_print(f\"{prompt} Let's think through this carefully, step by step.\")\n",
"# Gives the correct answer \"Elvis\""
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Self-Consistency\n",
"\n",
"LLMs are probabilistic, so even with Chain-of-Thought, a single generation might produce incorrect results. Self-Consistency ([Wang et al. (2022)](https://arxiv.org/abs/2203.11171)) improves accuracy by selecting the most frequent answer from multiple generations (at the cost of higher compute):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"from statistics import mode\n",
"\n",
"def gen_answer():\n",
"    response = completion(\n",
"        \"John found that the average of 15 numbers is 40. \"\n",
"        \"If 10 is added to each number then the mean of the numbers is? \"\n",
"        \"Report the answer surrounded by backticks (example: `123`)\",\n",
"    )\n",
"    match = re.search(r'`(\\d+)`', response)\n",
"    if match is None:\n",
"        return None\n",
"    return match.group(1)\n",
"\n",
"answers = [gen_answer() for _ in range(5)]\n",
"\n",
"print(\n",
"    f\"Answers: {answers}\\n\",\n",
"    f\"Final answer: {mode(answers)}\",\n",
")\n",
"\n",
"# Adding 10 to every number shifts the mean by 10, so the expected answer is 50.\n",
"# Sample runs of Llama-3-70B (all correct):\n",
"# ['60', '50', '50', '50', '50'] -> 50\n",
"# ['50', '50', '50', '60', '50'] -> 50\n",
"# ['50', '50', '60', '50', '50'] -> 50"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Author & Contact\n",
"\n",
"Edited by [Dalton Flanagan](https://www.linkedin.com/in/daltonflanagan/) (dalton@meta.com) with contributions from Mohsen Agsen, Bryce Bortree, Ricardo Juan Palma Duran, Kaolin Fire, Thomas Scialom."
]
}
],
"metadata": {
"captumWidgetMessage": [],
"dataExplorerConfig": [],
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.14"
},
"last_base_url": "https://bento.edge.x2p.facebook.net/",
"last_kernel_id": "161e2a7b-2d2b-4995-87f3-d1539860ecac",
"last_msg_id": "4eab1242-d815b886ebe4f5b1966da982_543",
"last_server_session_id": "4a7b41c5-ed66-4dcb-a376-22673aebb469",
"operator_data": [],
"outputWidgetContext": []
},
"nbformat": 4,
"nbformat_minor": 4
}