Docs improvement v3 (#433)

# What does this PR do?

- updated the notebooks to reflect changes up to llama-stack 0.0.53
- updated the README to provide accurate and up-to-date info
- improved the current Zero to Hero guide by integrating an example using
the Together API


## Before submitting

- [x] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [x] Ran pre-commit to handle lint / formatting issues.
- [x] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
      Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

---------

Co-authored-by: Sanyam Bhutani <sanyambhutani@meta.com>
Justin Lee 2024-11-22 15:43:31 -08:00 committed by GitHub
parent 97dc5b68e5
commit 9928405e2c
17 changed files with 1410 additions and 2295 deletions

@@ -0,0 +1,304 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "cd96f85a",
"metadata": {},
"source": [
"# Prompt Engineering with Llama Stack\n",
"\n",
"Prompt engineering is using natural language to produce a desired response from a large language model (LLM).\n",
"\n",
"This interactive guide covers prompt engineering & best practices with Llama 3.2 and Llama Stack.\n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html)."
]
},
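{
"cell_type": "markdown",
"id": "a1f1c0d2",
"metadata": {},
"source": [
"If you haven't installed the client library yet, the cell below is a minimal sketch of the typical install step. It assumes the `llama-stack-client` package from PyPI (plus `termcolor`, used later in this notebook) is all you need on the client side; the server setup itself is covered in the Getting Started Guide linked above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2e4d9f1",
"metadata": {},
"outputs": [],
"source": [
"# Client-side dependencies only; see the Getting Started Guide\n",
"# for setting up the Llama Stack server itself.\n",
"!pip install llama-stack-client termcolor"
]
},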
{
"cell_type": "markdown",
"id": "3e1ef1c9",
"metadata": {},
"source": [
"## Few-Shot Inference for LLMs\n",
"\n",
"This guide provides instructions on how to use Llama Stacks `chat_completion` API with a few-shot learning approach to enhance text generation. Few-shot examples enable the model to recognize patterns by providing labeled prompts, allowing it to complete tasks based on minimal prior examples.\n",
"\n",
"### Overview\n",
"\n",
"Few-shot learning provides the model with multiple examples of input-output pairs. This is particularly useful for guiding the model's behavior in specific tasks, helping it understand the desired completion format and content based on a few sample interactions.\n",
"\n",
"### Implementation"
]
},
{
"cell_type": "markdown",
"id": "e065af43",
"metadata": {},
"source": [
"### 0. Configuration\n",
"Set up your connection parameters:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "df35d1e2",
"metadata": {},
"outputs": [],
"source": [
"HOST = \"localhost\" # Replace with your host\n",
"PORT = 5001 # Replace with your port\n",
"MODEL_NAME='meta-llama/Llama-3.2-3B-Instruct'"
]
},
{
"cell_type": "markdown",
"id": "a7a25a7e",
"metadata": {},
"source": [
"#### 1. Initialize the Client\n",
"\n",
"Begin by setting up the `LlamaStackClient` to connect to the inference endpoint.\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "c2a0e359",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"\n",
"client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')"
]
},
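{
"cell_type": "markdown",
"id": "c3f7a2b8",
"metadata": {},
"source": [
"As a quick sanity check, you can list the models registered with your distribution before running inference. This is a minimal sketch that assumes the server exposes the standard `models` endpoint and that each entry carries an `identifier` field; exact fields may vary across llama-stack versions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4a8b3c9",
"metadata": {},
"outputs": [],
"source": [
"# Confirm connectivity and check that MODEL_NAME is registered.\n",
"# The `identifier` attribute is an assumption; inspect one entry\n",
"# in your environment if this differs.\n",
"for model in client.models.list():\n",
"    print(model.identifier)"
]
},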
{
"cell_type": "markdown",
"id": "02cdf3f6",
"metadata": {},
"source": [
"#### 2. Define Few-Shot Examples\n",
"\n",
"Construct a series of labeled `UserMessage` and `CompletionMessage` instances to demonstrate the task to the model. Each `UserMessage` represents an input prompt, and each `CompletionMessage` is the desired output. The model uses these examples to infer the appropriate response patterns.\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "da140b33",
"metadata": {},
"outputs": [],
"source": [
"few_shot_examples = [\n",
" {\"role\": \"user\", \"content\": 'Have shorter, spear-shaped ears.'},\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Known for their calm nature and used as pack animals in mountainous regions.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Llama!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Has a straight, slender neck and is smaller in size compared to its relative.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Generally taller and more robust, commonly seen as guard animals.'\n",
" }\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "6eece9cc",
"metadata": {},
"source": [
"#### Note\n",
"- **Few-Shot Examples**: These examples show the model the correct responses for specific prompts.\n",
"- **CompletionMessage**: This defines the model's expected completion for each prompt.\n"
]
},
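{
"cell_type": "markdown",
"id": "e5b9c4d0",
"metadata": {},
"source": [
"If you prefer typed objects over plain dictionaries, the client also ships `UserMessage` and `CompletionMessage` classes. The cell below sketches the first example pair in typed form, assuming these classes accept the same fields as the dictionaries above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6c0d5e1",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client.types import CompletionMessage, UserMessage\n",
"\n",
"# Typed equivalent of the first dictionary pair above; field names\n",
"# are assumed to mirror the dict keys used elsewhere in this notebook.\n",
"typed_examples = [\n",
"    UserMessage(role='user', content='Have shorter, spear-shaped ears.'),\n",
"    CompletionMessage(\n",
"        role='assistant',\n",
"        content=\"That's Alpaca!\",\n",
"        stop_reason='end_of_message',\n",
"        tool_calls=[],\n",
"    ),\n",
"]"
]
},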
{
"cell_type": "markdown",
"id": "5a0de6c7",
"metadata": {},
"source": [
"#### 3. Invoke `chat_completion` with Few-Shot Examples\n",
"\n",
"Use the few-shot examples as the message input for `chat_completion`. The model will use the examples to generate contextually appropriate responses, allowing it to infer and complete new queries in a similar format.\n"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "8b321089",
"metadata": {},
"outputs": [],
"source": [
"response = client.inference.chat_completion(\n",
" messages=few_shot_examples, model_id=MODEL_NAME\n",
")"
]
},
{
"cell_type": "markdown",
"id": "063265d2",
"metadata": {},
"source": [
"#### 4. Display the Models Response\n",
"\n",
"The `completion_message` contains the assistants generated content based on the few-shot examples provided. Output this content to see the model's response directly in the console.\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "4ac1ac3e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[36m> Response: That sounds like a Donkey or an Ass (also known as a Burro)!\u001b[0m\n"
]
}
],
"source": [
"from termcolor import cprint\n",
"\n",
"cprint(f'> Response: {response.completion_message.content}', 'cyan')"
]
},
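{
"cell_type": "markdown",
"id": "a7d1e6f2",
"metadata": {},
"source": [
"For longer generations you may prefer to receive tokens as they are produced rather than a single final message. The cell below is a hedged sketch of the streaming variant: it assumes `chat_completion` accepts `stream=True` and yields chunks carrying a text delta under `event.delta`, which may differ across llama-stack versions."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b8e2f7a3",
"metadata": {},
"outputs": [],
"source": [
"# Streaming sketch: print text deltas as they arrive. The chunk/event\n",
"# structure below is an assumption; inspect one chunk to confirm it.\n",
"stream = client.inference.chat_completion(\n",
"    messages=few_shot_examples, model_id=MODEL_NAME, stream=True\n",
")\n",
"for chunk in stream:\n",
"    print(chunk.event.delta, end='', flush=True)"
]
},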
{
"cell_type": "markdown",
"id": "d936ab59",
"metadata": {},
"source": [
"### Complete code\n",
"Summing it up, here's the code for few-shot implementation with llama-stack:\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "524189bd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[36m> Response: You're thinking of a Llama again!\n",
"\n",
"Is that correct?\u001b[0m\n"
]
}
],
"source": [
"from llama_stack_client import LlamaStackClient\n",
"from llama_stack_client.types import CompletionMessage, UserMessage\n",
"from termcolor import cprint\n",
"\n",
"client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')\n",
"\n",
"response = client.inference.chat_completion(\n",
" messages=[\n",
" {\"role\": \"user\", \"content\": 'Have shorter, spear-shaped ears.'},\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Known for their calm nature and used as pack animals in mountainous regions.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Llama!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Has a straight, slender neck and is smaller in size compared to its relative.'\n",
" },\n",
" {\n",
" \"role\": \"assistant\",\n",
" \"content\": \"That's Alpaca!\",\n",
" \"stop_reason\": 'end_of_message',\n",
" \"tool_calls\": []\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": 'Generally taller and more robust, commonly seen as guard animals.'\n",
" }\n",
"],\n",
" model_id=MODEL_NAME,\n",
")\n",
"\n",
"cprint(f'> Response: {response.completion_message.content}', 'cyan')"
]
},
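{
"cell_type": "markdown",
"id": "c9f3a8b4",
"metadata": {},
"source": [
"A natural next step is to wrap the few-shot pattern in a small helper so new descriptions can be classified without repeating the example list. The function below is a hypothetical convenience wrapper, not part of the llama-stack API; it reuses `few_shot_examples`, `client`, and `MODEL_NAME` from the cells above."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d0a4b9c5",
"metadata": {},
"outputs": [],
"source": [
"def classify_animal(description: str) -> str:\n",
"    \"\"\"Hypothetical helper: reuse the labeled examples for a new query.\"\"\"\n",
"    # few_shot_examples[:-1] keeps only the labeled pairs, dropping\n",
"    # the trailing unanswered query from the earlier cell.\n",
"    response = client.inference.chat_completion(\n",
"        messages=few_shot_examples[:-1] + [{'role': 'user', 'content': description}],\n",
"        model_id=MODEL_NAME,\n",
"    )\n",
"    return response.completion_message.content\n",
"\n",
"cprint(f\"> {classify_animal('Spits when irritated and has a split upper lip.')}\", 'cyan')"
]
},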
{
"cell_type": "code",
"execution_count": 16,
"id": "a38dcb91",
"metadata": {},
"outputs": [],
"source": [
"#fin"
]
},
{
"cell_type": "markdown",
"id": "76d053b8",
"metadata": {},
"source": [
"Thanks for checking out this notebook! \n",
"\n",
"The next one will be a guide on how to chat with images, continue to the notebook [here](./02_Image_Chat101.ipynb). Happy learning!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "base",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
}
},
"nbformat": 4,
"nbformat_minor": 5
}