
# Prompt Engineering with Llama Stack

Prompt engineering is the practice of writing natural-language inputs that produce a desired response from a large language model (LLM).

This interactive guide covers prompt engineering and best practices with Llama 3.2 and Llama Stack.

Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).

## Few-Shot Inference for LLMs

This section shows how to use Llama Stack’s `chat_completion` API with few-shot prompting. A handful of labeled examples placed in the prompt lets the model recognize the pattern and complete a new task of the same kind, with no fine-tuning required.

### Overview

Few-shot learning provides the model with multiple examples of input-output pairs. This is particularly useful for guiding the model's behavior in specific tasks, helping it understand the desired completion format and content based on a few sample interactions.
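The input-output pairing described above can be sketched as a small helper. This is an illustrative sketch rather than part of Llama Stack: the function name `build_few_shot_messages` is hypothetical, and it emits only the `role` and `content` keys, whereas the assistant messages used later in this guide also carry `stop_reason` and `tool_calls`.

```python
from typing import Dict, List, Tuple


def build_few_shot_messages(
    examples: List[Tuple[str, str]], query: str
) -> List[Dict[str, str]]:
    """Interleave (input, output) pairs as user/assistant turns, then append the query."""
    messages: List[Dict[str, str]] = []
    for prompt, answer in examples:
        messages.append({"role": "user", "content": prompt})
        messages.append({"role": "assistant", "content": answer})
    # The final user message is left unanswered so the model completes it.
    messages.append({"role": "user", "content": query})
    return messages


demo = build_few_shot_messages(
    [("Have shorter, spear-shaped ears.", "That's Alpaca!")],
    "Generally taller and more robust, commonly seen as guard animals.",
)
print([m["role"] for m in demo])  # ['user', 'assistant', 'user']
```

Keeping the examples as plain tuples makes it easy to add, remove, or reorder them when experimenting with how many shots the task needs.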

### Implementation

#### 0. Configuration
Set up your connection parameters:

In [1]:
HOST = "localhost" # Replace with your host
PORT = 5000 # Replace with your port
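If you prefer not to hard-code the connection parameters, one option is to read them from the environment. The sketch below is an assumption-laden example: the variable names `LLAMA_STACK_HOST` and `LLAMA_STACK_PORT` and the helper `load_connection` are hypothetical and are not defined by Llama Stack.

```python
import os
from typing import Mapping, Tuple


def load_connection(env: Mapping[str, str] = os.environ) -> Tuple[str, int, str]:
    """Return (host, port, base_url), falling back to this guide's defaults."""
    # LLAMA_STACK_HOST / LLAMA_STACK_PORT are hypothetical names chosen for this sketch.
    host = env.get("LLAMA_STACK_HOST", "localhost")
    port = int(env.get("LLAMA_STACK_PORT", "5000"))
    return host, port, f"http://{host}:{port}"


HOST, PORT, BASE_URL = load_connection()
print(BASE_URL)
```

Passing the environment as a parameter keeps the helper easy to test with a plain dictionary.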

#### 1. Initialize the Client

Begin by setting up the `LlamaStackClient` to connect to the inference endpoint.


In [2]:
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')

#### 2. Define Few-Shot Examples

Construct a list of alternating user and assistant messages that demonstrates the task to the model. Each message with `"role": "user"` (a `UserMessage`) is an input prompt, and each message with `"role": "assistant"` (a `CompletionMessage`) is the desired output. The list ends with a user message that has no answer: this is the query the model should complete by following the pattern in the examples.


In [3]:
few_shot_examples = [
 {"role": "user", "content": 'Have shorter, spear-shaped ears.'},
 {
 "role": "assistant",
 "content": "That's Alpaca!",
 "stop_reason": 'end_of_message',
 "tool_calls": []
 },
 {
 "role": "user",
 "content": 'Known for their calm nature and used as pack animals in mountainous regions.'
 },
 {
 "role": "assistant",
 "content": "That's Llama!",
 "stop_reason": 'end_of_message',
 "tool_calls": []
 },
 {
 "role": "user",
 "content": 'Has a straight, slender neck and is smaller in size compared to its relative.'
 },
 {
 "role": "assistant",
 "content": "That's Alpaca!",
 "stop_reason": 'end_of_message',
 "tool_calls": []
 },
 {
 "role": "user",
 "content": 'Generally taller and more robust, commonly seen as guard animals.'
 }
]

#### Note
- **Few-Shot Examples**: These examples show the model the correct responses for specific prompts.
- **CompletionMessage**: Each assistant message defines the expected completion for the user prompt that precedes it.
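Before sending the list, it can help to confirm that it has the shape described in these notes. The checker below is a minimal sketch under the assumption that messages are plain dictionaries with a `role` key, as in the cell above; the name `check_few_shot_shape` is hypothetical.

```python
from typing import Dict, List


def check_few_shot_shape(messages: List[Dict]) -> List[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    for i, message in enumerate(messages):
        # Even positions should be user prompts, odd positions assistant answers.
        expected = "user" if i % 2 == 0 else "assistant"
        if message.get("role") != expected:
            problems.append(
                f"message {i}: expected role {expected!r}, got {message.get('role')!r}"
            )
    if not messages or messages[-1].get("role") != "user":
        problems.append("the final message should be the user query to answer")
    return problems


good = [
    {"role": "user", "content": "Have shorter, spear-shaped ears."},
    {"role": "assistant", "content": "That's Alpaca!"},
    {"role": "user", "content": "Generally taller and more robust."},
]
print(check_few_shot_shape(good))  # []
```

A list that ends on an assistant message would leave the model nothing to answer, which is the most common mistake this check catches.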


#### 3. Invoke `chat_completion` with Few-Shot Examples

Use the few-shot examples as the message input for `chat_completion`. The model will use the examples to generate contextually appropriate responses, allowing it to infer and complete new queries in a similar format.


In [4]:
response = client.inference.chat_completion(
 messages=few_shot_examples, model='Llama3.2-11B-Vision-Instruct'
)

#### 4. Display the Model’s Response

The `completion_message` contains the assistant’s generated content based on the few-shot examples provided. Output this content to see the model's response directly in the console.


In [5]:
from termcolor import cprint

cprint(f'> Response: {response.completion_message.content}', 'cyan')

> Response: That's Llama!
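Because the few-shot examples fix the answer format ("That's Llama!" or "That's Alpaca!"), the reply can be reduced to a bare label for downstream use. The following is a sketch only: `extract_label` is a hypothetical helper, and it assumes the model follows the demonstrated format, which is not guaranteed.

```python
import re
from typing import Optional, Sequence


def extract_label(text: str, allowed: Sequence[str] = ("Llama", "Alpaca")) -> Optional[str]:
    """Pull the animal name out of a "That's <name>!" reply, or return None."""
    match = re.search(r"That's (\w+)!", text)
    if match and match.group(1) in allowed:
        return match.group(1)
    # The model did not follow the demonstrated format, or named an unknown label.
    return None


print(extract_label("That's Llama!"))  # Llama
print(extract_label("It is probably a llama."))  # None
```

Returning `None` instead of raising lets the caller decide whether to retry, add more examples, or fall back to the raw text.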


### Complete code
Putting it all together, here is the complete few-shot implementation with Llama Stack:


In [6]:
from llama_stack_client import LlamaStackClient
from termcolor import cprint

HOST = "localhost" # Replace with your host
PORT = 5000 # Replace with your port

# Connect to the Llama Stack inference endpoint
client = LlamaStackClient(base_url=f'http://{HOST}:{PORT}')

response = client.inference.chat_completion(
 messages=[
 {"role": "user", "content": 'Have shorter, spear-shaped ears.'},
 {
 "role": "assistant",
 "content": "That's Alpaca!",
 "stop_reason": 'end_of_message',
 "tool_calls": []
 },
 {
 "role": "user",
 "content": 'Known for their calm nature and used as pack animals in mountainous regions.'
 },
 {
 "role": "assistant",
 "content": "That's Llama!",
 "stop_reason": 'end_of_message',
 "tool_calls": []
 },
 {
 "role": "user",
 "content": 'Has a straight, slender neck and is smaller in size compared to its relative.'
 },
 {
 "role": "assistant",
 "content": "That's Alpaca!",
 "stop_reason": 'end_of_message',
 "tool_calls": []
 },
 {
 "role": "user",
 "content": 'Generally taller and more robust, commonly seen as guard animals.'
 }
],
 model='Llama3.2-11B-Vision-Instruct',
)

cprint(f'> Response: {response.completion_message.content}', 'cyan')

> Response: That's Llama!


Thanks for checking out this notebook!

The next notebook is a guide on how to chat with images. Continue to it [here](./02_Image_Chat101.ipynb). Happy learning!