## Getting Started with LlamaStack Vision API

Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).

Let's import the necessary packages:

In [1]:
import asyncio
import base64
import mimetypes
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.inference.event_logger import EventLogger
from llama_stack_client.types import UserMessage
from termcolor import cprint

## Configuration
Set up your connection parameters:

In [None]:
HOST = "localhost"  # Replace with your host
CLOUD_PORT = 5001  # Replace with your cloud distro port
MODEL_NAME = "Llama3.2-11B-Vision-Instruct"

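If you would rather not edit the notebook for each environment, the same parameters can be read from environment variables and combined into the base URL that the client is given later on. This is only a minimal sketch: `LLAMA_STACK_HOST`, `LLAMA_STACK_PORT`, and `build_base_url` are hypothetical names chosen for this example and are not defined by Llama Stack.

```python
import os

# Defaults mirror the values used in this notebook.
DEFAULT_HOST = "localhost"
DEFAULT_PORT = 5001


def build_base_url(host: str, port: int) -> str:
    """Join host and port into an HTTP base URL (hypothetical helper)."""
    return f"http://{host}:{port}"


# LLAMA_STACK_HOST and LLAMA_STACK_PORT are hypothetical variable names
# picked for this example; fall back to the notebook defaults if unset.
host = os.environ.get("LLAMA_STACK_HOST", DEFAULT_HOST)
port = int(os.environ.get("LLAMA_STACK_PORT", DEFAULT_PORT))

base_url = build_base_url(host, port)
print(base_url)
```

Converting the port with `int(...)` makes a malformed value fail here rather than later, when the client tries to connect.
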
## Helper Functions
Let's create some utility functions to handle image processing and API interaction:

In [None]:
import base64
import mimetypes
from termcolor import cprint
from llama_stack_client.lib.inference.event_logger import EventLogger

def encode_image_to_data_url(file_path: str) -> str:
    """
    Encode an image file to a data URL.

    Args:
        file_path (str): Path to the image file

    Returns:
        str: Data URL string
    """
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None:
        raise ValueError("Could not determine MIME type of the file")

    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode("utf-8")

    return f"data:{mime_type};base64,{encoded_string}"

async def process_image(client, image_path: str, stream: bool = True):
    """
    Process an image through the LlamaStack Vision API.

    Args:
        client (LlamaStackClient): Initialized client
        image_path (str): Path to image file
        stream (bool): Whether to stream the response
    """
    data_url = encode_image_to_data_url(image_path)

    message = {
        "role": "user",
        "content": [
            {"image": {"uri": data_url}},
            "Describe what is in this image."
        ]
    }

    cprint("User> Sending image for analysis...", "green")
    response = client.inference.chat_completion(
        messages=[message],
        model_id=MODEL_NAME,
        stream=stream,
    )

    if not stream:
        cprint(f"> Response: {response}", "cyan")
    else:
        async for log in EventLogger().log(response):
            log.print()

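To see what the helper above hands to the API, the following sketch builds a data URL from a temporary file, decodes it again, and wraps it in the same message shape used by `process_image`. It needs no running server. The file contents are placeholder bytes rather than a real image, which is an assumption made so the example stays self-contained.

```python
import base64
import mimetypes
import os
import tempfile

# Placeholder bytes standing in for real image data; only the ".png"
# suffix matters for MIME type detection.
fake_image_bytes = b"\x89PNG\r\n\x1a\n-not-a-real-image"

with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
    tmp.write(fake_image_bytes)
    tmp_path = tmp.name

try:
    # Same steps as encode_image_to_data_url: guess the type, then base64.
    mime_type, _ = mimetypes.guess_type(tmp_path)
    with open(tmp_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    data_url = f"data:{mime_type};base64,{encoded}"
finally:
    os.remove(tmp_path)

# A data URL is "<header>,<payload>"; decoding the payload restores the bytes.
header, payload = data_url.split(",", 1)
decoded = base64.b64decode(payload)

# The message mirrors the structure built inside process_image.
message = {
    "role": "user",
    "content": [
        {"image": {"uri": data_url}},
        "Describe what is in this image.",
    ],
}

print(header)
print(decoded == fake_image_bytes)
```

Because base64 expands data by roughly a third, large images produce correspondingly large request payloads.
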

## Chat with Image

Now let's put it all together:

In [6]:
# Initialize client and process image
async def main():
    # Initialize client
    client = LlamaStackClient(
        base_url=f"http://{HOST}:{CLOUD_PORT}",
    )

    # Process image
    await process_image(client, "../_static/llama-stack-logo.png")



# Execute the main function
await main()

User> Sending image for analysis...
Assistant> The image features a simple, monochromatic line drawing of a llama, with the words "LLAMA STACK" written above it. The llama is depicted in a cartoonish style, with a large body and a long neck. It has a distinctive head shape, with a small circle for the eye and a curved line for the mouth. The llama

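A bare `await main()` works in a notebook cell because Jupyter supports top-level `await`. In a plain Python script the same line is a syntax error, so the coroutine has to be handed to `asyncio.run` instead. The sketch below uses a stand-in `main()` that only returns a string; in a real script it would create the client and call `process_image` as shown earlier.

```python
import asyncio


async def main() -> str:
    # Stand-in for the notebook's main(); the real version would create
    # the client and await process_image(...).
    await asyncio.sleep(0)
    return "done"


# asyncio.run starts an event loop, runs main() to completion, and closes
# the loop afterwards.
result = asyncio.run(main())
print(result)
```
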
Thanks for checking out this notebook! 

The next one in the series will teach you one of the favorite applications of Large Language Models: [Tool Calling](./04_Tool_Calling101.ipynb). Enjoy!