## Getting Started with the Llama Stack Vision API

Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).

Let's import the necessary packages:

In [1]:
import asyncio
import base64
import mimetypes
from llama_stack_client import LlamaStackClient
from termcolor import cprint

## Configuration
Set up your connection parameters:

In [None]:
HOST = "localhost"  # Replace with your host
PORT = 8321         # Replace with your cloud distro port
MODEL_NAME = "meta-llama/Llama3.2-11B-Vision-Instruct"
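
If you would rather not hard-code the connection parameters, one option is to read them from environment variables and fall back to the defaults above. The following is only a sketch: the variable names `LLAMA_STACK_HOST` and `LLAMA_STACK_PORT` and the helper `load_connection_settings` are hypothetical names chosen for illustration and are not defined by Llama Stack.

```python
import os


def load_connection_settings(env=None):
    """Return (host, port, base_url), falling back to this notebook's defaults.

    LLAMA_STACK_HOST and LLAMA_STACK_PORT are hypothetical variable names
    used for illustration only.
    """
    env = os.environ if env is None else env
    host = env.get("LLAMA_STACK_HOST", "localhost")
    port = int(env.get("LLAMA_STACK_PORT", "8321"))
    return host, port, f"http://{host}:{port}"


# An empty mapping means no overrides, so the defaults are used
default_host, default_port, default_base_url = load_connection_settings({})
print(default_base_url)
```

The resulting URL has the same `http://{HOST}:{PORT}` shape that is passed to `LlamaStackClient` later in this notebook.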

## Helper Functions
Let's create some utility functions to handle image processing and API interaction:

In [None]:
def encode_image_to_data_url(file_path: str) -> str:
    """
    Encode an image file to a data URL.

    Args:
        file_path (str): Path to the image file

    Returns:
        str: Data URL string
    """
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None:
        raise ValueError("Could not determine MIME type of the file")

    with open(file_path, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode("utf-8")

    return f"data:{mime_type};base64,{encoded_string}"

async def process_image(client, image_path: str, stream: bool = True):
    """
    Process an image through the LlamaStack Vision API.

    Args:
        client (LlamaStackClient): Initialized client
        image_path (str): Path to image file
        stream (bool): Whether to stream the response
    """
    data_url = encode_image_to_data_url(image_path)

    message = {
        "role": "user",
        "content": [
            {"type": "image", "image": {"url": {"uri": data_url}}},
            {"type": "text", "text": "Describe what is in this image."}
        ]
    }

    cprint("User> Sending image for analysis...", "green")
    response = client.inference.chat_completion(
        messages=[message],
        model_id=MODEL_NAME,
        stream=stream,
    )

    cprint('Assistant> ', color='cyan', end='')
    if not stream:
        cprint(response.completion_message.content, color='yellow')
    else:
        for chunk in response:
            cprint(chunk.event.delta.text, color='yellow', end='')
        cprint('')
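
To see what the encoding helper produces without contacting a server, the sketch below writes a few placeholder bytes to a temporary `.png` file, encodes them, and decodes the result again. It repeats `encode_image_to_data_url` so that it runs on its own. The helper `build_image_message` and the file name `sample.png` are hypothetical additions for illustration; the message layout simply mirrors the dictionary built in `process_image`.

```python
import base64
import mimetypes
import os
import tempfile


def encode_image_to_data_url(file_path: str) -> str:
    """Same logic as the notebook helper, repeated to keep this cell standalone."""
    mime_type, _ = mimetypes.guess_type(file_path)
    if mime_type is None:
        raise ValueError("Could not determine MIME type of the file")
    with open(file_path, "rb") as image_file:
        encoded = base64.b64encode(image_file.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def build_image_message(data_url: str, prompt: str) -> dict:
    """Hypothetical helper: same message shape as process_image, custom prompt."""
    return {
        "role": "user",
        "content": [
            {"type": "image", "image": {"url": {"uri": data_url}}},
            {"type": "text", "text": prompt},
        ],
    }


# Placeholder bytes (the PNG signature only); this is not a viewable image
payload = b"\x89PNG\r\n\x1a\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.png")
    with open(sample_path, "wb") as sample_file:
        sample_file.write(payload)
    data_url = encode_image_to_data_url(sample_path)

# A data URL is "<header>,<base64 payload>"; split on the first comma
header, _, encoded_part = data_url.partition(",")
decoded = base64.b64decode(encoded_part)
message = build_image_message(data_url, "What colors appear in this image?")

print(header)
print(decoded == payload)
```

The MIME type is guessed from the file extension only, so a file with a missing or unusual extension raises the `ValueError` before any bytes are read.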


## Chat with Image

Now let's put it all together:

In [None]:
# Initialize the client and process an image
async def main():
    # Initialize client
    client = LlamaStackClient(
        base_url=f"http://{HOST}:{PORT}",
    )

    # Process image
    await process_image(client, "../_static/llama-stack-logo.png")

# Execute the main function (top-level await works in Jupyter/IPython)
await main()
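
A bare `await main()` works because Jupyter and IPython allow top-level `await`. In a plain Python script it is a syntax error, and `asyncio.run` is the usual entry point instead. The sketch below uses a stand-in `main` coroutine (an assumption made so the example runs without a server) rather than the one defined above.

```python
import asyncio


async def main() -> str:
    # Stand-in for the notebook's main(); the real work is replaced by a no-op await
    await asyncio.sleep(0)
    return "done"


# asyncio.run creates an event loop, runs the coroutine to completion, and closes the loop
result = asyncio.run(main())
print(result)
```

Run this form from a script, not from inside a notebook cell, since `asyncio.run` refuses to start when an event loop is already running.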

Thanks for checking out this notebook! 

The next one in the series will teach you one of the favorite applications of Large Language Models: [Tool Calling](./04_Tool_Calling101.ipynb). Enjoy!