# Ollama Quickstart Guide

This guide walks you through setting up an end-to-end workflow with Llama Stack and Ollama, so that you can perform text generation using the `Llama3.2-1B-Instruct` model. Follow these steps to get started quickly.

If you're looking for more specific topics like tool calling or agent setup, we have a [Zero to Hero Guide](#next-steps) that covers everything from Tool Calling to Agents in detail. Feel free to skip to the end to explore the advanced topics you're interested in.

> If you'd prefer not to set up a local server, explore our notebook on [tool calling with the Together API](Tool_Calling101_Using_Together's_Llama_Stack_Server.ipynb). It shows how to use Together.ai's Llama Stack Server API, so you can get started with Llama Stack without building and running a server locally.

## Table of Contents

1. [Setup ollama](#setup-ollama)
2. [Install Dependencies and Set Up Environment](#install-dependencies-and-set-up-environment)
3. [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
4. [Testing with `curl`](#testing-with-curl)
5. [Testing with Python](#testing-with-python)
6. [Next Steps](#next-steps)

---

## Setup ollama

1. **Download Ollama App**:
   - Go to [https://ollama.com/download](https://ollama.com/download).
   - Download and unzip `Ollama-darwin.zip`.
   - Run the `Ollama` application.

2. **Download the Ollama CLI**:
   - Ensure you have the `ollama` command line tool by downloading and installing it from the same website.

3. **Verify Installation**:
   - Open the terminal and run:
     ```bash
     ollama run llama3.2:1b
     ```

---

## Install Dependencies and Set Up Environment

1. **Create a Conda Environment**:
   - Create a new Conda environment with Python 3.11:
     ```bash
     conda create -n hack python=3.11
     ```
   - Activate the environment:
     ```bash
     conda activate hack
     ```

2. **Install ChromaDB**:
   - Install `chromadb` using `pip`:
     ```bash
     pip install chromadb
     ```

3. **Run ChromaDB**:
   - Start the ChromaDB server:
     ```bash
     chroma run --host localhost --port 8000 --path ./my_chroma_data
     ```

4. **Install Llama Stack**:
   - Open a new terminal and install `llama-stack`:
     ```bash
     conda activate hack
     pip install llama-stack
     ```

---

## Build, Configure, and Run Llama Stack

1. **Build the Llama Stack**:
   - Build the Llama Stack using the `ollama` template:
     ```bash
     llama stack build --template ollama --image-type conda
     ```

2. **Edit Configuration**:
   - Modify the `ollama-run.yaml` file located at `/Users/yourusername/.llama/distributions/llamastack-ollama/ollama-run.yaml`:
     - Change the `chromadb` port to `8000`.
     - Remove the `pgvector` section if present.

3. **Run the Llama Stack**:
   - Run the stack with the configured YAML file:
     ```bash
     llama stack run /path/to/your/distro/llamastack-ollama/ollama-run.yaml --port 5050
     ```

The server will start and listen on `http://localhost:5050`.

---

## Testing with `curl`

After setting up the server, open a new terminal window and verify it's working by sending a `POST` request using `curl`:

```bash
curl http://localhost:5050/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "llama3.2:1b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2-sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}'
```

**Expected Output:**

```json
{
  "completion_message": {
    "role": "assistant",
    "content": "The moon glows softly in the midnight sky,\nA beacon of wonder, as it catches the eye.",
    "stop_reason": "out_of_tokens",
    "tool_calls": []
  },
  "logprobs": null
}
```

---

## Testing with Python

You can also interact with the Llama Stack server using a simple Python script. Below is an example:

### 1. Activate Conda Environment and Install Required Python Packages

The `llama-stack-client` library provides Python methods for interacting with the Llama Stack server.

```bash
conda activate your-llama-stack-conda-env
pip install llama-stack-client
```

### 2. Create Python Script (`test_llama_stack.py`)

```bash
touch test_llama_stack.py
```

### 3. Create a Chat Completion Request in Python

```python
from llama_stack_client import LlamaStackClient

# Initialize the client, pointing it at the local Llama Stack server
client = LlamaStackClient(base_url="http://localhost:5050")

# Create a chat completion request
response = client.inference.chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write a two-sentence poem about the moon."}
    ],
    model="llama3.2:1b",
)

# Print the assistant's reply
print(response.completion_message.content)
```

### 4. Run the Python Script

```bash
python test_llama_stack.py
```

**Expected Output:**

```
The moon glows softly in the midnight sky,
A beacon of wonder, as it catches the eye.
```

With these steps, you should have a functional Llama Stack setup capable of generating text using the specified model. For more detailed information and advanced configurations, refer to the documentation below.

This command initializes the model to interact with your local Llama Stack instance.
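
If you would rather not install `llama-stack-client` for a quick check, the `curl` request shown earlier can also be reproduced with Python's standard library. The following is a minimal sketch, not part of the official client: it assumes the server is listening on `http://localhost:5050`, that the `/inference/chat_completion` endpoint accepts the same JSON body as the `curl` example, and that the response has the shape shown under its expected output. The helper names `build_payload`, `extract_content`, and `chat` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumptions taken from this guide: the port passed to `llama stack run`
# and the endpoint path used in the curl example.
BASE_URL = "http://localhost:5050"
ENDPOINT = "/inference/chat_completion"


def build_payload(prompt, model="llama3.2:1b", max_tokens=512):
    """Hypothetical helper: build the same JSON body the curl example sends."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": max_tokens},
    }


def extract_content(response_body):
    """Hypothetical helper: read the assistant text from a response shaped
    like the expected output of the curl example."""
    return response_body["completion_message"]["content"]


def chat(prompt, timeout=60):
    """Hypothetical helper: POST the payload and return the assistant's reply.
    Requires the Llama Stack server to be running."""
    request = urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_content(json.load(response))
```

With the server running, calling `print(chat("Write me a 2-sentence poem about the moon"))` should print the model's reply. If the request fails or the response has a different shape on your version of Llama Stack, fall back to the client library shown above.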

---

## Next Steps

**Explore Other Guides**: Dive deeper into specific topics by following these guides:

- [Understanding Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html#decide-your-inference-provider)
- [Inference 101](00_Inference101.ipynb)
- [Local and Cloud Model Toggling 101](00_Local_Cloud_Inference101.ipynb)
- [Prompt Engineering](01_Prompt_Engineering101.ipynb)
- [Chat with Image - LlamaStack Vision API](02_Image_Chat101.ipynb)
- [Tool Calling: How to and Details](03_Tool_Calling101.ipynb)
- [Memory API: Show Simple In-Memory Retrieval](04_Memory101.ipynb)
- [Using Safety API in Conversation](05_Safety101.ipynb)
- [Agents API: Explain Components](06_Agents101.ipynb)

**Explore Client SDKs**: Utilize our client SDKs for various languages to integrate Llama Stack into your applications:

- [Python SDK](https://github.com/meta-llama/llama-stack-client-python)
- [Node SDK](https://github.com/meta-llama/llama-stack-client-node)
- [Swift SDK](https://github.com/meta-llama/llama-stack-client-swift)
- [Kotlin SDK](https://github.com/meta-llama/llama-stack-client-kotlin)

**Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](./building_distro.md) guide.

**Explore Example Apps**: Check out [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) for example applications built using Llama Stack.

---