diff --git a/docs/source/quickstart.md b/docs/source/quickstart.md
index 465e2be27..a96d35c3b 100644
--- a/docs/source/quickstart.md
+++ b/docs/source/quickstart.md
@@ -1,58 +1,82 @@
+# Llama Stack Quickstart Guide
-# Quickstart
+This guide will walk you through setting up an end-to-end workflow with Llama Stack, enabling you to perform text generation using the `Llama3.2-11B-Vision-Instruct` model. Follow these steps to get started quickly.
-This guide will walk you through the steps to set up an end-to-end workflow with Llama Stack. It focuses on building a Llama Stack distribution and starting up a Llama Stack server. See our [documentation](../README.md) for more on Llama Stack's capabilities, or visit [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main) for example apps.
+## Table of Contents
+1. [Prerequisite](#prerequisite)
+2. [Installation](#installation)
+3. [Download Llama Models](#download-llama-models)
+4. [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
+5. [Testing with `curl`](#testing-with-curl)
+6. [Testing with Python](#testing-with-python)
+7. [Next Steps](#next-steps)
+
+---
+
+## Prerequisite
+
+Ensure you have the following installed on your system:
+
+- **Conda**: A package, dependency, and environment management tool.
-## 0. Prerequsite
-Feel free to skip this step if you already have the prerequsite installed.
+
+---
-1. conda (steps to install)
-2.
+
+## Installation
+
+The `llama` CLI tool helps you manage the Llama Stack toolchain and agent systems.
+
+**Install via PyPI:**
+
+```bash
+pip install llama-stack
+```
+
+*After installation, the `llama` command should be available in your PATH.*
+
+---
+
+## Download Llama Models
+
+Download the necessary Llama model checkpoints using the `llama` CLI:
+
+```bash
+llama download --model-id Llama3.2-11B-Vision-Instruct
+```
+
+*Follow the CLI prompts to complete the download. You may need to accept a license agreement. Obtain an instant license [here](https://www.llama.com/llama-downloads/).*
+
+---
+
+## Build, Configure, and Run Llama Stack
+
+### 1. Build the Llama Stack Distribution
+
+This guide defaults to building the `meta-reference-gpu` distribution; you can read more about the available distributions [here](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/index.html).
+
+```bash
+llama stack build --template meta-reference-gpu --image-type conda
+```
-## 1. Installation
+### 2. Run the Llama Stack Distribution
+> Launching a distribution initializes and configures the necessary APIs and providers, enabling seamless interaction with the underlying model.
-The `llama` CLI tool helps you manage the Llama toolchain & agentic systems. After installing the `llama-stack` package, the `llama` command should be available in your path.
+
+Start the server with the configured stack:
-**Install as a package**:
- Install directly from [PyPI](https://pypi.org/project/llama-stack/) with:
- ```bash
- pip install llama-stack
- ```
+
+```bash
+cd llama-stack/distributions/meta-reference-gpu
+llama stack run ./run.yaml
+```
-## 2. Download Llama models:
+
+*The server will start and listen on `http://localhost:5000` by default.*
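+
+Before moving on, you may want to confirm that the server is actually accepting connections. The short Python sketch below (assuming the default `localhost:5000` address used throughout this guide) simply waits until a TCP connection succeeds; it checks reachability only and does not exercise any particular API route:
+
+```python
+import socket
+import time
+
+# Poll until the Llama Stack server accepts TCP connections.
+# Assumes the default host and port from this guide (localhost:5000).
+while True:
+    try:
+        with socket.create_connection(("localhost", 5000), timeout=1):
+            print("Llama Stack server is reachable.")
+            break
+    except OSError:
+        print("Waiting for the Llama Stack server to come up...")
+        time.sleep(2)
+```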
+
+---
- ```
- llama download --model-id Llama3.1-8B-Instruct
- ```
- You will have to follow the instructions in the cli to complete the download, get a instant license here: URL to license.
+
+## Testing with `curl`
-## 3. Build->Configure->Run via Conda:
- For development, build a LlamaStack distribution from scratch.
+
+After setting up the server, verify it's working by sending a `POST` request using `curl`:
-
- **`llama stack build`**
- Enter build information interactively:
- ```bash
- llama stack build
- ```
-
- **`llama stack configure`**
- Run `llama stack configure ` using the name from the build step.
- ```bash
- llama stack configure my-local-stack
- ```
-
- **`llama stack run`**
- Start the server with:
- ```bash
- llama stack run my-local-stack
- ```
-
-## 4. Testing with Client
-
-After setup, test the server with a POST request:
 ```bash
 curl http://localhost:5000/inference/chat_completion \
 -H "Content-Type: application/json" \
 -d '{
@@ -66,34 +90,95 @@ curl http://localhost:5000/inference/chat_completion \
     "model": "Llama3.2-11B-Vision-Instruct",
     "messages": [
         {"role": "user", "content": "Write me a 2-sentence poem about the moon"}
     ],
     "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
 }'
 ```
-
-## 5. Inference
-
-After setup, test the server with a POST request:
-```bash
-curl http://localhost:5000/inference/chat_completion \
--H "Content-Type: application/json" \
--d '{
-    "model": "Llama3.1-8B-Instruct",
-    "messages": [
-        {"role": "system", "content": "You are a helpful assistant."},
-        {"role": "user", "content": "Write me a 2-sentence poem about the moon"}
-    ],
-    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
-}'
-```
+**Expected Output:**
+```json
+{
+  "completion_message": {
+    "role": "assistant",
+    "content": "The moon glows softly in the midnight sky,\nA beacon of wonder, as it catches the eye.",
+    "stop_reason": "out_of_tokens",
+    "tool_calls": []
+  },
+  "logprobs": null
+}
+```
+
+---
+
+## Testing with Python
+
+You can also interact with the Llama Stack server using a simple Python script. Below is an example:
+
+### 1. Install Required Python Packages
+The `llama-stack-client` library offers robust and efficient Python methods for interacting with the Llama Stack server.
+
+```bash
+pip install llama-stack-client
+```
+
+### 2. Create a Python Script (`test_llama_stack.py`)
+
+```python
+from llama_stack_client import LlamaStackClient
+from llama_stack_client.types import SystemMessage, UserMessage
+
+# Initialize the client
+client = LlamaStackClient(base_url="http://localhost:5000")
+
+# Create a chat completion request
+response = client.inference.chat_completion(
+    messages=[
+        SystemMessage(content="You are a helpful assistant.", role="system"),
+        UserMessage(content="Write me a 2-sentence poem about the moon", role="user")
+    ],
+    model="Llama3.2-11B-Vision-Instruct",
+)
+
+# Print the response
+print(response.completion_message.content)
+```
+
+### 3. Run the Python Script
+
+```bash
+python test_llama_stack.py
+```
+
+**Expected Output:**
+```
+The moon glows softly in the midnight sky,
+A beacon of wonder, as it catches the eye.
+```
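+
+If you prefer to see tokens as they are generated instead of waiting for the full reply, the client can also stream responses. The sketch below assumes that your installed `llama-stack-client` accepts a `stream=True` argument to `chat_completion` and yields chunks whose `event.delta` field carries the incremental text; field names may differ between client versions:
+
+```python
+from llama_stack_client import LlamaStackClient
+from llama_stack_client.types import UserMessage
+
+client = LlamaStackClient(base_url="http://localhost:5000")
+
+# Request a streamed chat completion; chunks arrive while the model generates text.
+stream = client.inference.chat_completion(
+    messages=[
+        UserMessage(content="Write me a 2-sentence poem about the moon", role="user")
+    ],
+    model="Llama3.2-11B-Vision-Instruct",
+    stream=True,
+)
+
+for chunk in stream:
+    # Each streamed chunk wraps an event holding the newly generated text delta.
+    print(chunk.event.delta, end="", flush=True)
+print()
+```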
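+
+As an additional sanity check, you can ask the running server which models it is serving. This sketch assumes the distribution exposes the models API through the client's `models.list()` helper; printing the raw entries avoids relying on any particular field layout:
+
+```python
+from llama_stack_client import LlamaStackClient
+
+client = LlamaStackClient(base_url="http://localhost:5000")
+
+# List the models registered with the running distribution.
+for model in client.models.list():
+    print(model)
+```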
+
+With these steps, you should have a functional Llama Stack setup capable of generating text using the specified model. For more detailed information and advanced configurations, refer to some of our documentation below.
+
+---
+
+## Next Steps
+
+- **Explore Other Guides**: Dive deeper into specific topics by following these guides:
+  - [Understanding Distributions](#)
+  - [Configure your Distro](#)
+  - [Doing Inference API Call and Fetching a Response from Endpoints](#)
+  - [Creating a Conversation Loop](#)
+  - [Sending Image to the Model](#)
+  - [Tool Calling: How to and Details](#)
+  - [Memory API: Show Simple In-Memory Retrieval](#)
+  - [Agents API: Explain Components](#)
+  - [Using Safety API in Conversation](#)
+  - [Prompt Engineering Guide](#)
+
+- **Explore Client SDKs**: Utilize our client SDKs for various languages to integrate Llama Stack into your applications:
+  - [Python SDK](https://github.com/meta-llama/llama-stack-client-python)
+  - [Node SDK](https://github.com/meta-llama/llama-stack-client-node)
+  - [Swift SDK](https://github.com/meta-llama/llama-stack-client-swift)
+  - [Kotlin SDK](https://github.com/meta-llama/llama-stack-client-kotlin)
+
+- **Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](./building_distro.md) guide.
+
+- **Explore Example Apps**: Check out [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) for example applications built using Llama Stack.
-Check our client SDKs for various languages: [Python](https://github.com/meta-llama/llama-stack-client-python), [Node](https://github.com/meta-llama/llama-stack-client-node), [Swift](https://github.com/meta-llama/llama-stack-client-swift), and [Kotlin](https://github.com/meta-llama/llama-stack-client-kotlin).
-
-## Advanced Guides
-
-For more on custom Llama Stack distributions, refer to our [Building a Llama Stack Distribution](./building_distro.md) guide.
+
+---
-## Next Steps:
-check out
-
-1.
-2.