beef up quickstart
commit d200a6b002 (parent 2898a9bc9e): 1 changed file with 151 additions and 66 deletions
# Llama Stack Quickstart Guide
This guide will walk you through setting up an end-to-end workflow with Llama Stack, enabling you to perform text generation using the `Llama3.2-11B-Vision-Instruct` model. Follow these steps to get started quickly.
## Table of Contents
1. [Prerequisite](#prerequisite)
2. [Installation](#installation)
3. [Download Llama Models](#download-llama-models)
4. [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
5. [Testing with `curl`](#testing-with-curl)
6. [Testing with Python](#testing-with-python)
7. [Next Steps](#next-steps)

---
## Prerequisite
Ensure you have the following installed on your system:
- **Conda**: A package, dependency, and environment management tool.
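
If Conda is not installed yet, the Miniconda installer is a quick way to get it. The snippet below is a minimal sketch for Linux x86_64 using Anaconda's standard installer URL and silent-install flags; adjust the installer name and install prefix for your platform.

```bash
# Download the Miniconda installer, install silently into ~/miniconda3, and verify
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p "$HOME/miniconda3"
"$HOME/miniconda3/bin/conda" --version
```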
---
## Installation
The `llama` CLI tool helps you manage the Llama Stack toolchain and agent systems.
**Install via PyPI:**
```bash
pip install llama-stack
```
*After installation, the `llama` command should be available in your PATH.*
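
To sanity-check the installation, you can print the CLI's help text; this relies only on the standard `--help` flag and should list the available subcommands.

```bash
# Confirm the llama CLI is on your PATH and see its subcommands
llama --help
```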
---
## Download Llama Models
Download the necessary Llama model checkpoints using the `llama` CLI:
```bash
llama download --model-id Llama3.2-11B-Vision-Instruct
```
*Follow the CLI prompts to complete the download. You may need to accept a license agreement. Obtain an instant license [here](https://www.llama.com/llama-downloads/).*
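
If you want to double-check the exact model IDs the CLI accepts before downloading, it also ships a model listing subcommand; the invocation below is assumed to match your installed version (run `llama --help` if it differs).

```bash
# List the model IDs known to the CLI so --model-id can be copied exactly
llama model list
```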
---
## Build, Configure, and Run Llama Stack
### 1. Build the Llama Stack Distribution
We will build the `meta-reference-gpu` distribution by default; you can read more about the other available distributions [here](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/index.html).
```bash
llama stack build --template meta-reference-gpu --image-type conda
```
### 2. Run the Llama Stack Distribution
> Launching a distribution initializes and configures the necessary APIs and Providers, enabling seamless interaction with the underlying model.

Start the server with the configured stack:
```bash
cd llama-stack/distributions/meta-reference-gpu
llama stack run ./run.yaml
```
*The server will start and listen on `http://localhost:5000` by default.*
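
If port 5000 is already in use on your machine, you can usually pick another one when launching the stack; the `--port` flag shown below is an assumption about your CLI version, so check `llama stack run --help` if it is not recognized.

```bash
# Run the same stack on an alternate port (e.g. 5001)
llama stack run ./run.yaml --port 5001
```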
---
## Testing with `curl`
After setting up the server, verify it's working by sending a `POST` request using `curl`:
```bash
curl http://localhost:5000/inference/chat_completion \
-H "Content-Type: application/json" \
-d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Write me a 2-sentence poem about the moon"}
    ],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
}'
```
**Expected Output:**

```json
{
    "completion_message": {
        "role": "assistant",
        "content": "The moon glows softly in the midnight sky,\nA beacon of wonder, as it catches the eye.",
        "stop_reason": "out_of_tokens",
        "tool_calls": []
    },
    "logprobs": null
}
```
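The response comes back as a single JSON object, so while experimenting it can help to pipe it through a pretty-printer. This variant assumes `python3` is on your PATH; any JSON formatter (such as `jq`) works equally well.

```bash
# Same request as above, with the response pretty-printed
curl -s http://localhost:5000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Write me a 2-sentence poem about the moon"}],
    "sampling_params": {"temperature": 0.7, "seed": 42, "max_tokens": 512}
  }' | python3 -m json.tool
```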
---
## Testing with Python
You can also interact with the Llama Stack server using a simple Python script. Below is an example:
### 1. Install Required Python Packages
The `llama-stack-client` library provides a robust and efficient Python interface for interacting with the Llama Stack server.
```bash
pip install llama-stack-client
```
### 2. Create a Python Script (`test_llama_stack.py`)
```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types import SystemMessage, UserMessage

# Initialize the client
client = LlamaStackClient(base_url="http://localhost:5000")

# Create a chat completion request
response = client.inference.chat_completion(
    messages=[
        SystemMessage(content="You are a helpful assistant.", role="system"),
        UserMessage(content="Write me a 2-sentence poem about the moon", role="user"),
    ],
    model="Llama3.1-8B-Instruct",
)

# Print the response
print(response.completion_message.content)
```
### 3. Run the Python Script
```bash
python test_llama_stack.py
```
**Expected Output:**
```
The moon glows softly in the midnight sky,
A beacon of wonder, as it catches the eye.
```
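
If you get a connection error instead of the poem, the most common cause is that the Llama Stack server is not running or is listening on a different port. A small guard around the same call used in the script above makes that failure mode obvious (the broad `except` is deliberate, since the exact exception type depends on the client's underlying transport):

```python
from llama_stack_client import LlamaStackClient
from llama_stack_client.types import UserMessage

client = LlamaStackClient(base_url="http://localhost:5000")

try:
    response = client.inference.chat_completion(
        messages=[UserMessage(content="Say hello in one short sentence.", role="user")],
        model="Llama3.1-8B-Instruct",
    )
    print(response.completion_message.content)
except Exception as exc:  # surfaces connection/configuration problems clearly
    print(f"Request failed -- is the Llama Stack server running on port 5000? ({exc})")
```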
With these steps, you should have a functional Llama Stack setup capable of generating text using the specified model. For more detailed information and advanced configurations, refer to the documentation below.

---
## Next Steps
- **Explore Other Guides**: Dive deeper into specific topics by following these guides:
  - [Understanding Distributions](#)
  - [Configure your Distro](#)
  - [Doing Inference API Call and Fetching a Response from Endpoints](#)
  - [Creating a Conversation Loop](#)
  - [Sending Image to the Model](#)
  - [Tool Calling: How to and Details](#)
  - [Memory API: Show Simple In-Memory Retrieval](#)
  - [Agents API: Explain Components](#)
  - [Using Safety API in Conversation](#)
  - [Prompt Engineering Guide](#)

- **Explore Client SDKs**: Utilize our client SDKs for various languages to integrate Llama Stack into your applications:
  - [Python SDK](https://github.com/meta-llama/llama-stack-client-python)
  - [Node SDK](https://github.com/meta-llama/llama-stack-client-node)
  - [Swift SDK](https://github.com/meta-llama/llama-stack-client-swift)
  - [Kotlin SDK](https://github.com/meta-llama/llama-stack-client-kotlin)

- **Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](./building_distro.md) guide.

- **Explore Example Apps**: Check out [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) for example applications built using Llama Stack.