# Configuring and Launching Llama Stack

This guide walks you through the two primary methods for setting up and running Llama Stack: using Docker containers and configuring the server manually.

## Method 1: Using the Starter Docker Container

The easiest way to get started with Llama Stack is to use the pre-built Docker container. This approach eliminates the need for manual dependency management and provides a consistent environment across different systems.

### Prerequisites

- Docker installed and running on your system
- Access to external model providers (e.g., Ollama running locally)

### Basic Docker Usage

Here's an example of spinning up the Llama Stack server using Docker:

```bash
docker run -it \
  -v ~/.llama:/root/.llama \
  --network=host \
  llamastack/distribution-starter \
  --env OLLAMA_URL=http://localhost:11434
```

### Docker Command Breakdown

- `-it`: Run in interactive mode with TTY allocation
- `-v ~/.llama:/root/.llama`: Mount your local Llama Stack configuration directory
- `--network=host`: Use host networking to access local services like Ollama
- `llamastack/distribution-starter`: The official Llama Stack Docker image
- `--env OLLAMA_URL=http://localhost:11434`: Set the environment variable for the Ollama URL
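
Once the container is started, a quick way to confirm it is running and reachable is sketched below. This is a minimal check that assumes the server listens on the default port 8321 used elsewhere in this guide (see Step 4 of Method 2 for fuller verification commands):

```bash
# Confirm the container started (filter by the image used above)
docker ps --filter ancestor=llamastack/distribution-starter

# Probe the server's health endpoint (default port 8321 assumed)
curl http://localhost:8321/health
```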

### Advanced Docker Configuration

You can customize the Docker deployment with additional environment variables:

```bash
docker run -it \
  -v ~/.llama:/root/.llama \
  -p 8321:8321 \
  -e OLLAMA_URL=http://localhost:11434 \
  -e BRAVE_SEARCH_API_KEY=your_api_key_here \
  -e TAVILY_SEARCH_API_KEY=your_api_key_here \
  llamastack/distribution-starter \
  --port 8321
```

Note that without `--network=host`, `localhost` inside the container refers to the container itself; if Ollama runs on the host, point `OLLAMA_URL` at an address the container can actually reach (for example `http://host.docker.internal:11434` on Docker Desktop).

### Environment Variables

Common environment variables you can set:

| Variable | Description | Example |
|----------|-------------|---------|
| `OLLAMA_URL` | URL for Ollama service | `http://localhost:11434` |
| `BRAVE_SEARCH_API_KEY` | API key for Brave search | `your_brave_api_key` |
| `TAVILY_SEARCH_API_KEY` | API key for Tavily search | `your_tavily_api_key` |
| `TOGETHER_API_KEY` | API key for Together AI | `your_together_api_key` |
| `OPENAI_API_KEY` | API key for OpenAI | `your_openai_api_key` |
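
Rather than repeating `-e` flags, you can collect these variables in a file and pass it to Docker with its `--env-file` option. A minimal sketch follows; the file name `llama-stack.env` and the key values are placeholders:

```bash
# llama-stack.env: one KEY=value per line (placeholder values shown)
cat > llama-stack.env <<'EOF'
OLLAMA_URL=http://localhost:11434
BRAVE_SEARCH_API_KEY=your_brave_api_key
TAVILY_SEARCH_API_KEY=your_tavily_api_key
EOF

# Pass the whole file to the container instead of individual -e flags
docker run -it \
  -v ~/.llama:/root/.llama \
  --network=host \
  --env-file llama-stack.env \
  llamastack/distribution-starter
```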

## Method 2: Manual Server Configuration and Launch

For more control over your Llama Stack deployment, you can configure and run the server manually.

### Prerequisites

1. **Install Llama Stack**:

   ```bash
   pip install llama-stack
   ```

2. **Install Provider Dependencies** (as needed):

   ```bash
   # For vector operations
   pip install faiss-cpu

   # For database operations
   pip install sqlalchemy aiosqlite asyncpg
   ```

### Step 1: Build a Distribution

Choose a distribution and build your Llama Stack:

```bash
# List available distributions
llama stack build --list-distros

# Build with a specific distro
llama stack build --distro watsonx --image-type venv --image-name watsonx-stack

# Or build with the meta-reference distro
llama stack build --distro meta-reference-gpu --image-type venv --image-name meta-reference-gpu-stack
```

### Available Distributions

- **dell**: Dell's distribution for Llama Stack.
- **open-benchmark**: Distribution for running open benchmarks.
- **watsonx**: For IBM watsonx integration.
- **starter**: Basic distribution with essential providers.

### Step 2: Configure Your Stack

After building, you can customize the configuration files:

#### Configuration File Locations

- Build config: `~/.llama/distributions/{stack-name}/{stack-name}-build.yaml`
- Runtime config: `~/.llama/distributions/{stack-name}/{stack-name}-run.yaml`
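
For example, after the `watsonx-stack` build from Step 1, you can confirm both files exist before editing them. A minimal sketch; substitute your own stack name:

```bash
# List the generated configuration files for a built stack
ls ~/.llama/distributions/watsonx-stack/
# Expected (per the locations above): watsonx-stack-build.yaml  watsonx-stack-run.yaml
```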

#### Sample Runtime Configuration

```yaml
version: 2

apis:
- inference
- safety
- embeddings
- tool_runtime

providers:
  inference:
  - provider_id: ollama
    provider_type: remote::ollama
    config:
      url: http://localhost:11434

  safety:
  - provider_id: llama-guard
    provider_type: remote::ollama
    config:
      url: http://localhost:11434

  embeddings:
  - provider_id: ollama-embeddings
    provider_type: remote::ollama
    config:
      url: http://localhost:11434

  tool_runtime:
  - provider_id: brave-search
    provider_type: remote::brave-search
    config:
      api_key: ${env.BRAVE_SEARCH_API_KEY:=}
```
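
If you edit the run config by hand, it can help to catch indentation mistakes before launching by parsing the file. A minimal sketch, assuming PyYAML is available in the same Python environment (it is not installed by this guide, so treat it as an optional extra) and using the watsonx stack name from Step 1 as a placeholder:

```bash
# Parse the run config; a YAML error here means the file needs fixing before launch
python -c "import yaml, sys; yaml.safe_load(open(sys.argv[1])); print('YAML parses cleanly')" \
  ~/.llama/distributions/watsonx-stack/watsonx-stack-run.yaml
```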

### Step 3: Launch the Server

Start your configured Llama Stack server:

```bash
# Run with specific port
llama stack run {stack-name} --port 8321

# Run with environment variables
OLLAMA_URL=http://localhost:11434 llama stack run starter --port 8321

# Run in background
nohup llama stack run starter --port 8321 > llama_stack.log 2>&1 &
```
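
If you start the server in the background as above, the sketch below shows one way to follow its log and shut it down later. It uses standard `tail` and `pkill -f` and assumes only one matching `llama stack run` process is active:

```bash
# Follow the background server's log
tail -f llama_stack.log

# Stop the background server by matching its command line
pkill -f "llama stack run starter"
```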

### Step 4: Verify Installation

Test your Llama Stack server:

```bash
# Check server health
curl http://localhost:8321/health

# List available models
curl http://localhost:8321/v1/models

# Test chat completion
curl -X POST http://localhost:8321/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
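
Right after launch, the server can take a few seconds to come up, so the first health check may fail. A minimal wait-until-ready sketch (the 30-attempt limit and 2-second interval are arbitrary choices):

```bash
# Poll the health endpoint until the server responds or we give up
for i in $(seq 1 30); do
  if curl -sf http://localhost:8321/health > /dev/null; then
    echo "Llama Stack is up"
    break
  fi
  echo "Waiting for Llama Stack... (attempt $i)"
  sleep 2
done
```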

## Configuration Management

### Managing Multiple Stacks

You can maintain multiple stack configurations:

```bash
# List all built stacks
llama stack list

# Remove a stack
llama stack rm {stack-name}

# Rebuild with updates
llama stack build --distro starter --image-type venv --image-name starter-v2
```

### Common Configuration Issues

#### Files Provider Missing

If you encounter "files provider not available" errors:

1. Add the files API to your configuration:

   ```yaml
   apis:
   - files  # Add this line
   - inference
   - safety
   ```

2. Add a files provider:

   ```yaml
   providers:
     files:
     - provider_id: localfs
       provider_type: inline::localfs
       config:
         kvstore:
           type: sqlite
           db_path: ~/.llama/files_store.db
   ```

#### Port Conflicts

If port 8321 is already in use:

```bash
# Check what's using the port
netstat -tlnp | grep :8321

# Use a different port
llama stack run starter --port 8322
```

## Troubleshooting

### Common Issues

1. **Docker Permission Denied**:

   ```bash
   sudo docker run -it \
     -v ~/.llama:/root/.llama \
     --network=host \
     llamastack/distribution-starter
   ```

2. **Module Not Found Errors**:

   ```bash
   # Install missing dependencies
   pip install ibm-watsonx-ai faiss-cpu sqlalchemy aiosqlite
   ```

3. **Provider Connection Issues**:
   - Verify external services (Ollama, APIs) are running
   - Check network connectivity and firewall settings
   - Validate API keys and URLs (see the sketch below for a quick check)
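
As a concrete example of those checks, the sketch below probes the Ollama URL used earlier in this guide and confirms a search API key is exported. It assumes Ollama's default address when `OLLAMA_URL` is unset, and it only reports reachability, not whether the key is valid:

```bash
# Is Ollama answering at the URL the stack will use?
curl -s --max-time 5 "${OLLAMA_URL:-http://localhost:11434}" > /dev/null \
  && echo "Ollama reachable" || echo "Ollama NOT reachable"

# Is the search API key actually exported in this shell?
[ -n "$BRAVE_SEARCH_API_KEY" ] && echo "BRAVE_SEARCH_API_KEY is set" \
  || echo "BRAVE_SEARCH_API_KEY is empty"
```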

### Logs and Debugging

Enable detailed logging:

```bash
# Run with debug logging
llama stack run starter --port 8321 --log-level DEBUG

# Check logs in Docker
docker logs <container-id>
```

## Next Steps

Once your Llama Stack server is running:

1. **Explore the APIs**: Test inference, safety, and embeddings endpoints
2. **Integrate with Applications**: Use the server with LangChain, custom applications, or API clients
3. **Scale Your Deployment**: Consider load balancing and high-availability setups
4. **Monitor Performance**: Set up logging and monitoring for production use

For more advanced configurations and production deployments, refer to the [Advanced Configuration Guide](advanced_configuration.md) and [Production Deployment Guide](production_deployment.md).