Added initial documentation for how to configure and launch llama stack

2025-12-04 10:10:36 +00:00 · 2025-09-15 12:10:42 -07:00 · 5900bce4a5
commit 5900bce4a5
parent b6cb817897
1 changed files with 284 additions and 0 deletions
--- a/docs/source/getting_started/configuring_and_launching_llama_stack.md
+++ b/docs/source/getting_started/configuring_and_launching_llama_stack.md
@@ -0,0 +1,284 @@
 # Configuring and Launching Llama Stack
 This guide walks you through the two primary ways to set up and run Llama Stack: the pre-built Docker container, and a manually built and configured server.
 ## Method 1: Using the Starter Docker Container
 The easiest way to get started with Llama Stack is using the pre-built Docker container. This approach eliminates the need for manual dependency management and provides a consistent environment across different systems.
 ### Prerequisites
 - Docker installed and running on your system
 - Access to external model providers (e.g., Ollama running locally)
 ### Basic Docker Usage
 The following command starts the Llama Stack server in Docker:
 ```bash
 docker run -it \
  -v ~/.llama:/root/.llama \
  --network=host \
  llamastack/distribution-starter \
  --env OLLAMA_URL=http://localhost:11434
 ```
 ### Docker Command Breakdown
 - `-it`: Run in interactive mode with TTY allocation
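 - `-v ~/.llama:/root/.llama`: Mount your local Llama Stack configuration directory (`~/.llama`) into the container so that configuration persists across runs
 - `--network=host`: Share the host's network stack so the container can reach local services such as Ollama at `localhost`
 - `llamastack/distribution-starter`: The official Llama Stack Docker image
 - `--env OLLAMA_URL=http://localhost:11434`: Because it follows the image name, this argument is passed to the Llama Stack server rather than to Docker; it sets the Ollama URL the server uses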
 ### Advanced Docker Configuration
 You can customize the Docker deployment by publishing the server port with `-p` and passing additional environment variables with `-e`:
 ```bash
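 # With -p instead of --network=host, localhost inside the container is the
 # container itself, so OLLAMA_URL must name the host. host.docker.internal works
 # on Docker Desktop; on Linux, add --add-host=host.docker.internal:host-gateway
 # and make sure Ollama listens on a non-loopback address.
 docker run -it \
  -v ~/.llama:/root/.llama \
  -p 8321:8321 \
  -e OLLAMA_URL=http://host.docker.internal:11434 \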
  -e BRAVE_SEARCH_API_KEY=your_api_key_here \
  -e TAVILY_SEARCH_API_KEY=your_api_key_here \
  llamastack/distribution-starter \
  --port 8321
 ```
 ### Environment Variables
 Common environment variables you can set:
 | Variable | Description | Example |
 |----------|-------------|---------|
 | `OLLAMA_URL` | URL for Ollama service | `http://localhost:11434` |
 | `BRAVE_SEARCH_API_KEY` | API key for Brave search | `your_brave_api_key` |
 | `TAVILY_SEARCH_API_KEY` | API key for Tavily search | `your_tavily_api_key` |
 | `TOGETHER_API_KEY` | API key for Together AI | `your_together_api_key` |
 | `OPENAI_API_KEY` | API key for OpenAI | `your_openai_api_key` |
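
Docker's `-e NAME` form (with no `=value`) copies an exported variable from the host into the container, which keeps API keys off the command line. The sketch below assumes a hypothetical helper named `stack_env_flags`, which is not part of Llama Stack or Docker; it prints one such flag for each variable in the table that is currently set:

```shell
#!/bin/sh
# Hypothetical helper (not part of Llama Stack or Docker): print "-e NAME"
# for each variable from the table that is set and non-empty in this shell.
stack_env_flags() {
  flags=""
  for name in OLLAMA_URL BRAVE_SEARCH_API_KEY TAVILY_SEARCH_API_KEY \
              TOGETHER_API_KEY OPENAI_API_KEY; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      flags="$flags -e $name"
    fi
  done
  # Drop the leading space before printing.
  printf '%s\n' "${flags# }"
}
```

One possible use is `docker run -it -p 8321:8321 $(stack_env_flags) llamastack/distribution-starter --port 8321`, leaving `$(stack_env_flags)` unquoted so the shell splits it into separate arguments. The variables must be exported for Docker to see their values.
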
 ## Method 2: Manual Server Configuration and Launch
 For more control over your Llama Stack deployment, you can configure and run the server manually.
 ### Prerequisites
 1. **Install Llama Stack**:
   ```bash
   pip install llama-stack
   ```
 2. **Install Provider Dependencies** (as needed):
   ```bash
   # For vector operations
   pip install faiss-cpu
   # For database operations
   pip install sqlalchemy aiosqlite asyncpg
   ```
 ### Step 1: Build a Distribution
 Choose a distro and build your Llama Stack distribution:
 ```bash
 # List available distributions
 llama stack build --list-distros
 # Build with a specific distro
 llama stack build --distro watsonx --image-type venv --image-name watsonx-stack
 # Or build with a meta-reference distro
 llama stack build --distro meta-reference-gpu --image-type venv --image-name meta-reference-gpu-stack
 ```
 ### Selected Distributions
 - **dell**: Dell's distribution for Llama Stack.
 - **open-benchmark**: Distribution for running open benchmarks.
 - **watsonx**: Distribution for IBM watsonx integration.
 - **starter**: Basic distribution with essential providers.
 ### Step 2: Configure Your Stack
 After building, you can customize the configuration files:
 #### Configuration File Locations
 - Build config: `~/.llama/distributions/{stack-name}/{stack-name}-build.yaml`
 - Runtime config: `~/.llama/distributions/{stack-name}/{stack-name}-run.yaml`
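
To locate both files for a given stack, the path pattern above can be expanded in the shell. This is a minimal sketch: `stack_config_paths` is a hypothetical helper name, and the layout is assumed to match the two paths listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the build and runtime config paths for a stack,
# assuming the ~/.llama/distributions/{stack-name}/ layout listed above.
stack_config_paths() {
  stack_name=$1
  base="$HOME/.llama/distributions/${stack_name}"
  printf '%s\n' "${base}/${stack_name}-build.yaml" "${base}/${stack_name}-run.yaml"
}
```

Running `stack_config_paths starter` prints the two paths, one per line, which can then be opened in an editor or checked with `ls`.
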
 #### Sample Runtime Configuration
 ```yaml
 version: 2
 apis:
 - inference
 - safety
 - embeddings
 - tool_runtime
 providers:
  inference:
  - provider_id: ollama
    provider_type: remote::ollama
    config:
      url: http://localhost:11434
  safety:
  - provider_id: llama-guard
    provider_type: remote::ollama
    config:
      url: http://localhost:11434
  embeddings:
  - provider_id: ollama-embeddings
    provider_type: remote::ollama
    config:
      url: http://localhost:11434
  tool_runtime:
  - provider_id: brave-search
    provider_type: remote::brave-search
    config:
      api_key: ${env.BRAVE_SEARCH_API_KEY:=}
 ```
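
The `api_key` value in the sample uses an environment reference of the form `${env.NAME:=default}`. The sketch below imitates how such a reference might resolve, assuming that `:=` means "use the default when the variable is unset or empty". `resolve_env_ref` is a hypothetical illustration, not Llama Stack's actual resolver.

```shell
#!/bin/sh
# Hypothetical illustration of resolving "${env.NAME:=default}".
# Assumption: the default applies when NAME is unset or empty.
resolve_env_ref() {
  ref=$1
  prefix='${env.'
  suffix='}'
  # Strip the leading "${env." and the closing "}".
  body=${ref#"$prefix"}
  body=${body%"$suffix"}
  # Split "NAME:=default" into its two parts.
  name=${body%%:=*}
  case $body in
    *:=*) default=${body#*:=} ;;
    *)    default="" ;;
  esac
  # Refuse anything that is not a plain variable name before using eval.
  case $name in
    ''|[0-9]*|*[!A-Za-z0-9_]*) echo "bad reference: $ref" >&2; return 1 ;;
  esac
  eval "value=\${$name:-}"
  if [ -n "$value" ]; then
    printf '%s\n' "$value"
  else
    printf '%s\n' "$default"
  fi
}
```

Under that assumption, the sample's `${env.BRAVE_SEARCH_API_KEY:=}` yields an empty string when the key is not set, so the configuration still loads without it.
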
 ### Step 3: Launch the Server
 Start your configured Llama Stack server:
 ```bash
 # Run with specific port
 llama stack run {stack-name} --port 8321
 # Run with environment variables
 OLLAMA_URL=http://localhost:11434 llama stack run starter --port 8321
 # Run in background
 nohup llama stack run starter --port 8321 > llama_stack.log 2>&1 &
 ```
 ### Step 4: Verify Installation
 Test your Llama Stack server:
 ```bash
 # Check server health
 curl http://localhost:8321/v1/health
 # List available models
 curl http://localhost:8321/v1/models
 # Test chat completion
 curl -X POST http://localhost:8321/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
 ```
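
When the server is started in the background, the first health check can arrive before the server is listening. A small polling loop is one way to handle that. In this sketch, `wait_for_stack` is a hypothetical helper, and the one-second interval and default of 30 attempts are arbitrary choices:

```shell
#!/bin/sh
# Hypothetical helper: poll a health URL until it answers or attempts run out.
# Usage: wait_for_stack URL [ATTEMPTS]
wait_for_stack() {
  url=$1
  attempts=${2:-30}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides the progress meter.
    if curl -fs "$url" >/dev/null 2>&1; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

Call it with the same health URL used in the check above, then run the remaining verification commands once it reports `ready`.
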
 ## Configuration Management
 ### Managing Multiple Stacks
 You can maintain multiple stack configurations:
 ```bash
 # List all built stacks
 llama stack list
 # Remove a stack
 llama stack rm {stack-name}
 # Rebuild with updates
 llama stack build --distro starter --image-type venv --image-name starter-v2
 ```
 ### Common Configuration Issues
 #### Files Provider Missing
 If you encounter "files provider not available" errors:
 1. Add the `files` API to the `apis` list in your runtime configuration:
   ```yaml
   apis:
   - files  # Add this line
   - inference
   - safety
   ```
 2. Add a files provider under `providers`:
   ```yaml
   providers:
     files:
     - provider_id: localfs
       provider_type: inline::localfs
       config:
         kvstore:
           type: sqlite
           db_path: ~/.llama/files_store.db
   ```
 #### Port Conflicts
 If port 8321 is already in use:
 ```bash
 # Check what's using the port
 netstat -tlnp | grep :8321
 # Use a different port
 llama stack run starter --port 8322
 ```
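
Picking the next free port can also be scripted. The sketch below assumes a Linux host with `ss` or `netstat` available; `port_in_use` and `first_free_port` are hypothetical helper names, and the search is capped at 100 ports as an arbitrary limit:

```shell
#!/bin/sh
# Hypothetical helpers; assumes Linux-style `ss -tln` or `netstat -tln` output.
port_in_use() {
  if command -v ss >/dev/null 2>&1; then
    ss -tln 2>/dev/null | grep -Eq "[:.]$1[[:space:]]"
  else
    netstat -tln 2>/dev/null | grep -Eq "[:.]$1[[:space:]]"
  fi
}

# Print the first port at or above the given one that has no TCP listener.
first_free_port() {
  port=$1
  last=$((port + 99))
  while [ "$port" -le "$last" ]; do
    if ! port_in_use "$port"; then
      printf '%s\n' "$port"
      return 0
    fi
    port=$((port + 1))
  done
  return 1
}
```

For example, `llama stack run starter --port "$(first_free_port 8321)"` starts the server on 8321 when it is free and on the next available port otherwise.
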
 ## Troubleshooting
 ### Common Issues
 1. **Docker Permission Denied**: Run Docker with `sudo` (or add your user to the `docker` group so that `sudo` is not needed):
   ```bash
   sudo docker run -it \
     -v ~/.llama:/root/.llama \
     --network=host \
     llamastack/distribution-starter
   ```
 2. **Module Not Found Errors**: Install the provider dependencies named in the error message, for example:
   ```bash
   # Install missing dependencies
   pip install ibm-watsonx-ai faiss-cpu sqlalchemy aiosqlite
   ```
 3. **Provider Connection Issues**:
   - Verify external services (Ollama, APIs) are running
   - Check network connectivity and firewall settings
   - Validate API keys and URLs
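
For the Ollama case, the first check in the list above can be scripted against Ollama's `/api/tags` endpoint, which lists locally available models. This is a sketch only: `check_ollama` is a hypothetical helper, and the five-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: report whether an Ollama server answers at the given
# base URL (defaults to $OLLAMA_URL, then to http://localhost:11434).
check_ollama() {
  base=${1:-${OLLAMA_URL:-http://localhost:11434}}
  # Remove a trailing slash so the joined URL has exactly one.
  base=${base%/}
  if curl -fsS --max-time 5 "$base/api/tags" >/dev/null 2>&1; then
    echo "ok: Ollama answered at $base"
  else
    echo "unreachable: $base" >&2
    return 1
  fi
}
```

Running `check_ollama` on the host, and again from inside the container if Docker is used, helps show whether the problem is the service itself or the network path to it.
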
 ### Logs and Debugging
 Enable detailed logging:
 ```bash
 # Run with debug logging
 llama stack run starter --port 8321 --log-level DEBUG
 # Check logs in Docker
 docker logs <container-id>
 ```
 ## Next Steps
 Once your Llama Stack server is running:
 1. **Explore the APIs**: Test inference, safety, and embeddings endpoints
 2. **Integrate with Applications**: Use the server with LangChain, custom applications, or API clients
 3. **Scale Your Deployment**: Consider load balancing and high-availability setups
 4. **Monitor Performance**: Set up logging and monitoring for production use
 For more advanced configurations and production deployments, refer to the [Advanced Configuration Guide](advanced_configuration.md) and [Production Deployment Guide](production_deployment.md).