
LangChain + Llama Stack Document Processing

This directory contains two different implementations of document processing using LangChain and Llama Stack:

  1. langchain_llamastack_updated.py - Interactive CLI version
  2. langchain_llamastack_ray.py - Ray Serve API version

Both versions provide AI-powered document processing capabilities including summarization, fact extraction, and question-answering.


📋 Prerequisites

System Requirements

  • Python 3.12+
  • Ray Serve (for API version)
  • Llama Stack server running on http://localhost:8321/
  • Ollama or compatible model server

Required Python Packages

pip install llama-stack-client langchain langchain-core langchain-community
pip install beautifulsoup4 markdownify readability-lxml requests
pip install "ray[serve]" starlette  # For Ray Serve version only
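
After installing, a quick import check confirms the environment is ready. This is a minimal sketch; the module names below are the import names of the packages listed above:

import langchain
import langchain_community
import llama_stack_client
import bs4          # beautifulsoup4
import markdownify
import readability  # readability-lxml
import requests

print("All core dependencies are importable")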

Environment Setup

# Create and activate virtual environment
python3.12 -m venv llama-env-py312
source llama-env-py312/bin/activate

# Install dependencies
pip install llama-stack-client langchain langchain-core langchain-community beautifulsoup4 markdownify readability-lxml requests "ray[serve]" starlette

🚀 Quick Start

Start Llama Stack Server

Before running either version, ensure your Llama Stack server is running:

# Start Llama Stack server (example)
llama stack run your-config --port 8321
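
You can verify the server is reachable before launching either script. This probe is an illustrative sketch: it only tests that the port accepts connections, so even a 404 response counts as a pass:

import requests

# Probe the Llama Stack port; any HTTP response means the server is up
try:
    requests.get("http://localhost:8321/", timeout=5)
    print("Llama Stack server is reachable")
except requests.ConnectionError:
    print("Server not reachable - start it before running the demos")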

📖 Option 1: Interactive CLI Version (langchain_llamastack_updated.py)

Features

  • Interactive command-line interface
  • Document loading from URLs and PDFs
  • AI-powered summarization and fact extraction
  • Question-answering based on document content
  • Session-based document storage

How to Run

# Activate environment
source llama-env-py312/bin/activate

# Run the interactive CLI
cd /path/to/langchain_llamastack  # adjust to where you placed the scripts
python langchain_llamastack_updated.py

Usage Commands

Once running, you can use these interactive commands:

🎯 Interactive Document Processing Demo
Commands:
  load <url_or_path>  - Process a document
  ask <question>      - Ask about the document
  summary            - Show document summary
  facts              - Show extracted facts
  help               - Show commands
  quit               - Exit demo

Example Session

> load https://en.wikipedia.org/wiki/Artificial_intelligence
📄 Loading document from: https://en.wikipedia.org/wiki/Artificial_intelligence
✅ Loaded 45,832 characters
📝 Generating summary...
🔍 Extracting key facts...
✅ Processing complete!

> summary
📝 Summary:
Artificial intelligence (AI) is the simulation of human intelligence...

> ask What are the main types of AI?
💬 Q: What are the main types of AI?
📝 A: Based on the document, the main types of AI include...

> facts
🔍 Key Facts:
- AI was founded as an academic discipline in 1956
- Machine learning is a subset of AI...

> quit
👋 Thanks for exploring LangChain chains!

🌐 Option 2: Ray Serve API Version (langchain_llamastack_ray.py)

Features

  • RESTful HTTP API
  • Persistent service (runs indefinitely)
  • Multiple endpoints for different operations
  • JSON request/response format
  • Concurrent request handling

How to Run

# Activate environment
source llama-env-py312/bin/activate

# Start the Ray Serve API
cd /path/to/langchain_llamastack  # adjust to where you placed the scripts
python langchain_llamastack_ray.py

Service Endpoints

| Method | Endpoint | Description | Parameters |
|--------|----------|-------------|------------|
| GET | / | Service status | None |
| POST | /process | Process document | {"source": "url_or_path"} |
| POST | /ask | Ask question | {"question": "text", "source": "optional"} |
| GET | /summary | Get summary | ?source=url (optional) |
| GET | /facts | Get facts | ?source=url (optional) |
| GET | /docs | List documents | None |

API Usage Examples

Using curl:

# Check service status
curl http://localhost:8000/

# Process a document
curl -X POST http://localhost:8000/process \
     -H 'Content-Type: application/json' \
     -d '{"source": "https://en.wikipedia.org/wiki/Machine_learning"}'

# Ask a question
curl -X POST http://localhost:8000/ask \
     -H 'Content-Type: application/json' \
     -d '{"question": "What is machine learning?"}'

# Get summary
curl http://localhost:8000/summary

# Get facts
curl http://localhost:8000/facts

# List all processed documents
curl http://localhost:8000/docs

Using Python requests:

import requests

# Process a document
response = requests.post(
    "http://localhost:8000/process",
    json={"source": "https://en.wikipedia.org/wiki/Deep_learning"}
)
print(response.json())

# Ask a question
response = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What are neural networks?"}
)
print(response.json())

# Get facts
response = requests.get("http://localhost:8000/facts")
print(response.json())
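
Since the Ray Serve version handles concurrent requests, you can also fan out several questions in parallel. A minimal sketch (the questions are placeholders, and it assumes a document has already been processed):

from concurrent.futures import ThreadPoolExecutor

import requests

questions = [
    "What is this document about?",
    "What are the key findings?",
    "Who is the intended audience?",
]

def ask(question: str) -> dict:
    # Each call hits the /ask endpoint; Ray Serve serves them concurrently
    response = requests.post(
        "http://localhost:8000/ask",
        json={"question": question},
    )
    return response.json()

with ThreadPoolExecutor(max_workers=3) as pool:
    for answer in pool.map(ask, questions):
        print(answer)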

🔧 Configuration

Model Configuration

Both versions use these models by default:

  • Model ID: llama3.2:3b
  • Llama Stack URL: http://localhost:8321/

To change the model, edit the model_id parameter in the respective files.
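
For reference, a minimal sketch of what that configuration might look like (the variable names here are illustrative, not necessarily the ones used in the scripts):

from llama_stack_client import LlamaStackClient

# Point the client at your Llama Stack server and pick a served model
client = LlamaStackClient(base_url="http://localhost:8321")
model_id = "llama3.2:3b"  # change to any model your server reports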

Supported Document Types

  • URLs: Any web page (extracted using readability; see the sketch below)
  • PDF files: Local or remote PDF documents
  • Plain text files: Not yet supported, but can be added if needed
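
The URL path roughly follows the standard readability pattern. Here is a minimal sketch using the packages from the prerequisites (load_url is a hypothetical helper for illustration, not a function exported by the scripts):

import requests
from markdownify import markdownify
from readability import Document  # provided by readability-lxml

def load_url(url: str) -> str:
    """Fetch a web page and reduce it to readable text."""
    html = requests.get(url, timeout=30).text
    main_html = Document(html).summary()  # strip navigation/ads, keep article body
    return markdownify(main_html)         # convert remaining HTML to plain text

print(load_url("https://en.wikipedia.org/wiki/Artificial_intelligence")[:500])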

🛠️ Troubleshooting

Common Issues

1. Connection Refused to Llama Stack

Error: Connection refused to http://localhost:8321/

Solution:

  • Ensure Llama Stack server is running
  • Check if port 8321 is correct
  • Verify network connectivity

2. Model Not Found

Error: Model not found: llama3.2:3b

Solution:

  • Check available models: curl http://localhost:8321/models/list
  • Update model_id in the code to match an available model (the Python SDK can also list models, as sketched below)
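
A minimal sketch of the same check via the llama-stack-client SDK (assumes the default server URL from this README):

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Print the identifier of every model the server currently serves
for model in client.models.list():
    print(model.identifier)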

3. Ray Serve Port Already in Use

Error: Port 8000 already in use

Solution:

# Kill process using port 8000
lsof -ti :8000 | xargs kill -9

# Or use a different port by modifying the code

4. Missing Dependencies

Error: ModuleNotFoundError: No module named 'ray'

Solution:

pip install "ray[serve]" starlette

Debug Mode

To enable verbose logging, add this to the beginning of either file:

import logging
logging.basicConfig(level=logging.DEBUG)

📊 Performance Notes

CLI Version

  • Pros: Simple to use, interactive, good for testing
  • Cons: Single-threaded, session-based only
  • Best for: Development, testing, manual document analysis

Ray Serve Version

  • Pros: Concurrent requests, persistent service, API integration
  • Cons: More complex setup, requires Ray
  • Best for: Production, integration with other services, high throughput

🛑 Stopping Services

CLI Version

  • Press Ctrl+C or type quit in the interactive prompt

Ray Serve Version

  • Press Ctrl+C in the terminal running the service
  • The service will shut down gracefully and clean up resources

📝 Examples

CLI Workflow

  1. Start: python langchain_llamastack_updated.py
  2. Load document: load https://arxiv.org/pdf/2103.00020.pdf
  3. Get summary: summary
  4. Ask questions: ask What are the main contributions?
  5. Exit: quit

API Workflow

  1. Start: python langchain_llamastack_ray.py
  2. Process: curl -X POST http://localhost:8000/process -H 'Content-Type: application/json' -d '{"source": "https://example.com"}'
  3. Query: curl -X POST http://localhost:8000/ask -H 'Content-Type: application/json' -d '{"question": "What is this about?"}'
  4. Stop: Ctrl+C

🤝 Contributing

To extend functionality:

  1. Add new prompt templates for different analysis types
  2. Support additional document formats
  3. Add caching for processed documents (see the sketch below)
  4. Implement user authentication for API version
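
For example, document caching could start as a simple in-memory memo keyed by source. In this sketch, process_document is a stand-in for the existing summarize/extract pipeline, and both helper names are hypothetical:

def process_document(source: str) -> dict:
    # Stand-in for the existing summarize/extract pipeline
    return {"source": source, "summary": "...", "facts": []}

_cache: dict[str, dict] = {}

def process_document_cached(source: str) -> dict:
    """Return cached results for sources that were already processed."""
    if source not in _cache:
        _cache[source] = process_document(source)
    return _cache[source]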

📜 License

This project is for educational and research purposes.