diff --git a/docs/notebooks/langChain/README.md b/docs/notebooks/langChain/README.md
deleted file mode 100644
index a6dbd2266..000000000
--- a/docs/notebooks/langChain/README.md
+++ /dev/null
@@ -1,233 +0,0 @@
-# LangChain + Llama Stack Document Processing
-
-This demo combines LangChain prompt chains with a Llama Stack inference server for document summarization, fact extraction, and question answering:
-
-1. **`langchain-llama-stack.py`** - Interactive CLI version
-
----
-
-## šŸ“‹ Prerequisites
-
-### System Requirements
-- Python 3.12+
-- Llama Stack server running on `http://localhost:8321/`
-- Ollama or compatible model server
-
-### Required Python Packages
-```bash
-pip install llama-stack-client langchain langchain-core langchain-community
-pip install beautifulsoup4 markdownify readability-lxml requests
-```
-
-### Environment Setup
-```bash
-# Create and activate virtual environment
-python3.12 -m venv llama-env-py312
-source llama-env-py312/bin/activate
-
-# Install dependencies
-pip install llama-stack-client langchain langchain-core langchain-community beautifulsoup4 markdownify readability-lxml requests
-```
-
----
-
-## šŸš€ Quick Start
-
-### Start Llama Stack Server
-Before running either version, make sure your Llama Stack server is running:
-```bash
-# Start Llama Stack server (example)
-llama stack run your-config --port 8321
-```
-
----
-
-## šŸ“– Option 1: Interactive CLI Version (`langchain-llama-stack.py`)
-
-### Features
-- āœ… Interactive command-line interface
-- āœ… Document loading from URLs and PDFs
-- āœ… AI-powered summarization and fact extraction
-- āœ… Question answering based on document content
-- āœ… Session-based document storage
-
-### How to Run
-```bash
-# Activate environment
-source llama-env-py312/bin/activate
-
-# Run the interactive CLI
-cd docs/notebooks/langChain
-python langchain-llama-stack.py
-```
-
-### Usage Commands
-Once running, you can use these interactive commands:
-
-```
-šŸŽÆ Interactive Document Processing Demo
-Commands:
-  load <source>   - Process a document (URL or PDF path)
-  ask <question>  - Ask about the document
-  summary         - Show document summary
-  facts           - Show extracted facts
-  help            - Show commands
-  quit            - Exit demo
-```
-
-### Example Session
-```
-> load https://en.wikipedia.org/wiki/Artificial_intelligence
-šŸ“„ Loading document from: https://en.wikipedia.org/wiki/Artificial_intelligence
-āœ… Loaded 45,832 characters
-šŸ“ Generating summary...
-šŸ” Extracting key facts...
-āœ… Processing complete!
-
-> summary
-šŸ“ Summary:
-Artificial intelligence (AI) is the simulation of human intelligence...
-
-> ask What are the main types of AI?
-šŸ’¬ Q: What are the main types of AI?
-šŸ“ A: Based on the document, the main types of AI include...
-
-> facts
-šŸ” Key Facts:
-- AI was founded as an academic discipline in 1956
-- Machine learning is a subset of AI...
-
-> quit
-šŸ‘‹ Thanks for exploring LangChain chains!
-```
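-
-### Verifying the Server (Optional)
-Before loading documents, you can sanity-check that the server and a model are reachable. This is a minimal sketch using the same `llama_stack_client` calls the script itself makes, assuming the default URL above:
-
-```python
-from llama_stack_client import LlamaStackClient
-
-# Connect to the local Llama Stack server (default URL from this README)
-client = LlamaStackClient(base_url="http://localhost:8321/")
-
-# List the models the server exposes; the demo's model_id must be one of these
-for model in client.models.list():
-    print(model.identifier)
-```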
-
----
-
-## šŸ“– Option 2: REST API Version
-
-With the REST API version of the pipeline running on `http://localhost:8000/`, you can interact with it from the command line or from Python:
-
-#### Using curl:
-```bash
-# Check service status
-curl http://localhost:8000/
-
-# Process a document
-curl -X POST http://localhost:8000/process \
-  -H 'Content-Type: application/json' \
-  -d '{"source": "https://en.wikipedia.org/wiki/Machine_learning"}'
-
-# Ask a question
-curl -X POST http://localhost:8000/ask \
-  -H 'Content-Type: application/json' \
-  -d '{"question": "What is machine learning?"}'
-
-# Get summary
-curl http://localhost:8000/summary
-
-# Get facts
-curl http://localhost:8000/facts
-
-# List all processed documents
-curl http://localhost:8000/docs
-```
-
-#### Using Python requests:
-```python
-import requests
-
-# Process a document
-response = requests.post(
-    "http://localhost:8000/process",
-    json={"source": "https://en.wikipedia.org/wiki/Deep_learning"}
-)
-print(response.json())
-
-# Ask a question
-response = requests.post(
-    "http://localhost:8000/ask",
-    json={"question": "What are neural networks?"}
-)
-print(response.json())
-
-# Get facts
-response = requests.get("http://localhost:8000/facts")
-print(response.json())
-```
-
----
-
-## šŸ”§ Configuration
-
-### Model Configuration
-Both versions use these settings by default:
-- **Model ID**: `ollama/llama3:70b-instruct` (a commented-out `llama3.2:3b` alternative is included in the code)
-- **Llama Stack URL**: `http://localhost:8321/`
-
-To change the model, edit the `model_id` parameter in the respective files.
-
-### Supported Document Types
-- āœ… **URLs**: Any web page (extracted using readability)
-- āœ… **PDF files**: Local or remote PDF documents
-- āŒ Plain text files (can be added if needed)
-
----
-
-## šŸ› ļø Troubleshooting
-
-### Common Issues
-
-#### 1. Connection Refused to Llama Stack
-**Error**: `Connection refused to http://localhost:8321/`
-**Solution**:
-- Ensure the Llama Stack server is running
-- Check that port 8321 is correct
-- Verify network connectivity
-
-#### 2. Model Not Found
-**Error**: `Model not found: llama3.2:3b`
-**Solution**:
-- Check available models: `curl http://localhost:8321/models/list` (the CLI also prints available models at startup)
-- Update `model_id` in the code to match an available model
-
-#### 3. Missing Dependencies
-**Error**: `ModuleNotFoundError: No module named '...'`
-**Solution**:
-- Activate the virtual environment and re-run the `pip install` command from the Prerequisites section
-
-### Debug Mode
-To enable verbose logging, add this to the beginning of either file:
-```python
-import logging
-logging.basicConfig(level=logging.DEBUG)
-```
-
----
-
-## šŸ“Š Performance Notes
-
-### CLI Version
-- **Pros**: Simple to use, interactive, good for testing
-- **Cons**: Single-threaded, session-based only
-- **Best for**: Development, testing, manual document analysis
-
----
-
-## šŸ›‘ Stopping Services
-
-### CLI Version
-- Press `Ctrl+C` or type `quit` in the interactive prompt
-
----
-
-## šŸ“ Examples
-
-### CLI Workflow
-1. Start: `python langchain-llama-stack.py`
-2. Load document: `load https://arxiv.org/pdf/2103.00020.pdf`
-3. Get summary: `summary`
-4. Ask questions: `ask What are the main contributions?`
-5. Exit: `quit`
-
----
-
-## šŸ¤ Contributing
-
-To extend functionality:
-1. Add new prompt templates for different analysis types
-2. Support additional document formats
-3. Add caching for processed documents
-4. Implement user authentication for the API version
-
----
-
-## šŸ“œ License
-
-This project is for educational and research purposes.
diff --git a/docs/notebooks/langChain/langchain-llama-stack.py b/docs/notebooks/langChain/langchain-llama-stack.py
deleted file mode 100644
index 98aaa8d6c..000000000
--- a/docs/notebooks/langChain/langchain-llama-stack.py
+++ /dev/null
@@ -1,288 +0,0 @@
-import os
-import re
-import html
-import tempfile
-from typing import Optional, List, Any
-
-import requests
-from bs4 import BeautifulSoup
-from readability import Document as ReadabilityDocument
-from markdownify import markdownify
-from langchain_community.document_loaders import PyPDFLoader, TextLoader
-from langchain_core.language_models.llms import LLM
-from langchain_core.prompts import PromptTemplate
-from langchain.chains import LLMChain
-from llama_stack_client import LlamaStackClient
-from rich.pretty import pprint
-
-# Global variables
-client = None
-llm = None
-summary_chain = None
-facts_chain = None
-qa_chain = None
-processed_docs = {}
-
-# Prompt Templates (defined globally)
-summary_template = PromptTemplate(
-    input_variables=["document"],
-    template="""Create a concise summary of this document in 5-10 sentences:
-
-{document}
-
-SUMMARY:"""
-)
-
-facts_template = PromptTemplate(
-    input_variables=["document"],
-    template="""Extract the most important facts from this document. List them as bullet points:
-
-{document}
-
-KEY FACTS:
--"""
-)
-
-qa_template = PromptTemplate(
-    input_variables=["document", "question"],
-    template="""Based on the following document, answer the question. If the answer isn't in the document, say so.
-
-DOCUMENT:
-{document}
-
-QUESTION: {question}
-
-ANSWER:"""
-)
-
-
-class LlamaStackLLM(LLM):
-    """Simple LangChain wrapper for Llama Stack"""
-
-    # Pydantic model fields
-    client: Any = None
-    model_id: str = "llama3:70b-instruct"
-
-    def __init__(self, client, model_id: str = "llama3:70b-instruct"):
-        # Initialize with field values
-        super().__init__(client=client, model_id=model_id)
-
-    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs) -> str:
-        """Make an inference call to Llama Stack"""
-        response = self.client.inference.chat_completion(
-            model_id=self.model_id,
-            messages=[{"role": "user", "content": prompt}]
-        )
-        return response.completion_message.content
-
-    @property
-    def _llm_type(self) -> str:
-        return "llama_stack"
-
-
-def load_document(source: str) -> str:
-    is_url = source.startswith(('http://', 'https://'))
-    is_pdf = source.lower().endswith('.pdf')
-    if is_pdf:
-        return load_pdf(source, is_url=is_url)
-    elif is_url:
-        return load_from_url(source)
-    else:
-        raise ValueError("Unsupported format. Use URLs or PDF files.")
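-
-# Illustrative usage (matches the README's example session):
-#   text = load_document("https://en.wikipedia.org/wiki/Artificial_intelligence")
-#   text = load_document("/path/to/paper.pdf")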
-
-
-def load_pdf(source: str, is_url: bool = False) -> str:
-    if is_url:
-        # Download the remote PDF to a temporary file for PyPDFLoader
-        response = requests.get(source)
-        response.raise_for_status()
-        with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_file:
-            temp_file.write(response.content)
-            file_path = temp_file.name
-    else:
-        file_path = source
-    try:
-        loader = PyPDFLoader(file_path)
-        docs = loader.load()
-        return "\n\n".join([doc.page_content for doc in docs])
-    finally:
-        if is_url:
-            os.remove(file_path)
-
-
-def load_from_url(url: str) -> str:
-    headers = {'User-Agent': 'Mozilla/5.0 (compatible; DocumentLoader/1.0)'}
-    response = requests.get(url, headers=headers, timeout=15)
-    response.raise_for_status()
-    # Extract the main article content, then convert it to Markdown text
-    doc = ReadabilityDocument(response.text)
-    html_main = doc.summary(html_partial=True)
-    soup = BeautifulSoup(html_main, "html.parser")
-    for tag in soup(["script", "style", "noscript", "header", "footer", "nav", "aside"]):
-        tag.decompose()
-    md_text = markdownify(str(soup), heading_style="ATX")
-    md_text = html.unescape(md_text)
-    md_text = re.sub(r"\n{3,}", "\n\n", md_text).strip()
-    return md_text
-
-
-def process_document(source: str):
-    global summary_chain, facts_chain, processed_docs
-
-    print(f"šŸ“„ Loading document from: {source}")
-    document = load_document(source)
-    print(f"āœ… Loaded {len(document):,} characters")
-    print("\nšŸ“ Generating summary...")
-    summary = summary_chain.invoke({"document": document})["text"]
-    print("Summary generated")
-    print("šŸ” Extracting key facts...")
-    facts = facts_chain.invoke({"document": document})["text"]
-    processed_docs[source] = {
-        "document": document,
-        "summary": summary,
-        "facts": facts
-    }
-    print("\nāœ… Processing complete!")
-    print(f"šŸ“Š Document: {len(document):,} chars")
-    print(f"šŸ“ Summary: {summary[:100]}...")
-    print(f"šŸ” Facts: {facts[:1000]}...")
-    return processed_docs[source]
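-
-# Note: process_document() sends the full document text to the model in a single
-# prompt. Very long documents may exceed the model's context window; if requests
-# fail or answers degrade, consider truncating or chunking the text first.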
-
-
-def ask_question(question: str, source: str = None):
-    """Answer questions about processed documents"""
-    global qa_chain, processed_docs
-
-    if not processed_docs:
-        return "No documents processed yet. Use process_document() first."
-    if source and source in processed_docs:
-        doc_data = processed_docs[source]
-    else:
-        # Fall back to the most recently processed document
-        doc_data = list(processed_docs.values())[-1]
-    answer = qa_chain.invoke({
-        "document": doc_data["document"],
-        "question": question
-    })["text"]
-    return answer
-
-
-def interactive_demo():
-    print("\nšŸŽÆ Interactive Document Processing Demo")
-    print("Commands:")
-    print("  load <source>   - Process a document (URL or PDF path)")
-    print("  ask <question>  - Ask about the document")
-    print("  summary         - Show document summary")
-    print("  facts           - Show extracted facts")
-    print("  help            - Show commands")
-    print("  quit            - Exit demo")
-
-    while True:
-        try:
-            command = input("\n> ").strip()
-            if command.lower() in ['quit', 'exit']:
-                print("šŸ‘‹ Thanks for exploring LangChain chains!")
-                break
-            elif command.lower() == 'help':
-                print("\nCommands:")
-                print("  load <source>   - Process a document (URL or PDF path)")
-                print("  ask <question>  - Ask about the document")
-                print("  summary         - Show document summary")
-                print("  facts           - Show extracted facts")
-            elif command.startswith('load '):
-                source = command[5:].strip()
-                if source:
-                    try:
-                        process_document(source)
-                    except Exception as e:
-                        print(f"āŒ Error processing document: {e}")
-                else:
-                    print("ā“ Please provide a URL or file path")
-            elif command.startswith('ask '):
-                question = command[4:].strip()
-                if question:
-                    try:
-                        answer = ask_question(question)
-                        print(f"\nšŸ’¬ Q: {question}")
-                        print(f"šŸ“ A: {answer}")
-                    except Exception as e:
-                        print(f"āŒ Error: {e}")
-                else:
-                    print("ā“ Please provide a question")
-            elif command.lower() == 'summary':
-                if processed_docs:
-                    latest_doc = list(processed_docs.values())[-1]
-                    print(f"\nšŸ“ Summary:\n{latest_doc['summary']}")
-                else:
-                    print("ā“ No documents processed yet")
-            elif command.lower() == 'facts':
-                if processed_docs:
-                    latest_doc = list(processed_docs.values())[-1]
-                    print(f"\nšŸ” Key Facts:\n{latest_doc['facts']}")
-                else:
-                    print("ā“ No documents processed yet")
-            else:
-                print("ā“ Unknown command. Type 'help' for options")
-        except (EOFError, KeyboardInterrupt):
-            print("\nšŸ‘‹ Goodbye!")
-            break
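-
-# Note: recent LangChain releases deprecate LLMChain in favor of composing
-# runnables (e.g. `summary_template | llm`); LLMChain is kept in main() below
-# for simplicity.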
-
-
-def main():
-    global client, llm, summary_chain, facts_chain, qa_chain, processed_docs
-
-    print("šŸš€ Starting LangChain + Llama Stack Document Processing Demo")
-
-    client = LlamaStackClient(
-        base_url="http://localhost:8321/",
-    )
-
-    # Initialize the LangChain-compatible LLM
-    llm = LlamaStackLLM(client)
-
-    # Test the wrapper
-    test_response = llm.invoke("Can you help me with the document processing?")
-    print("āœ… LangChain wrapper working!")
-    print(f"Response: {test_response[:100]}...")
-
-    print("Available models:")
-    for m in client.models.list():
-        print(f"- {m.identifier}")
-
-    print("----")
-    print("Available shields (safety models):")
-    for s in client.shields.list():
-        print(s.identifier)
-    print("----")
-
-    # model_id = "llama3.2:3b"
-    model_id = "ollama/llama3:70b-instruct"
-
-    response = client.inference.chat_completion(
-        model_id=model_id,
-        messages=[
-            {"role": "system", "content": "You are a friendly assistant."},
-            {"role": "user", "content": "Write a two-sentence poem about a llama."},
-        ],
-    )
-
-    print(response.completion_message.content)
-
-    # Create chains by combining our LLM with prompt templates
-    summary_chain = LLMChain(llm=llm, prompt=summary_template)
-    facts_chain = LLMChain(llm=llm, prompt=facts_template)
-    qa_chain = LLMChain(llm=llm, prompt=qa_template)
-
-    # Initialize storage for processed documents
-    processed_docs = {}
-
-    print("āœ… Created 3 prompt templates:")
-    print("  • Summary: Condenses documents into key points")
-    print("  • Facts: Extracts important information as bullets")
-    print("  • Q&A: Answers questions based on document content")
-
-    # Test template formatting
-    test_prompt = summary_template.format(document="This is a sample document about AI...")
-    print(f"\nšŸ“ Example prompt: {len(test_prompt)} characters")
-
-    # Start the interactive demo
-    interactive_demo()
-
-
-if __name__ == "__main__":
-    main()
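-
-# To try the demo: start a Llama Stack server on http://localhost:8321/ (see the
-# README in this directory), then run: python langchain-llama-stack.py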