From 26f8c86dd899f5f49cf0d9fef74780f191047fd2 Mon Sep 17 00:00:00 2001 From: Swapna Lekkala Date: Thu, 2 Oct 2025 11:38:10 -0700 Subject: [PATCH 1/2] Add OpenAI client example for LLAMA Stack extensions --- ...enAIClient_with_LLAMAStackExtensions.ipynb | 270 ++++++++++++++++++ 1 file changed, 270 insertions(+) create mode 100644 docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb diff --git a/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb b/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb new file mode 100644 index 000000000..d6841e488 --- /dev/null +++ b/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb @@ -0,0 +1,270 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "notebook-title", + "metadata": {}, + "source": [ + "# OpenAI Client with LLAMA Stack Extensions\n", + "\n", + "This notebook demonstrates how to use the **OpenAI Python client** with **LLAMA Stack server extensions**, allowing you to access LLAMA Stack-specific APIs through familiar OpenAI client patterns.\n", + "\n", + "## What You'll Learn\n", + "\n", + "1. 🔌 **Connect to LLAMA Stack** using the OpenAI client with a custom base URL\n", + "2. 🗄️ **Create and manage vector databases** using LLAMA Stack's vector-db API\n", + "3. 📄 **Insert and query vector data** for semantic search capabilities\n", + "4. 🌐 **Use low-level HTTP requests** to access LLAMA Stack-specific endpoints\n", + "\n", + "## Prerequisites\n", + "\n", + "- ✅ LLAMA Stack server running on `localhost:8321`\n", + "- ✅ Python packages: `pip install openai llama-stack-client`" + ] + }, + { + "cell_type": "markdown", + "id": "setup-section", + "metadata": {}, + "source": [ + "## 🔧 Setup: Connect OpenAI Client to LLAMA Stack\n", + "\n", + "We'll use the OpenAI client but point it to our local LLAMA Stack server instead of OpenAI's servers."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a42d6950-65c2-445e-96f9-6301d36d3a0f", + "metadata": {}, + "outputs": [], + "source": [ + "from openai import OpenAI\n", + "\n", + "client = OpenAI(base_url=\"http://localhost:8321/v1\", api_key=\"dummy-key\")" + ] + }, + { + "cell_type": "markdown", + "id": "vector-db-section", + "metadata": {}, + "source": [ + "## 🗄️ Create a Vector Database\n", + "\n", + "The code below creates a vector database using LLAMA Stack's vector-db API. We're using:\n", + "- **FAISS** as the backend provider\n", + "- **sentence-transformers/all-MiniLM-L6-v2** for embeddings (384 dimensions)\n", + "- A unique identifier `acme_docs_v2` (set via `provider_vector_db_id`) for this database" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e45da4ef-03d9-48f7-a48f-8b5c09f2a46f", + "metadata": {}, + "outputs": [], + "source": [ + "# Using a low-level request for llama-stack specific API\n", + "resp = client._client.request(\n", + " \"POST\",\n", + " \"/vector-dbs\",\n", + " json={\n", + " \"vector_db_id\": \"acme_docs\", # Use a new unique name\n", + " \"provider_id\": \"faiss\",\n", + " \"provider_vector_db_id\": \"acme_docs_v2\",\n", + " \"embedding_model\": \"sentence-transformers/all-MiniLM-L6-v2\",\n", + " \"embedding_dimension\": 384,\n", + " },\n", + ")\n", + "\n", + "print(resp.json())" + ] + }, + { + "cell_type": "markdown", + "id": "list-dbs-section", + "metadata": {}, + "source": [ + "## 📋 List All Vector Databases\n", + "\n", + "This lists all vector databases registered in the LLAMA Stack server, allowing us to verify our database was created successfully."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "94e3ca3c-3a8a-4e95-91a6-e87aa4221629", + "metadata": {}, + "outputs": [], + "source": [ + "resp = client._client.request(\n", + " \"GET\",\n", + " \"/vector-dbs\"\n", + ")\n", + "\n", + "print(resp.json())" + ] + }, + { + "cell_type": "markdown", + "id": "get-db-section", + "metadata": {}, + "source": [ + "## 🔍 Retrieve Specific Database Info\n", + "\n", + "Get detailed information about our specific vector database, including its configuration and metadata." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "e67a8ba3-7785-4087-b0c8-442596fbdf92", + "metadata": {}, + "outputs": [], + "source": [ + "resp = client._client.request(\n", + " \"GET\",\n", + " \"/vector-dbs/acme_docs_v2\"\n", + ")\n", + "\n", + "print(resp.json())" + ] + }, + { + "cell_type": "markdown", + "id": "prepare-data-section", + "metadata": {}, + "source": [ + "## 📄 Prepare Documents for Vector Storage\n", + "\n", + "We create sample company policy documents and convert them into **Chunk** objects that LLAMA Stack can process. 
Each chunk contains:\n", + "- **content**: The actual text content\n", + "- **metadata**: Searchable metadata with a `document_id`" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3f08461a-7f9b-439e-bd38-72b68ee9a430", + "metadata": {}, + "outputs": [], + "source": [ + "from llama_stack_client.types.vector_io_insert_params import Chunk\n", + "\n", + "docs = [\n", + "    (\"Acme ships globally in 3-5 business days.\", {\"title\": \"Shipping Policy\"}),\n", + "    (\"Returns are accepted within 30 days of purchase.\", {\"title\": \"Returns Policy\"}),\n", + "    (\"Support is available 24/7 via chat and email.\", {\"title\": \"Support\"}),\n", + "]\n", + "\n", + "# Convert to Chunk objects\n", + "chunks = []\n", + "for content, metadata in docs:\n", + "    # Transform metadata to required format with document_id from title\n", + "    metadata = {\"document_id\": metadata[\"title\"]}\n", + "    chunk = Chunk(\n", + "        content=content,  # Required[InterleavedContent]\n", + "        metadata=metadata,  # Required[Dict]\n", + "    )\n", + "    chunks.append(chunk)" + ] + }, + { + "cell_type": "markdown", + "id": "insert-data-section", + "metadata": {}, + "source": [ + "## 📤 Insert Documents into Vector Database\n", + "\n", + "Insert our prepared chunks into the vector database. 
LLAMA Stack will automatically:\n", + "- Generate embeddings using the specified model\n", + "- Store the vectors in the FAISS index\n", + "- Honor the optional `ttl_seconds` in the request (3600 here, so chunks expire after 1 hour)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0fb7f7a7-6b8e-4af4-af93-ecc4fb3e1696", + "metadata": {}, + "outputs": [], + "source": [ + "resp = client._client.request(\n", + "    \"POST\",\n", + "    \"/vector-io/insert\",\n", + "    json={\n", + "        \"vector_db_id\": \"acme_docs_v2\",\n", + "        \"chunks\": chunks,\n", + "        \"ttl_seconds\": 3600,  # optional\n", + "    }\n", + ")\n", + "\n", + "print(resp.status_code)\n", + "print(resp.json())  # the insert API returns no payload, so this prints None" + ] + }, + { + "cell_type": "markdown", + "id": "search-section", + "metadata": {}, + "source": [ + "## 🔍 Semantic Search Query\n", + "\n", + "Perform a **semantic search** on our documents. The query \"How long does Acme take to ship orders?\" will be converted to an embedding and matched against stored document embeddings to find the most relevant content.\n", + "\n", + "The results show the most relevant chunks ranked by semantic similarity, with metadata and content for each match."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "faa7e0ab-aef9-496f-a2bc-90b4eb8bc860", + "metadata": {}, + "outputs": [], + "source": [ + "query = \"How long does Acme take to ship orders?\"\n", + "\n", + "resp = client._client.request(\n", + " \"POST\",\n", + " \"/vector-io/query\", # endpoint for vector queries\n", + " json={\n", + " \"vector_db_id\": \"acme_docs_v2\",\n", + " \"query\": query,\n", + " \"top_k\": 5 # optional, number of results to return\n", + " }\n", + ")\n", + "\n", + "# Convert response to Python dictionary\n", + "data = resp.json()\n", + "\n", + "# Loop through returned chunks\n", + "for i, chunk in enumerate(data.get(\"chunks\", []), start=1):\n", + " print(f\"Chunk {i}:\")\n", + " print(\" Metadata:\", chunk.get(\"metadata\"))\n", + " print(\" Content :\", chunk.get(\"content\"))\n", + " print(\"-\" * 50)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.13.7" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 0123079a91e2f5bbab9f1653eac87cfd2260ca00 Mon Sep 17 00:00:00 2001 From: Swapna Lekkala Date: Thu, 2 Oct 2025 11:39:05 -0700 Subject: [PATCH 2/2] fix --- ...enAIClient_with_LLAMAStackExtensions.ipynb | 119 ++++++++++++++++-- 1 file changed, 107 insertions(+), 12 deletions(-) diff --git a/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb b/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb index d6841e488..62568d5a8 100644 --- a/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb +++ b/docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb @@ -34,7 +34,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 25, "id": 
"a42d6950-65c2-445e-96f9-6301d36d3a0f", "metadata": {}, "outputs": [], @@ -59,10 +59,25 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 26, "id": "e45da4ef-03d9-48f7-a48f-8b5c09f2a46f", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'owner': None, 'source': 'via_register_api', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}\n" + ] + } + ], "source": [ "# Using a low-level request for llama-stack specific API\n", "resp = client._client.request(\n", @@ -92,10 +107,25 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 27, "id": "94e3ca3c-3a8a-4e95-91a6-e87aa4221629", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:httpx:HTTP Request: GET http://localhost:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'data': [{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}]}\n" + ] + } + ], "source": [ "resp = client._client.request(\n", " \"GET\",\n", @@ -117,10 +147,25 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 28, "id": "e67a8ba3-7785-4087-b0c8-442596fbdf92", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:httpx:HTTP Request: GET http://localhost:8321/v1/vector-dbs/acme_docs_v2 \"HTTP/1.1 
200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'owner': None, 'source': 'via_register_api', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}\n" + ] + } + ], "source": [ "resp = client._client.request(\n", " \"GET\",\n", @@ -144,7 +189,7 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 29, "id": "3f08461a-7f9b-439e-bd38-72b68ee9a430", "metadata": {}, "outputs": [], @@ -184,10 +229,26 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 30, "id": "0fb7f7a7-6b8e-4af4-af93-ecc4fb3e1696", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-io/insert \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "200\n", + "None\n" + ] + } + ], "source": [ "resp = client._client.request(\n", " \"POST\",\n", @@ -217,10 +278,36 @@ }, { "cell_type": "code", - "execution_count": null, + "execution_count": 31, "id": "faa7e0ab-aef9-496f-a2bc-90b4eb8bc860", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Chunk 1:\n", + " Metadata: {'document_id': 'Shipping Policy'}\n", + " Content : Acme ships globally in 3-5 business days.\n", + "--------------------------------------------------\n", + "Chunk 2:\n", + " Metadata: {'document_id': 'Shipping Policy'}\n", + " Content : Acme ships globally in 3-5 business days.\n", + "--------------------------------------------------\n", + "Chunk 3:\n", + " Metadata: {'document_id': 
'Returns Policy'}\n", + " Content : Returns are accepted within 30 days of purchase.\n", + "--------------------------------------------------\n" + ] + } + ], "source": [ "query = \"How long does Acme take to ship orders?\"\n", "\n", @@ -244,6 +331,14 @@ " print(\" Content :\", chunk.get(\"content\"))\n", " print(\"-\" * 50)" ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aa803d10-8f3f-40e0-b621-51c847778e05", + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": {
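The chunk-preparation step in the notebook above can also be sketched as a standalone function outside the diff. This is a minimal sketch: `to_chunks` is an illustrative name, and it builds plain dicts rather than the typed `Chunk` from `llama_stack_client` — the assumption that the insert endpoint accepts dicts of the same `content`/`metadata` shape should be verified against your client version.

```python
def to_chunks(docs):
    """Convert (content, metadata) tuples into chunk dicts with a document_id."""
    return [
        {"content": content, "metadata": {"document_id": metadata["title"]}}
        for content, metadata in docs
    ]

# Same sample documents as the notebook
docs = [
    ("Acme ships globally in 3-5 business days.", {"title": "Shipping Policy"}),
    ("Returns are accepted within 30 days of purchase.", {"title": "Returns Policy"}),
    ("Support is available 24/7 via chat and email.", {"title": "Support"}),
]

chunks = to_chunks(docs)
print(chunks[0])
# -> {'content': 'Acme ships globally in 3-5 business days.', 'metadata': {'document_id': 'Shipping Policy'}}
```

Keeping this transformation as a pure function makes it easy to unit-test the metadata mapping before anything touches the server.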