slekkala1 2025-10-03 09:35:34 -07:00 committed by GitHub
commit 0e871aaa55

{
"cells": [
{
"cell_type": "markdown",
"id": "notebook-title",
"metadata": {},
"source": [
"# OpenAI Client with LLAMA Stack Extensions\n",
"\n",
"This notebook demonstrates how to use the **OpenAI Python client** with **LLAMA Stack server extensions**, allowing you to access LLAMA Stack-specific APIs through familiar OpenAI client patterns.\n",
"\n",
"## What You'll Learn\n",
"\n",
"1. 🔌 **Connect to LLAMA Stack** using the OpenAI client with a custom base URL\n",
"2. 🗄️ **Create and manage vector databases** using LLAMA Stack's vector-db API\n",
"3. 📄 **Insert and query vector data** for semantic search capabilities\n",
"4. 🌐 **Use low-level HTTP requests** to access LLAMA Stack-specific endpoints\n",
"\n",
"## Prerequisites\n",
"\n",
"- ✅ LLAMA Stack server running on `localhost:8321`\n",
"- ✅ Python packages: `pip install openai llama-stack-client`"
]
},
{
"cell_type": "markdown",
"id": "setup-section",
"metadata": {},
"source": [
"## 🔧 Setup: Connect OpenAI Client to LLAMA Stack\n",
"\n",
"We'll use the OpenAI client but point it at our local LLAMA Stack server instead of OpenAI's servers. The local server does not validate credentials, so a placeholder API key is sufficient."
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "a42d6950-65c2-445e-96f9-6301d36d3a0f",
"metadata": {},
"outputs": [],
"source": [
"from openai import OpenAI\n",
"\n",
"client = OpenAI(base_url=\"http://localhost:8321/v1\", api_key=\"dummy-key\")"
]
},
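{
"cell_type": "markdown",
"id": "verify-connection",
"metadata": {},
"source": [
"### ✅ Verify the Connection\n",
"\n",
"As a quick sanity check, we can list the models the server exposes through its OpenAI-compatible `/models` endpoint. (This cell is an added illustration; it assumes the server is running and serves at least one model.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "verify-connection-cell",
"metadata": {},
"outputs": [],
"source": [
"# List models served by the LLAMA Stack server via the standard OpenAI client\n",
"for model in client.models.list():\n",
"    print(model.id)"
]
},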
{
"cell_type": "markdown",
"id": "vector-db-section",
"metadata": {},
"source": [
"## 🗄️ Create a Vector Database\n",
"\n",
"The code below creates a vector database using LLAMA Stack's vector-db API. We're using:\n",
"- **FAISS** as the backend provider\n",
"- **sentence-transformers/all-MiniLM-L6-v2** for embeddings (384 dimensions)\n",
"- A unique identifier `acme_docs_v2` for this database"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "e45da4ef-03d9-48f7-a48f-8b5c09f2a46f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'owner': None, 'source': 'via_register_api', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}\n"
]
}
],
"source": [
"# Use a low-level request (via the underlying httpx client) for this LLAMA Stack-specific API\n",
"resp = client._client.request(\n",
" \"POST\",\n",
" \"/vector-dbs\",\n",
" json={\n",
" \"vector_db_id\": \"acme_docs\", # Use a new unique name\n",
" \"provider_id\": \"faiss\",\n",
" \"provider_vector_db_id\": \"acme_docs_v2\",\n",
" \"embedding_model\": \"sentence-transformers/all-MiniLM-L6-v2\",\n",
" \"embedding_dimension\": 384,\n",
" },\n",
")\n",
"\n",
"print(resp.json())"
]
},
{
"cell_type": "markdown",
"id": "list-dbs-section",
"metadata": {},
"source": [
"## 📋 List All Vector Databases\n",
"\n",
"This lists all vector databases registered in the LLAMA Stack server, allowing us to verify our database was created successfully."
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "94e3ca3c-3a8a-4e95-91a6-e87aa4221629",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: GET http://localhost:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'data': [{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}]}\n"
]
}
],
"source": [
"resp = client._client.request(\n",
" \"GET\",\n",
" \"/vector-dbs\"\n",
")\n",
"\n",
"print(resp.json())"
]
},
{
"cell_type": "markdown",
"id": "get-db-section",
"metadata": {},
"source": [
"## 🔍 Retrieve Specific Database Info\n",
"\n",
"Get detailed information about our specific vector database, including its configuration and metadata."
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "e67a8ba3-7785-4087-b0c8-442596fbdf92",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: GET http://localhost:8321/v1/vector-dbs/acme_docs_v2 \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'owner': None, 'source': 'via_register_api', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}\n"
]
}
],
"source": [
"resp = client._client.request(\n",
" \"GET\",\n",
" \"/vector-dbs/acme_docs_v2\"\n",
")\n",
"\n",
"print(resp.json())"
]
},
{
"cell_type": "markdown",
"id": "prepare-data-section",
"metadata": {},
"source": [
"## 📄 Prepare Documents for Vector Storage\n",
"\n",
"We create sample company policy documents and convert them into **Chunk** objects that LLAMA Stack can process. Each chunk contains:\n",
"- **content**: The actual text content\n",
"- **metadata**: Searchable metadata with a `document_id`"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "3f08461a-7f9b-439e-bd38-72b68ee9a430",
"metadata": {},
"outputs": [],
"source": [
"from llama_stack_client.types.vector_io_insert_params import Chunk\n",
"\n",
"docs = [\n",
" (\"Acme ships globally in 3-5 business days.\", {\"title\": \"Shipping Policy\"}),\n",
" (\"Returns are accepted within 30 days of purchase.\", {\"title\": \"Returns Policy\"}),\n",
" (\"Support is available 24/7 via chat and email.\", {\"title\": \"Support\"}),\n",
"]\n",
"\n",
"# Convert to Chunk objects\n",
"chunks = []\n",
"for content, metadata in docs:\n",
"    # Transform metadata to the required format, using the title as document_id\n",
"    metadata = {\"document_id\": metadata[\"title\"]}\n",
" chunk = Chunk(\n",
" content=content, # Required[InterleavedContent]\n",
" metadata=metadata, # Required[Dict]\n",
" )\n",
" chunks.append(chunk)"
]
},
{
"cell_type": "markdown",
"id": "insert-data-section",
"metadata": {},
"source": [
"## 📤 Insert Documents into Vector Database\n",
"\n",
"Insert our prepared chunks into the vector database. LLAMA Stack will automatically:\n",
"- Generate embeddings using the specified model\n",
"- Store the vectors in the FAISS index\n",
"\n",
"We also pass an optional `ttl_seconds` of 3600, giving the chunks a one-hour time-to-live."
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "0fb7f7a7-6b8e-4af4-af93-ecc4fb3e1696",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-io/insert \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"200\n",
"None\n"
]
}
],
"source": [
"resp = client._client.request(\n",
" \"POST\",\n",
" \"/vector-io/insert\",\n",
" json={\n",
" \"vector_db_id\": \"acme_docs_v2\",\n",
" \"chunks\": chunks,\n",
" \"ttl_seconds\": 3600, # optional\n",
" }\n",
")\n",
"\n",
"print(resp.status_code)\n",
"print(resp.json()) # the insert endpoint returns no body, so this prints None"
]
},
{
"cell_type": "markdown",
"id": "search-section",
"metadata": {},
"source": [
"## 🔍 Semantic Search Query\n",
"\n",
"Perform a **semantic search** on our documents. The query \"How long does Acme take to ship orders?\" will be converted to an embedding and matched against stored document embeddings to find the most relevant content.\n",
"\n",
"The results show the most relevant chunks ranked by semantic similarity, with metadata and content for each match."
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "faa7e0ab-aef9-496f-a2bc-90b4eb8bc860",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Chunk 1:\n",
" Metadata: {'document_id': 'Shipping Policy'}\n",
" Content : Acme ships globally in 3-5 business days.\n",
"--------------------------------------------------\n",
"Chunk 2:\n",
" Metadata: {'document_id': 'Shipping Policy'}\n",
" Content : Acme ships globally in 3-5 business days.\n",
"--------------------------------------------------\n",
"Chunk 3:\n",
" Metadata: {'document_id': 'Returns Policy'}\n",
" Content : Returns are accepted within 30 days of purchase.\n",
"--------------------------------------------------\n"
]
}
],
"source": [
"query = \"How long does Acme take to ship orders?\"\n",
"\n",
"resp = client._client.request(\n",
" \"POST\",\n",
" \"/vector-io/query\", # endpoint for vector queries\n",
" json={\n",
" \"vector_db_id\": \"acme_docs_v2\",\n",
" \"query\": query,\n",
" \"top_k\": 5 # optional, number of results to return\n",
" }\n",
")\n",
"\n",
"# Convert response to Python dictionary\n",
"data = resp.json()\n",
"\n",
"# Loop through returned chunks\n",
"for i, chunk in enumerate(data.get(\"chunks\", []), start=1):\n",
" print(f\"Chunk {i}:\")\n",
" print(\" Metadata:\", chunk.get(\"metadata\"))\n",
" print(\" Content :\", chunk.get(\"content\"))\n",
" print(\"-\" * 50)"
]
},
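{
"cell_type": "markdown",
"id": "cleanup-section",
"metadata": {},
"source": [
"## 🧹 Cleanup: Unregister the Vector Database\n",
"\n",
"When you're done experimenting, the database can be removed with a low-level DELETE request. (This cell is an added sketch; it assumes the server exposes `DELETE /vector-dbs/{vector_db_id}` as the counterpart to the register call used earlier.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cleanup-cell",
"metadata": {},
"outputs": [],
"source": [
"# Unregister the vector database once it is no longer needed\n",
"resp = client._client.request(\n",
"    \"DELETE\",\n",
"    \"/vector-dbs/acme_docs_v2\"\n",
")\n",
"\n",
"print(resp.status_code)"
]
},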
{
"cell_type": "code",
"execution_count": null,
"id": "aa803d10-8f3f-40e0-b621-51c847778e05",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}