mirror of https://github.com/meta-llama/llama-stack.git
synced 2025-10-03 19:57:35 +00:00

Merge 0123079a91 into ce77c27ff8
This commit is contained in: 0e871aaa55

1 changed file with 365 additions and 0 deletions:
docs/notebooks/OpenAIClient_with_LLAMAStackExtensions.ipynb (new file, +365)

@@ -0,0 +1,365 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "notebook-title",
   "metadata": {},
   "source": [
    "# OpenAI Client with LLAMA Stack Extensions\n",
    "\n",
    "This notebook demonstrates how to use the **OpenAI Python client** with **LLAMA Stack server extensions**, allowing you to access LLAMA Stack-specific APIs through familiar OpenAI client patterns.\n",
    "\n",
    "## What You'll Learn\n",
    "\n",
    "1. 🔌 **Connect to LLAMA Stack** using the OpenAI client with a custom base URL\n",
    "2. 🗄️ **Create and manage vector databases** using LLAMA Stack's vector-db API\n",
    "3. 📄 **Insert and query vector data** for semantic search capabilities\n",
    "4. 🌐 **Use low-level HTTP requests** to access LLAMA Stack-specific endpoints\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "- ✅ LLAMA Stack server running on `localhost:8321`\n",
    "- ✅ Python packages: `pip install openai llama-stack-client`"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "setup-section",
   "metadata": {},
   "source": [
    "## 🔧 Setup: Connect OpenAI Client to LLAMA Stack\n",
    "\n",
    "We'll use the OpenAI client but point it to our local LLAMA Stack server instead of OpenAI's servers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "a42d6950-65c2-445e-96f9-6301d36d3a0f",
   "metadata": {},
   "outputs": [],
   "source": [
    "from openai import OpenAI\n",
    "\n",
    "# api_key is required by the client but the local server does not use it\n",
    "client = OpenAI(base_url=\"http://localhost:8321/v1\", api_key=\"dummy-key\")"
   ]
  },
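  {
   "cell_type": "markdown",
   "id": "sanity-check-section",
   "metadata": {},
   "source": [
    "As a quick sanity check, the standard OpenAI `models.list()` call should work against the same base URL, since LLAMA Stack serves the OpenAI-compatible `/models` endpoint. (A minimal sketch, not run here; it assumes the server is up and has at least one model registered.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "sanity-check-cell",
   "metadata": {},
   "outputs": [],
   "source": [
    "# List model IDs known to the server to confirm the connection works\n",
    "print([m.id for m in client.models.list()])"
   ]
  },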
  {
   "cell_type": "markdown",
   "id": "vector-db-section",
   "metadata": {},
   "source": [
    "## 🗄️ Create a Vector Database\n",
    "\n",
    "The code below creates a vector database using LLAMA Stack's vector-db API. We're using:\n",
    "- **FAISS** as the backend provider\n",
    "- **sentence-transformers/all-MiniLM-L6-v2** for embeddings (384 dimensions)\n",
    "- A unique identifier `acme_docs_v2` for this database"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "e45da4ef-03d9-48f7-a48f-8b5c09f2a46f",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'owner': None, 'source': 'via_register_api', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}\n"
     ]
    }
   ],
   "source": [
    "# Use a low-level request for this LLAMA Stack-specific API\n",
    "resp = client._client.request(\n",
    "    \"POST\",\n",
    "    \"/vector-dbs\",\n",
    "    json={\n",
    "        \"vector_db_id\": \"acme_docs\",  # Use a new unique name\n",
    "        \"provider_id\": \"faiss\",\n",
    "        \"provider_vector_db_id\": \"acme_docs_v2\",\n",
    "        \"embedding_model\": \"sentence-transformers/all-MiniLM-L6-v2\",\n",
    "        \"embedding_dimension\": 384,\n",
    "    },\n",
    ")\n",
    "\n",
    "print(resp.json())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "list-dbs-section",
   "metadata": {},
   "source": [
    "## 📋 List All Vector Databases\n",
    "\n",
    "This lists all vector databases registered in the LLAMA Stack server, allowing us to verify our database was created successfully."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "94e3ca3c-3a8a-4e95-91a6-e87aa4221629",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: GET http://localhost:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'data': [{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}]}\n"
     ]
    }
   ],
   "source": [
    "resp = client._client.request(\n",
    "    \"GET\",\n",
    "    \"/vector-dbs\"\n",
    ")\n",
    "\n",
    "print(resp.json())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "get-db-section",
   "metadata": {},
   "source": [
    "## 🔍 Retrieve Specific Database Info\n",
    "\n",
    "Get detailed information about our specific vector database, including its configuration and metadata."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "e67a8ba3-7785-4087-b0c8-442596fbdf92",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: GET http://localhost:8321/v1/vector-dbs/acme_docs_v2 \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'identifier': 'acme_docs_v2', 'provider_resource_id': 'acme_docs_v2', 'provider_id': 'faiss', 'type': 'vector_db', 'owner': None, 'source': 'via_register_api', 'embedding_model': 'sentence-transformers/all-MiniLM-L6-v2', 'embedding_dimension': 384, 'vector_db_name': 'acme_docs'}\n"
     ]
    }
   ],
   "source": [
    "resp = client._client.request(\n",
    "    \"GET\",\n",
    "    \"/vector-dbs/acme_docs_v2\"\n",
    ")\n",
    "\n",
    "print(resp.json())"
   ]
  },
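  {
   "cell_type": "markdown",
   "id": "typed-client-aside",
   "metadata": {},
   "source": [
    "As an aside, the same lookup is available through the typed `llama-stack-client` API instead of a raw HTTP request. (A sketch, not run here; it assumes `llama-stack-client` is installed, as listed in the prerequisites, and that its `vector_dbs.retrieve` method matches your installed version.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "typed-client-cell",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_stack_client import LlamaStackClient\n",
    "\n",
    "# Typed client pointed at the same local server (no /v1 suffix needed)\n",
    "ls_client = LlamaStackClient(base_url=\"http://localhost:8321\")\n",
    "print(ls_client.vector_dbs.retrieve(\"acme_docs_v2\"))"
   ]
  },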
  {
   "cell_type": "markdown",
   "id": "prepare-data-section",
   "metadata": {},
   "source": [
    "## 📄 Prepare Documents for Vector Storage\n",
    "\n",
    "We create sample company policy documents and convert them into **Chunk** objects that LLAMA Stack can process. Each chunk contains:\n",
    "- **content**: The actual text content\n",
    "- **metadata**: Searchable metadata with a `document_id`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "id": "3f08461a-7f9b-439e-bd38-72b68ee9a430",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_stack_client.types.vector_io_insert_params import Chunk\n",
    "\n",
    "docs = [\n",
    "    (\"Acme ships globally in 3-5 business days.\", {\"title\": \"Shipping Policy\"}),\n",
    "    (\"Returns are accepted within 30 days of purchase.\", {\"title\": \"Returns Policy\"}),\n",
    "    (\"Support is available 24/7 via chat and email.\", {\"title\": \"Support\"}),\n",
    "]\n",
    "\n",
    "# Convert to Chunk objects\n",
    "chunks = []\n",
    "for content, metadata in docs:\n",
    "    # Transform metadata to the required format, using the title as document_id\n",
    "    metadata = {\"document_id\": metadata[\"title\"]}\n",
    "    chunk = Chunk(\n",
    "        content=content,  # Required[InterleavedContent]\n",
    "        metadata=metadata,  # Required[Dict]\n",
    "    )\n",
    "    chunks.append(chunk)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "insert-data-section",
   "metadata": {},
   "source": [
    "## 📤 Insert Documents into Vector Database\n",
    "\n",
    "Insert our prepared chunks into the vector database. LLAMA Stack will automatically:\n",
    "- Generate embeddings using the specified model\n",
    "- Store the vectors in the FAISS index\n",
    "- Set a TTL (time-to-live) of 1 hour for the chunks"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "id": "0fb7f7a7-6b8e-4af4-af93-ecc4fb3e1696",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-io/insert \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "200\n",
      "None\n"
     ]
    }
   ],
   "source": [
    "resp = client._client.request(\n",
    "    \"POST\",\n",
    "    \"/vector-io/insert\",\n",
    "    json={\n",
    "        \"vector_db_id\": \"acme_docs_v2\",\n",
    "        \"chunks\": chunks,\n",
    "        \"ttl_seconds\": 3600,  # optional\n",
    "    }\n",
    ")\n",
    "\n",
    "print(resp.status_code)\n",
    "print(resp.json())  # may be None if the API returns an empty body"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "search-section",
   "metadata": {},
   "source": [
    "## 🔍 Semantic Search Query\n",
    "\n",
    "Perform a **semantic search** on our documents. The query \"How long does Acme take to ship orders?\" will be converted to an embedding and matched against stored document embeddings to find the most relevant content.\n",
    "\n",
    "The results show the most relevant chunks ranked by semantic similarity, with metadata and content for each match."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "id": "faa7e0ab-aef9-496f-a2bc-90b4eb8bc860",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "INFO:httpx:HTTP Request: POST http://localhost:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Chunk 1:\n",
      "  Metadata: {'document_id': 'Shipping Policy'}\n",
      "  Content : Acme ships globally in 3-5 business days.\n",
      "--------------------------------------------------\n",
      "Chunk 2:\n",
      "  Metadata: {'document_id': 'Shipping Policy'}\n",
      "  Content : Acme ships globally in 3-5 business days.\n",
      "--------------------------------------------------\n",
      "Chunk 3:\n",
      "  Metadata: {'document_id': 'Returns Policy'}\n",
      "  Content : Returns are accepted within 30 days of purchase.\n",
      "--------------------------------------------------\n"
     ]
    }
   ],
   "source": [
    "query = \"How long does Acme take to ship orders?\"\n",
    "\n",
    "resp = client._client.request(\n",
    "    \"POST\",\n",
    "    \"/vector-io/query\",  # endpoint for vector queries\n",
    "    json={\n",
    "        \"vector_db_id\": \"acme_docs_v2\",\n",
    "        \"query\": query,\n",
    "        \"top_k\": 5  # optional, number of results to return\n",
    "    }\n",
    ")\n",
    "\n",
    "# Convert the response body to a Python dictionary\n",
    "data = resp.json()\n",
    "\n",
    "# Loop through the returned chunks\n",
    "for i, chunk in enumerate(data.get(\"chunks\", []), start=1):\n",
    "    print(f\"Chunk {i}:\")\n",
    "    print(\"  Metadata:\", chunk.get(\"metadata\"))\n",
    "    print(\"  Content :\", chunk.get(\"content\"))\n",
    "    print(\"-\" * 50)"
   ]
  },
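  {
   "cell_type": "markdown",
   "id": "cleanup-section",
   "metadata": {},
   "source": [
    "## 🧹 Clean Up (Optional)\n",
    "\n",
    "To remove the vector database when you're done, send a DELETE request to the same vector-dbs endpoint. (A sketch, not run here; it assumes the server supports unregistering via `DELETE /vector-dbs/{vector_db_id}`, mirroring the register call above.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cleanup-cell",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Unregister the vector database; a 200 status indicates success\n",
    "resp = client._client.request(\"DELETE\", \"/vector-dbs/acme_docs_v2\")\n",
    "print(resp.status_code)"
   ]
  },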
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "aa803d10-8f3f-40e0-b621-51c847778e05",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}