Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-10-06 12:37:33 +00:00
Use Vector Store and fix other comments

parent 04b0432a7b
commit 7ae7d6a7d0

1 changed file with 113 additions and 294 deletions
@@ -11,20 +11,20 @@
     "\n",
     "### Overview\n",
     "\n",
-    "- **LlamaStack**: Provides the infrastructure for running LLMs and vector databases\n",
+    "- **LlamaStack**: Provides the infrastructure for running LLMs and OpenAI-compatible Vector Stores\n",
     "- **LangChain**: Provides the framework for chaining operations and prompt templates\n",
     "- **Integration**: Uses LlamaStack's OpenAI-compatible API with LangChain\n",
     "\n",
     "### What You'll See\n",
     "\n",
     "1. Setting up LlamaStack server with Together AI provider\n",
-    "2. Creating and managing vector databases\n",
+    "2. Creating and querying Vector Stores\n",
     "3. Building RAG chains with LangChain + LlamaStack\n",
     "4. Querying the chain for relevant information\n",
     "\n",
     "### Prerequisites\n",
     "\n",
-    "- Together AI API key\n",
+    "- Fireworks API key\n",
     "\n",
     "---\n",
     "\n",
@@ -53,68 +53,15 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-     "Requirement already satisfied: fastapi in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.115.14)\n",
-     "Requirement already satisfied: uvicorn in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.29.0)\n",
-     "Requirement already satisfied: langchain>=0.2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.3.27)\n",
-     "Requirement already satisfied: langchain-openai in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.3.30)\n",
-     "Requirement already satisfied: langchain-community in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.3.27)\n",
-     "Requirement already satisfied: langchain-text-splitters in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.3.9)\n",
-     "Requirement already satisfied: faiss-cpu in /Users/swapna942/miniconda3/lib/python3.12/site-packages (1.11.0)\n",
-     "Requirement already satisfied: starlette<0.47.0,>=0.40.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from fastapi) (0.46.2)\n",
-     "Requirement already satisfied: pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from fastapi) (2.11.7)\n",
-     "Requirement already satisfied: typing-extensions>=4.8.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from fastapi) (4.14.1)\n",
-     "Requirement already satisfied: annotated-types>=0.6.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi) (0.7.0)\n",
-     "Requirement already satisfied: pydantic-core==2.33.2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi) (2.33.2)\n",
-     "Requirement already satisfied: typing-inspection>=0.4.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi) (0.4.1)\n",
-     "Requirement already satisfied: anyio<5,>=3.6.2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from starlette<0.47.0,>=0.40.0->fastapi) (4.10.0)\n",
-     "Requirement already satisfied: idna>=2.8 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from anyio<5,>=3.6.2->starlette<0.47.0,>=0.40.0->fastapi) (3.10)\n",
-     "Requirement already satisfied: sniffio>=1.1 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from anyio<5,>=3.6.2->starlette<0.47.0,>=0.40.0->fastapi) (1.3.1)\n",
-     "Requirement already satisfied: click>=7.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from uvicorn) (8.2.1)\n",
-     "Requirement already satisfied: h11>=0.8 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from uvicorn) (0.16.0)\n",
-     "Requirement already satisfied: langchain-core<1.0.0,>=0.3.72 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain>=0.2) (0.3.74)\n",
-     "Requirement already satisfied: langsmith>=0.1.17 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain>=0.2) (0.4.14)\n",
-     "Requirement already satisfied: SQLAlchemy<3,>=1.4 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain>=0.2) (2.0.41)\n",
-     "Requirement already satisfied: requests<3,>=2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain>=0.2) (2.32.4)\n",
-     "Requirement already satisfied: PyYAML>=5.3 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain>=0.2) (6.0.2)\n",
-     "Requirement already satisfied: tenacity!=8.4.0,<10.0.0,>=8.1.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-core<1.0.0,>=0.3.72->langchain>=0.2) (9.1.2)\n",
-     "Requirement already satisfied: jsonpatch<2.0,>=1.33 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-core<1.0.0,>=0.3.72->langchain>=0.2) (1.33)\n",
-     "Requirement already satisfied: packaging>=23.2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-core<1.0.0,>=0.3.72->langchain>=0.2) (24.2)\n",
-     "Requirement already satisfied: jsonpointer>=1.9 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from jsonpatch<2.0,>=1.33->langchain-core<1.0.0,>=0.3.72->langchain>=0.2) (2.1)\n",
-     "Requirement already satisfied: charset_normalizer<4,>=2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from requests<3,>=2->langchain>=0.2) (3.3.2)\n",
-     "Requirement already satisfied: urllib3<3,>=1.21.1 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from requests<3,>=2->langchain>=0.2) (2.5.0)\n",
-     "Requirement already satisfied: certifi>=2017.4.17 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from requests<3,>=2->langchain>=0.2) (2025.8.3)\n",
-     "Requirement already satisfied: openai<2.0.0,>=1.99.9 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-openai) (1.100.2)\n",
-     "Requirement already satisfied: tiktoken<1,>=0.7 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-openai) (0.9.0)\n",
-     "Requirement already satisfied: distro<2,>=1.7.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from openai<2.0.0,>=1.99.9->langchain-openai) (1.9.0)\n",
-     "Requirement already satisfied: httpx<1,>=0.23.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from openai<2.0.0,>=1.99.9->langchain-openai) (0.28.1)\n",
-     "Requirement already satisfied: jiter<1,>=0.4.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from openai<2.0.0,>=1.99.9->langchain-openai) (0.10.0)\n",
-     "Requirement already satisfied: tqdm>4 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from openai<2.0.0,>=1.99.9->langchain-openai) (4.67.1)\n",
-     "Requirement already satisfied: httpcore==1.* in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from httpx<1,>=0.23.0->openai<2.0.0,>=1.99.9->langchain-openai) (1.0.9)\n",
-     "Requirement already satisfied: regex>=2022.1.18 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from tiktoken<1,>=0.7->langchain-openai) (2024.11.6)\n",
-     "Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-community) (3.12.13)\n",
-     "Requirement already satisfied: dataclasses-json<0.7,>=0.5.7 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-community) (0.6.7)\n",
-     "Requirement already satisfied: pydantic-settings<3.0.0,>=2.4.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-community) (2.10.1)\n",
-     "Requirement already satisfied: httpx-sse<1.0.0,>=0.4.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-community) (0.4.1)\n",
-     "Requirement already satisfied: numpy>=1.26.2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langchain-community) (2.3.1)\n",
-     "Requirement already satisfied: aiohappyeyeballs>=2.5.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (2.6.1)\n",
-     "Requirement already satisfied: aiosignal>=1.1.2 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (1.4.0)\n",
-     "Requirement already satisfied: attrs>=17.3.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (25.3.0)\n",
-     "Requirement already satisfied: frozenlist>=1.1.1 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (1.7.0)\n",
-     "Requirement already satisfied: multidict<7.0,>=4.5 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (6.6.3)\n",
-     "Requirement already satisfied: propcache>=0.2.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (0.3.2)\n",
-     "Requirement already satisfied: yarl<2.0,>=1.17.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from aiohttp<4.0.0,>=3.8.3->langchain-community) (1.20.1)\n",
-     "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from dataclasses-json<0.7,>=0.5.7->langchain-community) (3.26.1)\n",
-     "Requirement already satisfied: typing-inspect<1,>=0.4.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from dataclasses-json<0.7,>=0.5.7->langchain-community) (0.9.0)\n",
-     "Requirement already satisfied: python-dotenv>=0.21.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from pydantic-settings<3.0.0,>=2.4.0->langchain-community) (1.1.1)\n",
-     "Requirement already satisfied: mypy-extensions>=0.3.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain-community) (1.1.0)\n",
-     "Requirement already satisfied: orjson>=3.9.14 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langsmith>=0.1.17->langchain>=0.2) (3.10.18)\n",
-     "Requirement already satisfied: requests-toolbelt>=1.0.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langsmith>=0.1.17->langchain>=0.2) (1.0.0)\n",
-     "Requirement already satisfied: zstandard>=0.23.0 in /Users/swapna942/miniconda3/lib/python3.12/site-packages (from langsmith>=0.1.17->langchain>=0.2) (0.23.0)\n"
+     "Requirement already satisfied: uv in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.7.20)\n",
+     "\u001b[2mUsing Python 3.12.11 environment at: /Users/swapna942/miniconda3\u001b[0m\n",
+     "\u001b[2mAudited \u001b[1m7 packages\u001b[0m \u001b[2min 94ms\u001b[0m\u001b[0m\n"
     ]
    }
   ],
   "source": [
-    "!pip install fastapi uvicorn \"langchain>=0.2\" langchain-openai \\\n",
+    "!pip install uv\n",
+    "!uv pip install fastapi uvicorn \"langchain>=0.2\" langchain-openai \\\n",
     "    langchain-community langchain-text-splitters \\\n",
     "    faiss-cpu"
   ]
@@ -130,7 +77,6 @@
     "\n",
     "This section sets up the LlamaStack server with:\n",
     "- **Together AI** as the inference provider\n",
-    "- **FAISS** as the vector database\n",
     "- **Sentence Transformers** for embeddings\n",
     "\n",
     "The server runs on `localhost:8321` and provides OpenAI-compatible endpoints."
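The hunk above keeps the note that the server runs on `localhost:8321`. As a side illustration (not code from this commit; the helper name and polling strategy are assumptions), a notebook can wait for that port to accept connections before issuing requests:

```python
import socket
import time


def wait_for_server(host: str = "localhost", port: int = 8321, timeout: float = 30.0) -> bool:
    """Poll until a TCP connection to host:port succeeds, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Succeeds only once the server's listener is up
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not listening yet; retry shortly
    return False
```

A plain TCP probe is provider-agnostic; once it succeeds, a follow-up request to one of the OpenAI-compatible endpoints can confirm the API is actually serving.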
@@ -143,67 +89,30 @@
    "metadata": {
     "scrolled": true
    },
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "Requirement already satisfied: uv in /Users/swapna942/miniconda3/lib/python3.12/site-packages (0.7.20)\n",
-      "Environment '/Users/swapna942/llama-stack/.venv' already exists, re-using it.\n",
-      "Virtual environment /Users/swapna942/llama-stack/.venv is already active\n",
-      "\u001b[2mUsing Python 3.13.7 environment at: /Users/swapna942/llama-stack/.venv\u001b[0m\n",
-      "\u001b[2mAudited \u001b[1m1 package\u001b[0m \u001b[2min 111ms\u001b[0m\u001b[0m\n",
-      "Installing pip dependencies\n",
-      "\u001b[2mUsing Python 3.13.7 environment at: /Users/swapna942/llama-stack/.venv\u001b[0m\n",
-      "\u001b[2K\u001b[2mResolved \u001b[1m184 packages\u001b[0m \u001b[2min 357ms\u001b[0m\u001b[0m \u001b[0m\n",
-      "\u001b[2mUninstalled \u001b[1m3 packages\u001b[0m \u001b[2min 70ms\u001b[0m\u001b[0m\n",
-      "\u001b[2K\u001b[2mInstalled \u001b[1m3 packages\u001b[0m \u001b[2min 35ms\u001b[0m\u001b[0m \u001b[0m\n",
-      " \u001b[31m-\u001b[39m \u001b[1mprotobuf\u001b[0m\u001b[2m==5.29.5\u001b[0m\n",
-      " \u001b[32m+\u001b[39m \u001b[1mprotobuf\u001b[0m\u001b[2m==5.29.4\u001b[0m\n",
-      " \u001b[31m-\u001b[39m \u001b[1mruamel-yaml\u001b[0m\u001b[2m==0.18.14\u001b[0m\n",
-      " \u001b[32m+\u001b[39m \u001b[1mruamel-yaml\u001b[0m\u001b[2m==0.17.40\u001b[0m\n",
-      " \u001b[31m-\u001b[39m \u001b[1mruff\u001b[0m\u001b[2m==0.12.5\u001b[0m\n",
-      " \u001b[32m+\u001b[39m \u001b[1mruff\u001b[0m\u001b[2m==0.9.10\u001b[0m\n",
-      "Installing special provider module: torch torchtune>=0.5.0 torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu\n",
-      "\u001b[2mUsing Python 3.13.7 environment at: /Users/swapna942/llama-stack/.venv\u001b[0m\n",
-      "\u001b[2mAudited \u001b[1m3 packages\u001b[0m \u001b[2min 69ms\u001b[0m\u001b[0m\n",
-      "Installing special provider module: torch torchvision torchao>=0.12.0 --extra-index-url https://download.pytorch.org/whl/cpu\n",
-      "\u001b[2mUsing Python 3.13.7 environment at: /Users/swapna942/llama-stack/.venv\u001b[0m\n",
-      "\u001b[2mAudited \u001b[1m3 packages\u001b[0m \u001b[2min 12ms\u001b[0m\u001b[0m\n",
-      "Installing special provider module: sentence-transformers --no-deps\n",
-      "\u001b[2mUsing Python 3.13.7 environment at: /Users/swapna942/llama-stack/.venv\u001b[0m\n",
-      "\u001b[2mAudited \u001b[1m1 package\u001b[0m \u001b[2min 15ms\u001b[0m\u001b[0m\n",
-      "\u001b[32mBuild Successful!\u001b[0m\n",
-      "\u001b[34mYou can find the newly-built distribution here: /Users/swapna942/.llama/distributions/starter/starter-run.yaml\u001b[0m\n",
-      "\u001b[32mYou can run the new Llama Stack distro via: \u001b[34mllama stack run /Users/swapna942/.llama/distributions/starter/starter-run.yaml --image-type venv\u001b[0m\u001b[0m\n"
-     ]
-    }
-   ],
+   "outputs": [],
   "source": [
    "import os\n",
    "import subprocess\n",
    "import time\n",
    "\n",
+   "!pip install uv\n",
    "\n",
+   "# Remove UV_SYSTEM_PYTHON to ensure uv creates a proper virtual environment\n",
+   "# instead of trying to use system Python globally, which could cause permission issues\n",
+   "# and package conflicts with the system's Python installation\n",
    "if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
    "    del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
    "\n",
-   "# this command installs all the dependencies needed for the llama stack server with the together inference provider\n",
-   "!uv run --with llama-stack llama stack build --distro starter --image-type venv\n",
    "\n",
    "\n",
    "def run_llama_stack_server_background():\n",
    "    \"\"\"Build and run LlamaStack server in one step using --run flag\"\"\"\n",
    "    log_file = open(\"llama_stack_server.log\", \"w\")\n",
    "    process = subprocess.Popen(\n",
-   "        \"uv run --with llama-stack llama stack run starter --image-type venv\",\n",
+   "        \"uv run --with llama-stack llama stack build --distro starter --image-type venv --run\",\n",
    "        shell=True,\n",
    "        stdout=log_file,\n",
    "        stderr=log_file,\n",
    "        text=True,\n",
    "    )\n",
    "\n",
-   "    print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
+   "    print(f\"Building and starting Llama Stack server with PID: {process.pid}\")\n",
    "    return process\n",
    "\n",
    "\n",
@@ -230,10 +139,9 @@
     "        return False\n",
     "\n",
-    "\n",
     "# use this helper if needed to kill the server\n",
     "def kill_llama_stack_server():\n",
-    "    # Kill any existing llama stack server processes\n",
-    "    os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")"
+    "    # Kill any existing llama stack server processes using pkill command\n",
+    "    os.system(\"pkill -f llama_stack.core.server.server\")"
    ]
   },
   {
@@ -246,8 +154,8 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-     "Starting Llama Stack server with PID: 99381\n",
-     "Waiting for server to start....\n",
+     "Building and starting Llama Stack server with PID: 81294\n",
+     "Waiting for server to start......\n",
      "Server is ready!\n"
     ]
    }
@@ -279,59 +187,13 @@
     "name": "stdout",
     "output_type": "stream",
     "text": [
-     "Requirement already satisfied: llama_stack_client in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (0.2.17)\n",
-     "Requirement already satisfied: anyio<5,>=3.5.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (4.9.0)\n",
-     "Requirement already satisfied: click in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (8.2.1)\n",
-     "Requirement already satisfied: distro<2,>=1.7.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (1.9.0)\n",
-     "Requirement already satisfied: fire in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (0.7.0)\n",
-     "Requirement already satisfied: httpx<1,>=0.23.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (0.28.1)\n",
-     "Requirement already satisfied: pandas in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (2.3.1)\n",
-     "Requirement already satisfied: prompt-toolkit in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (3.0.51)\n",
-     "Requirement already satisfied: pyaml in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (25.7.0)\n",
-     "Requirement already satisfied: pydantic<3,>=1.9.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (2.11.7)\n",
-     "Requirement already satisfied: requests in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (2.32.4)\n",
-     "Requirement already satisfied: rich in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (14.1.0)\n",
-     "Requirement already satisfied: sniffio in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (1.3.1)\n",
-     "Requirement already satisfied: termcolor in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (3.1.0)\n",
-     "Requirement already satisfied: tqdm in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (4.67.1)\n",
-     "Requirement already satisfied: typing-extensions<5,>=4.7 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from llama_stack_client) (4.14.1)\n",
-     "Requirement already satisfied: idna>=2.8 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from anyio<5,>=3.5.0->llama_stack_client) (3.10)\n",
-     "Requirement already satisfied: certifi in /opt/homebrew/opt/certifi/lib/python3.13/site-packages (from httpx<1,>=0.23.0->llama_stack_client) (2025.8.3)\n",
-     "Requirement already satisfied: httpcore==1.* in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from httpx<1,>=0.23.0->llama_stack_client) (1.0.9)\n",
-     "Requirement already satisfied: h11>=0.16 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from httpcore==1.*->httpx<1,>=0.23.0->llama_stack_client) (0.16.0)\n",
-     "Requirement already satisfied: annotated-types>=0.6.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pydantic<3,>=1.9.0->llama_stack_client) (0.7.0)\n",
-     "Requirement already satisfied: pydantic-core==2.33.2 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pydantic<3,>=1.9.0->llama_stack_client) (2.33.2)\n",
-     "Requirement already satisfied: typing-inspection>=0.4.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pydantic<3,>=1.9.0->llama_stack_client) (0.4.1)\n",
-     "Requirement already satisfied: numpy>=1.26.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pandas->llama_stack_client) (2.3.2)\n",
-     "Requirement already satisfied: python-dateutil>=2.8.2 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pandas->llama_stack_client) (2.9.0.post0)\n",
-     "Requirement already satisfied: pytz>=2020.1 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pandas->llama_stack_client) (2025.2)\n",
-     "Requirement already satisfied: tzdata>=2022.7 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pandas->llama_stack_client) (2025.2)\n",
-     "Requirement already satisfied: six>=1.5 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from python-dateutil>=2.8.2->pandas->llama_stack_client) (1.17.0)\n",
-     "Requirement already satisfied: wcwidth in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from prompt-toolkit->llama_stack_client) (0.2.13)\n",
-     "Requirement already satisfied: PyYAML in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from pyaml->llama_stack_client) (6.0.2)\n",
-     "Requirement already satisfied: charset_normalizer<4,>=2 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from requests->llama_stack_client) (3.4.2)\n",
-     "Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from requests->llama_stack_client) (2.5.0)\n",
-     "Requirement already satisfied: markdown-it-py>=2.2.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from rich->llama_stack_client) (4.0.0)\n",
-     "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from rich->llama_stack_client) (2.19.2)\n",
-     "Requirement already satisfied: mdurl~=0.1 in /opt/homebrew/Cellar/jupyterlab/4.4.5/libexec/lib/python3.13/site-packages (from markdown-it-py>=2.2.0->rich->llama_stack_client) (0.1.2)\n"
+     "\u001b[2mUsing Python 3.12.11 environment at: /Users/swapna942/miniconda3\u001b[0m\n",
+     "\u001b[2mAudited \u001b[1m1 package\u001b[0m \u001b[2min 26ms\u001b[0m\u001b[0m\n"
     ]
    },
-   {
-    "data": {
-     "text/plain": [
-      "0"
-     ]
-    },
-    "execution_count": 4,
-    "metadata": {},
-    "output_type": "execute_result"
-   }
   ],
   "source": [
-   "import sys\n",
-   "\n",
-   "# Install directly to the current Python environment\n",
-   "subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"llama_stack_client\"])"
+   "!uv pip install llama_stack_client"
   ]
  },
  {
@@ -343,7 +205,7 @@
     "\n",
     "Create a client connection to the LlamaStack server with API keys for different providers:\n",
     "\n",
-    "- **Together API Key**: For Together AI models\n",
+    "- **Fireworks API Key**: For Fireworks AI models\n",
     "\n"
    ]
   },
@@ -358,7 +220,7 @@
     "\n",
     "client = LlamaStackClient(\n",
     "    base_url=\"http://0.0.0.0:8321\",\n",
-    "    provider_data={\"together_api_key\": \"***\"},\n",
+    "    provider_data={\"fireworks_api_key\": \"***\"},\n",
     ")"
    ]
   },
@@ -508,14 +370,14 @@
    "id": "gojp7at31ht",
    "metadata": {},
    "source": [
-    "### 4. Vector Database Setup\n",
+    "### 4. Vector Store Setup\n",
     "\n",
-    "#### Register a Vector Database\n",
+    "#### Create a Vector Store with File Upload\n",
     "\n",
-    "Create a FAISS vector database for storing document embeddings:\n",
+    "Create a vector store using the OpenAI-compatible vector stores API:\n",
     "\n",
-    "- **Vector DB ID**: Unique identifier for the database\n",
-    "- **Provider**: FAISS (Facebook AI Similarity Search)\n",
+    "- **Vector Store**: OpenAI-compatible vector store for document storage\n",
+    "- **File Upload**: Automatic chunking and embedding of uploaded files\n",
     "- **Embedding Model**: Sentence Transformers model for text embeddings\n",
     "- **Dimensions**: 384-dimensional embeddings"
    ]
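The bullet list kept by the hunk above mentions 384-dimensional Sentence Transformers embeddings. As a toy illustration of why such vectors enable semantic search (not code from the commit; tiny 3-dimensional stand-ins replace real 384-dimensional embeddings), documents can be ranked by cosine similarity to a query vector:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional stand-ins for real 384-dimensional embeddings
query = [0.9, 0.1, 0.0]
docs = {
    "shipping_policy": [1.0, 0.0, 0.0],
    "returns_policy": [0.0, 1.0, 0.0],
}
# The best match is the document whose embedding points closest to the query's direction
best = max(docs, key=lambda name: cosine_similarity(query, docs[name]))
```

The vector store performs this kind of nearest-neighbor ranking internally; the notebook never computes similarities by hand.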
@@ -523,60 +385,37 @@
  {
   "cell_type": "code",
   "execution_count": 7,
-  "id": "a16e2885-ae70-4fa6-9778-2433fa4dbfff",
+  "id": "be2c2899-ea53-4e5f-b6b8-ed425f5d6572",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
-     "INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n",
-     "INFO:httpx:HTTP Request: GET http://0.0.0.0:8321/v1/vector-dbs \"HTTP/1.1 200 OK\"\n"
+     "INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/files \"HTTP/1.1 200 OK\"\n",
+     "INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/files \"HTTP/1.1 200 OK\"\n",
+     "INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/files \"HTTP/1.1 200 OK\"\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
-     "Registered new vector DB: VectorDBRegisterResponse(embedding_dimension=384, embedding_model='sentence-transformers/all-MiniLM-L6-v2', identifier='acme_docs', provider_id='faiss', type='vector_db', provider_resource_id='acme_docs_v2', owner=None, source='via_register_api', vector_db_name=None)\n",
-     "Existing vector DBs: [VectorDBListResponseItem(embedding_dimension=384, embedding_model='sentence-transformers/all-MiniLM-L6-v2', identifier='acme_docs', provider_id='faiss', type='vector_db', provider_resource_id='acme_docs_v2', vector_db_name=None)]\n"
+     "File(id='file-33012cc7c18a4430ad1a11b3397c3e29', bytes=41, created_at=1757441277, expires_at=1788977277, filename='shipping_policy.txt', object='file', purpose='assistants')\n",
+     "File(id='file-e7692c2f044e44859ea02b0630195b4e', bytes=48, created_at=1757441277, expires_at=1788977277, filename='returns_policy.txt', object='file', purpose='assistants')\n",
+     "File(id='file-9ff178cbb72b4e79a7513ce5a95a6fe1', bytes=45, created_at=1757441277, expires_at=1788977277, filename='support.txt', object='file', purpose='assistants')\n"
     ]
    },
+   {
+    "name": "stderr",
+    "output_type": "stream",
+    "text": [
+     "INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/vector_stores \"HTTP/1.1 200 OK\"\n"
+    ]
+   }
   ],
   "source": [
-   "# Register a new clean vector database\n",
-   "vector_db = client.vector_dbs.register(\n",
-   "    vector_db_id=\"acme_docs\",  # Use a new unique name\n",
-   "    provider_id=\"faiss\",\n",
-   "    provider_vector_db_id=\"acme_docs_v2\",\n",
-   "    embedding_model=\"sentence-transformers/all-MiniLM-L6-v2\",\n",
-   "    embedding_dimension=384,\n",
-   ")\n",
-   "print(\"Registered new vector DB:\", vector_db)\n",
-   "\n",
-   "# List all registered vector databases\n",
-   "dbs = client.vector_dbs.list()\n",
-   "print(\"Existing vector DBs:\", dbs)"
-  ]
- },
- {
-  "cell_type": "markdown",
-  "id": "pcgjqzfr3eo",
-  "metadata": {},
-  "source": [
-   "#### Prepare Sample Documents\n",
-   "\n",
-   "Create LLAMA Stack Chunks for FAISS vector store"
-  ]
- },
- {
-  "cell_type": "code",
-  "execution_count": 8,
-  "id": "5a0a6619-c9fb-4938-8ff3-f84304eed91e",
-  "metadata": {},
-  "outputs": [],
-  "source": [
-   "from llama_stack_client.types.vector_io_insert_params import Chunk\n",
+   "from io import BytesIO\n",
    "\n",
    "docs = [\n",
    "    (\"Acme ships globally in 3-5 business days.\", {\"title\": \"Shipping Policy\"}),\n",
@@ -584,49 +423,22 @@
    "    (\"Support is available 24/7 via chat and email.\", {\"title\": \"Support\"}),\n",
    "]\n",
    "\n",
-   "# Convert to Chunk objects\n",
-   "chunks = []\n",
-   "for _, (content, metadata) in enumerate(docs):\n",
-   "    # Transform metadata to required format with document_id from title\n",
-   "    metadata = {\"document_id\": metadata[\"title\"]}\n",
-   "    chunk = Chunk(\n",
-   "        content=content,  # Required[InterleavedContent]\n",
-   "        metadata=metadata,  # Required[Dict]\n",
-   "    )\n",
-   "    chunks.append(chunk)"
-  ]
- },
- {
-  "cell_type": "markdown",
-  "id": "6bg3sm2ko5g",
-  "metadata": {},
-  "source": [
-   "#### Insert Documents into Vector Database\n",
-   "\n",
-   "Store the prepared documents in the FAISS vector database. This process:\n",
-   "1. Generates embeddings for each document\n",
-   "2. Stores embeddings with metadata\n",
-   "3. Enables semantic search capabilities"
-  ]
- },
- {
-  "cell_type": "code",
-  "execution_count": 9,
-  "id": "0e8740d8-b809-44b9-915f-1e0200e3c3f1",
-  "metadata": {},
-  "outputs": [
-   {
-    "name": "stderr",
-    "output_type": "stream",
-    "text": [
-     "INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/insert \"HTTP/1.1 200 OK\"\n"
-    ]
-   }
-  ],
-  "source": [
-   "# Insert chunks into FAISS vector store\n",
-   "\n",
-   "response = client.vector_io.insert(vector_db_id=\"acme_docs\", chunks=chunks)"
+   "file_ids = []\n",
+   "for content, metadata in docs:\n",
+   "    with BytesIO(content.encode()) as file_buffer:\n",
+   "        file_buffer.name = f\"{metadata['title'].replace(' ', '_').lower()}.txt\"\n",
+   "        create_file_response = client.files.create(file=file_buffer, purpose=\"assistants\")\n",
+   "        print(create_file_response)\n",
+   "        file_ids.append(create_file_response.id)\n",
+   "\n",
+   "# Create vector store with files\n",
+   "vector_store = client.vector_stores.create(\n",
+   "    name=\"acme_docs\",\n",
+   "    file_ids=file_ids,\n",
+   "    embedding_model=\"sentence-transformers/all-MiniLM-L6-v2\",\n",
+   "    embedding_dimension=384,\n",
+   "    provider_id=\"faiss\"\n",
+   ")"
   ]
  },
 {
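The new source in the hunk above derives each upload filename from the document title via `metadata['title'].replace(' ', '_').lower()`, which matches the `shipping_policy.txt`, `returns_policy.txt`, and `support.txt` names in the recorded `File(...)` output. Extracted as a standalone sketch (the helper name is illustrative, not from the commit):

```python
def title_to_filename(title: str) -> str:
    """Mirror the notebook's transform: spaces become underscores, lowercased, plus a .txt suffix."""
    return f"{title.replace(' ', '_').lower()}.txt"


# Matches the filenames shown in the cell's recorded File(...) output
print(title_to_filename("Shipping Policy"))  # shipping_policy.txt
```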
@ -634,47 +446,42 @@
|
|||
"id": "9061tmi1zpq",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Test Vector Search\n",
|
||||
"#### Test Vector Store Search\n",
|
||||
"\n",
|
||||
"Query the vector database to verify it's working correctly. This performs semantic search to find relevant documents based on the query."
|
||||
"Query the vector store. This performs semantic search to find relevant documents based on the query."
|
||||
]
|
||||
},
|
||||
{
"cell_type": "code",
"execution_count": 10,
"id": "4a5e010c-eeeb-4020-a957-74d6d1cba342",
"execution_count": 8,
"id": "ba9d1901-bd5e-4216-b3e6-19dc74551cc6",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n"
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/vector_stores/vs_d9fa50e8-3ccc-4db3-9102-1a71cd62ab64/search \"HTTP/1.1 200 OK\"\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"metadata : {'document_id': 'Shipping Policy'}\n",
"content : Acme ships globally in 3–5 business days.\n",
"metadata : {'document_id': 'Shipping Policy'}\n",
"content : Acme ships globally in 3–5 business days.\n",
"metadata : {'document_id': 'Shipping Policy'}\n",
"content : Acme ships globally in 3-5 business days.\n"
"Acme ships globally in 3-5 business days.\n",
"Returns are accepted within 30 days of purchase.\n"
]
}
],
"source": [
"# Query chunks from FAISS vector store\n",
"\n",
"query_chunk_response = client.vector_io.query(\n",
" vector_db_id=\"acme_docs\",\n",
" query=\"How long does Acme take to ship orders?\",\n",
"search_response = client.vector_stores.search(\n",
" vector_store_id=vector_store.id,\n",
" query=\"How long does shipping take?\",\n",
" max_num_results=2\n",
")\n",
"for chunk in query_chunk_response.chunks:\n",
" print(\"metadata\", \":\", chunk.metadata)\n",
" print(\"content\", \":\", chunk.content)"
"for result in search_response.data:\n",
" content = result.content[0].text\n",
" print(content)"
]
},
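Conceptually, `vector_stores.search` embeds the query and ranks the stored chunks by similarity to it. A toy, server-free sketch of that ranking using hand-made 3-dimensional vectors (the real store uses 384-dimensional all-MiniLM-L6-v2 embeddings and FAISS, not these stand-ins):

```python
import math

# Toy "index": document name -> hand-made embedding vector
index = {
    "shipping_policy.txt": [0.9, 0.1, 0.0],
    "returns_policy.txt": [0.1, 0.9, 0.0],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, k=2):
    """Return the k document names most similar to the query embedding."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)
    return ranked[:k]

# A query embedding close to the shipping document's vector
print(search([0.8, 0.2, 0.0], k=1))  # ['shipping_policy.txt']
```

`max_num_results` in the cell above plays the role of `k` here: it caps how many ranked chunks come back.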
{
@@ -695,7 +502,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 9,
"id": "c378bd10-09c2-417c-bdfc-1e0a2dd19084",
"metadata": {},
"outputs": [],
@@ -705,13 +512,11 @@
"from langchain_openai import ChatOpenAI\n",
"\n",
"# Point LangChain to Llamastack Server\n",
"os.environ[\"OPENAI_API_KEY\"] = \"dummy\"\n",
"os.environ[\"OPENAI_BASE_URL\"] = \"http://0.0.0.0:8321/v1/openai/v1\"\n",
"\n",
"# LLM from Llamastack together model\n",
"llm = ChatOpenAI(\n",
" model=\"together/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo\",\n",
" default_headers={\"X-LlamaStack-Provider-Data\": '{\"together_api_key\": \"***\"}'},\n",
" base_url=\"http://0.0.0.0:8321/v1/openai/v1\",\n",
" api_key=\"dummy\",\n",
" model=\"fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct\",\n",
" default_headers={\"X-LlamaStack-Provider-Data\": '{\"fireworks_api_key\": \"fw_3ZPgfLt28zCzx9n4KYYPTT7w\"}'},\n",
")"
]
},
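The `default_headers` value above is a JSON string embedded in a Python literal, which is easy to get wrong by hand. A small sketch building it with `json.dumps` and reading the key from the environment instead of inlining it (`FIREWORKS_API_KEY` is an assumed variable name, not something LlamaStack mandates):

```python
import json
import os

# Build the X-LlamaStack-Provider-Data header value programmatically.
# Falling back to "***" keeps the sketch runnable without a real key.
provider_data = {"fireworks_api_key": os.environ.get("FIREWORKS_API_KEY", "***")}
headers = {"X-LlamaStack-Provider-Data": json.dumps(provider_data)}

# The header round-trips back to the original dict
assert json.loads(headers["X-LlamaStack-Provider-Data"]) == provider_data
print(headers["X-LlamaStack-Provider-Data"])
```

The resulting `headers` dict can be passed as `default_headers=headers` to `ChatOpenAI`, and it keeps the secret out of the notebook source.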
@@ -727,9 +532,11 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 10,
"id": "f88ffb5a-657b-4916-9375-c6ddc156c25e",
"metadata": {},
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
@@ -741,10 +548,10 @@
{
"data": {
"text/plain": [
"AIMessage(content='With gentle eyes and soft, fuzzy hair,\\nThe llama roams, its beauty beyond compare.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 50, 'total_tokens': 70, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'cached_tokens': 0}, 'model_name': 'meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo', 'system_fingerprint': None, 'id': 'o9gGhuc-4YNCb4-9790ba4bba2f1754', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--ff54cf64-6423-4997-b4da-5f4852da0c7e-0', usage_metadata={'input_tokens': 50, 'output_tokens': 20, 'total_tokens': 70, 'input_token_details': {}, 'output_token_details': {}})"
"AIMessage(content='Silent llama, soft and slow,\\n Majestic creature, with a gentle glow.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': None, 'model_name': 'fireworks/accounts/fireworks/models/llama-v3p1-8b-instruct', 'system_fingerprint': None, 'id': 'chatcmpl-8957d071-9f1a-4a7f-9c0e-d516e02194d1', 'service_tier': None, 'finish_reason': 'stop', 'logprobs': None}, id='run--f4331df3-e787-4973-b2e8-7ca90d4e19df-0')"
]
},
"execution_count": 12,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -780,7 +587,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 11,
"id": "9684427d-dcc7-4544-9af5-8b110d014c42",
"metadata": {},
"outputs": [],
@@ -792,8 +599,7 @@
"\n",
"\n",
"def join_docs(docs):\n",
" return \"\\n\\n\".join([f\"[{d.metadata.get('document_id')}] {d.content}\" for d in docs.chunks])\n",
"\n",
" return \"\\n\\n\".join([f\"[{d.filename}] {d.content[0].text}\" for d in docs.data])\n",
"\n",
"PROMPT = ChatPromptTemplate.from_messages(\n",
" [\n",
@@ -803,11 +609,12 @@
")\n",
"\n",
"vector_step = RunnableLambda(\n",
" lambda x: client.vector_io.query(\n",
" vector_db_id=\"acme_docs\",\n",
" query=x,\n",
" )\n",
")\n",
" lambda x: client.vector_stores.search(\n",
" vector_store_id=vector_store.id,\n",
" query=x,\n",
" max_num_results=2\n",
" )\n",
" )\n",
"\n",
"chain = (\n",
" {\"context\": vector_step | RunnableLambda(join_docs), \"question\": RunnablePassthrough()}\n",
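The updated `join_docs` only needs objects exposing `.filename` and `.content[0].text`, so it can be exercised offline against a mocked search response (the mock below imitates the shape of a `vector_stores.search()` result; it is not the real client object):

```python
from types import SimpleNamespace

def make_result(filename, text):
    """Build one mocked search hit: .filename plus .content[0].text."""
    return SimpleNamespace(filename=filename, content=[SimpleNamespace(text=text)])

# Mocked response with a .data list, like the SDK's search response
mock_response = SimpleNamespace(data=[
    make_result("shipping_policy.txt", "Acme ships globally in 3-5 business days."),
    make_result("returns_policy.txt", "Returns are accepted within 30 days of purchase."),
])

def join_docs(docs):
    """Flatten search hits into one context string for the prompt template."""
    return "\n\n".join(f"[{d.filename}] {d.content[0].text}" for d in docs.data)

print(join_docs(mock_response))
# [shipping_policy.txt] Acme ships globally in 3-5 business days.
#
# [returns_policy.txt] Returns are accepted within 30 days of purchase.
```

In the chain, this string becomes the `{context}` slot of `PROMPT`, so each retrieved chunk arrives prefixed with its source filename.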
@@ -829,15 +636,17 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 12,
"id": "03322188-9509-446a-a4a8-ce3bb83ec87c",
"metadata": {},
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n",
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/vector_stores/vs_d9fa50e8-3ccc-4db3-9102-1a71cd62ab64/search \"HTTP/1.1 200 OK\"\n",
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
@@ -846,7 +655,7 @@
"output_type": "stream",
"text": [
"❓ How long does shipping take?\n",
"💡 According to the shipping policy, Acme ships globally in 3-5 business days. This means that the shipping time is typically between 3 and 5 business days, but it may vary depending on the specific location and other factors.\n"
"💡 According to the provided context, Acme ships globally in 3-5 business days.\n"
]
}
],
@@ -867,7 +676,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 13,
"id": "61995550-bb0b-46a8-a5d0-023207475d60",
"metadata": {},
"outputs": [
@@ -875,7 +684,7 @@
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/vector-io/query \"HTTP/1.1 200 OK\"\n",
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/vector_stores/vs_d9fa50e8-3ccc-4db3-9102-1a71cd62ab64/search \"HTTP/1.1 200 OK\"\n",
"INFO:httpx:HTTP Request: POST http://0.0.0.0:8321/v1/openai/v1/chat/completions \"HTTP/1.1 200 OK\"\n"
]
},
@@ -884,7 +693,7 @@
"output_type": "stream",
"text": [
"❓ Can I return a product after 40 days?\n",
"💡 Based on the provided returns policy, it appears that returns are only accepted within 30 days of purchase. Since you're asking about returning a product after 40 days, it would not be within the specified 30-day return window. Unfortunately, it's unlikely that you would be able to return the product after 40 days.\n"
"💡 Based on the given context, returns are accepted within 30 days of purchase. Therefore, after 40 days, returning a product is not accepted according to the given policy.\n"
]
}
],
@@ -903,13 +712,13 @@
"---\n",
"We have successfully built a RAG system that combines:\n",
"\n",
"- **LlamaStack** for infrastructure (LLM serving + vector database)\n",
"- **LlamaStack** for infrastructure (LLM serving + Vector Store)\n",
"- **LangChain** for orchestration (prompts + chains)\n",
"- **Together AI** for high-quality language models\n",
"\n",
"### Key Benefits\n",
"\n",
"1. **Unified Infrastructure**: Single server for LLMs and vector databases\n",
"1. **Unified Infrastructure**: Single server for LLMs and Vector Store\n",
"2. **OpenAI Compatibility**: Easy integration with existing LangChain code\n",
"3. **Multi-Provider Support**: Switch between different LLM providers\n",
"4. **Production Ready**: Built-in safety shields and monitoring\n",
@@ -932,6 +741,16 @@
"kill_llama_stack_server()\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "15647c46-22ce-4698-af3f-8161329d8e3a",
"metadata": {},
"outputs": [],
"source": [
"kill_llama_stack_server()"
]
}
],
"metadata": {