From cdd486d58c046606b62b8603e35044e716b027f1 Mon Sep 17 00:00:00 2001
From: Kai Wu
Date: Mon, 29 Sep 2025 10:25:37 -0700
Subject: [PATCH] update rag.mdx

---
 docs/docs/building_applications/rag.mdx | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/docs/docs/building_applications/rag.mdx b/docs/docs/building_applications/rag.mdx
index 5c864a9fb..cb7f941e9 100644
--- a/docs/docs/building_applications/rag.mdx
+++ b/docs/docs/building_applications/rag.mdx
@@ -24,7 +24,19 @@ This new approach provides better compatibility with OpenAI's ecosystem and is t
 ## Prerequisites
 
 For this guide, we will use [Ollama](https://ollama.com/) as the inference provider.
-Ollama is an LLM runtime that allows you to run Llama models locally.
+Ollama is an LLM runtime that allows you to run Llama models locally. It's a great choice for development and testing, but you can also use any other inference provider that supports the OpenAI API.
+
+Before you begin, make sure you have the following:
+1. **Ollama**: Follow the [installation guide](https://ollama.com/docs/ollama/getting-started/install)
+   to set up Ollama on your machine.
+2. **Llama Stack**: Follow the [installation guide](/docs/installation) to set up Llama Stack on your
+   machine.
+3. **Documents**: Prepare a set of documents that you want to search. These can be plain text, PDFs, or other file types.
+4. **Environment variables**: Set `LLAMA_STACK_PORT` to the port where Llama Stack is running (for example, `export LLAMA_STACK_PORT=8321` for the default port of 8321), and set `OLLAMA_URL` to `http://localhost:11434`.
+
+## Step 0: Initialize Client
+
+After launching the Llama Stack server with `llama stack build --distro starter --image-type venv --run`, initialize the client with the base URL of your Llama Stack instance.
 
 ```python
 import os
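
For reference, a minimal sketch of the client initialization that the new "Step 0" text describes, assuming the `llama_stack_client` Python package and the `LLAMA_STACK_PORT` environment variable set in the prerequisites (the default-port fallback is illustrative, not part of the patch):

```python
import os

from llama_stack_client import LlamaStackClient

# Assumes LLAMA_STACK_PORT was exported as described in the prerequisites;
# falls back to the default port 8321 if it is not set.
port = os.environ.get("LLAMA_STACK_PORT", "8321")
client = LlamaStackClient(base_url=f"http://localhost:{port}")
```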