diff --git a/docs/notebooks/nvidia/Llama_Stack_NVIDIA_E2E_Flow.ipynb b/docs/notebooks/nvidia/Llama_Stack_NVIDIA_E2E_Flow.ipynb
index 17d370ce3..b3b8daf15 100644
--- a/docs/notebooks/nvidia/Llama_Stack_NVIDIA_E2E_Flow.ipynb
+++ b/docs/notebooks/nvidia/Llama_Stack_NVIDIA_E2E_Flow.ipynb
@@ -4,7 +4,23 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "This notebook contains Llama Stack implementation of a common end-to-end workflow for customizing and evaluating LLMs using the NVIDIA provider."
+    "## NVIDIA E2E Flow"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook contains a Llama Stack implementation of an end-to-end workflow for running inference on, customizing, and evaluating LLMs using the NVIDIA provider.\n",
+    "\n",
+    "The NVIDIA provider leverages the NeMo Microservices platform, a collection of microservices that you can use to build AI workflows on your Kubernetes cluster, on-premises or in the cloud.\n",
+    "\n",
+    "This notebook covers the following workflows:\n",
+    "- Creating a dataset and uploading files\n",
+    "- Customizing models\n",
+    "- Evaluating base and customized models, with and without guardrails\n",
+    "- Running inference on base and customized models, with and without guardrails\n",
+    "\n"
+   ]
+  },
   {
@@ -12,7 +28,7 @@
    "metadata": {},
    "source": [
     "## Prerequisites\n",
-    "First, ensure the NeMo Microservices platform is up and running, including the model downloading step for `meta/llama-3.2-1b-instruct`. See installation instructions: https://aire.gitlab-master-pages.nvidia.com/microservices/documentation/latest/nemo-microservices/latest-internal/set-up/deploy-as-platform/index.html (TODO: Update to public docs)"
+    "First, ensure the NeMo Microservices platform is up and running, including the model downloading step for `meta/llama-3.1-8b-instruct`. See installation instructions: https://aire.gitlab-master-pages.nvidia.com/microservices/documentation/latest/nemo-microservices/latest-internal/set-up/deploy-as-platform/index.html (TODO: Update to public docs)"
    ]
   },
   {
@@ -70,7 +86,7 @@
    "source": [
     "Configure the environment variables for each service.\n",
     "\n",
-    "If needed, update the URLs for each service to point to your deployment.\n",
+    "Ensure the URLs for each service point to your deployment.\n",
     "- NDS_URL: NeMo Data Store URL\n",
     "- NEMO_URL: NeMo Microservices Platform URL\n",
     "- NIM_URL: NIM URL\n",
@@ -94,7 +110,7 @@
     "USER_ID = \"llama-stack-user\"\n",
     "NAMESPACE = \"default\"\n",
     "PROJECT_ID = \"\"\n",
-    "CUSTOMIZED_MODEL_DIR = \"jg-test-llama-stack@v2\"\n",
+    "CUSTOMIZED_MODEL_DIR = \"test-llama-stack@v1\"\n",
     "\n",
     "# Inference env vars\n",
     "os.environ[\"NVIDIA_BASE_URL\"] = NIM_URL\n",
@@ -156,13 +172,19 @@
     "client.initialize()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here, we define helper functions that wait for async jobs to complete."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 25,
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Helper functions for waiting on jobs\n",
     "from llama_stack.apis.common.job_types import JobStatus\n",
     "\n",
     "def wait_customization_job(job_id: str, polling_interval: int = 10, timeout: int = 6000):\n",
@@ -204,6 +226,8 @@
     "\n",
     "    return job_status\n",
     "\n",
+    "# When creating a customized model, NIM asynchronously loads the model into its model registry.\n",
+    "# After this, we can run inference with the new model. This helper function waits for NIM to pick up the new model.\n",
     "def wait_nim_loads_customized_model(model_id: str, namespace: str, polling_interval: int = 10, timeout: int = 300):\n",
     "    found = False\n",
     "    start_time = time()\n",
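For readability, here is the configuration cell from the `@@ -94,7 +110,7 @@` hunk decoded out of its JSON-string escaping into plain Python. The three URL values are placeholders for your own deployment; only `NVIDIA_BASE_URL` is visible in this excerpt, and any further environment variables the provider needs fall outside the hunk.

```python
import os

# Placeholder URLs -- point these at your own deployment (see the
# NDS_URL / NEMO_URL / NIM_URL descriptions earlier in the diff).
NDS_URL = "http://nemo-data-store.test"   # NeMo Data Store
NEMO_URL = "http://nemo.test"             # NeMo Microservices platform
NIM_URL = "http://nim.test"               # NIM

USER_ID = "llama-stack-user"
NAMESPACE = "default"
PROJECT_ID = ""
CUSTOMIZED_MODEL_DIR = "test-llama-stack@v1"

# Inference env var: the NVIDIA provider reads this to reach the NIM endpoint.
os.environ["NVIDIA_BASE_URL"] = NIM_URL
```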
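The final hunk cuts off inside `wait_nim_loads_customized_model`. As a rough guide to where that helper is headed, here is a minimal polling-loop sketch. It assumes NIM exposes its OpenAI-compatible `/v1/models` listing at `NIM_URL` and registers the customized model as `<namespace>/<model-id>`; both are assumptions for illustration, not the notebook's confirmed implementation.

```python
from time import sleep, time

import requests

NIM_URL = "http://nim.test"  # placeholder; use your deployment's NIM URL

def wait_nim_loads_customized_model(model_id: str, namespace: str,
                                    polling_interval: int = 10, timeout: int = 300):
    # Assumed identifier format -- adjust if your NIM registers models differently.
    model_path = f"{namespace}/{model_id}"
    start_time = time()
    while time() - start_time < timeout:
        # NIM serves an OpenAI-compatible model listing at /v1/models.
        response = requests.get(f"{NIM_URL}/v1/models")
        response.raise_for_status()
        model_ids = [model["id"] for model in response.json().get("data", [])]
        if model_path in model_ids:
            print(f"Model {model_path} available after {time() - start_time:.0f}s.")
            return
        sleep(polling_interval)
    raise TimeoutError(f"NIM did not load {model_path} within {timeout} seconds.")
```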