docs: update documentation links (#3459)

# What does this PR do?
* Updates documentation links from readthedocs to llamastack.github.io

## Test Plan
* Manual testing
Alexey Rybak 2025-09-17 10:37:35 -07:00 committed by GitHub
parent 9acf49753e
commit 9fe8097ca4
21 changed files with 997 additions and 993 deletions

View file

@@ -2,10 +2,10 @@ blank_issues_enabled: false
 contact_links:
   - name: Have you read the docs?
-    url: https://llama-stack.readthedocs.io/en/latest/index.html
+    url: https://llamastack.github.io/latest/providers/external/index.html
     about: Much help can be found in the docs
   - name: Start a discussion
-    url: https://github.com/meta-llama/llama-stack/discussions/new
+    url: https://github.com/llamastack/llama-stack/discussions/new/
    about: Start a discussion on a topic
   - name: Chat on Discord
     url: https://discord.gg/llama-stack

View file

@@ -187,7 +187,7 @@ Note that the provider "description" field will be used to generate the provider
 ### Building the Documentation
-If you are making changes to the documentation at [https://llama-stack.readthedocs.io/en/latest/](https://llama-stack.readthedocs.io/en/latest/), you can use the following command to build the documentation and preview your changes. You will need [Sphinx](https://www.sphinx-doc.org/en/master/) and the readthedocs theme.
+If you are making changes to the documentation at [https://llamastack.github.io/latest/](https://llamastack.github.io/latest/), you can use the following command to build the documentation and preview your changes. You will need [Sphinx](https://www.sphinx-doc.org/en/master/) and the readthedocs theme.
 ```bash
 # This rebuilds the documentation pages.
@@ -205,4 +205,4 @@ If you modify or add new API endpoints, update the API documentation accordingly
 uv run ./docs/openapi_generator/run_openapi_generator.sh
 ```
 The generated API documentation will be available in `docs/_static/`. Make sure to review the changes before committing.

View file

@@ -7,7 +7,7 @@
 [![Unit Tests](https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml/badge.svg?branch=main)](https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml?query=branch%3Amain)
 [![Integration Tests](https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml/badge.svg?branch=main)](https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml?query=branch%3Amain)
-[**Quick Start**](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html) | [**Documentation**](https://llama-stack.readthedocs.io/en/latest/index.html) | [**Colab Notebook**](./docs/getting_started.ipynb) | [**Discord**](https://discord.gg/llama-stack)
+[**Quick Start**](https://llamastack.github.io/latest/getting_started/index.html) | [**Documentation**](https://llamastack.github.io/latest/index.html) | [**Colab Notebook**](./docs/getting_started.ipynb) | [**Discord**](https://discord.gg/llama-stack)
 ### ✨🎉 Llama 4 Support 🎉✨
@@ -109,7 +109,7 @@ By reducing friction and complexity, Llama Stack empowers developers to focus on
 ### API Providers
 Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.
-Please checkout for [full list](https://llama-stack.readthedocs.io/en/latest/providers/index.html)
+Please checkout for [full list](https://llamastack.github.io/latest/providers/index.html)
 | API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Telemetry | Post Training | Eval | DatasetIO |
 |:--------------------:|:------------:|:------:|:---------:|:--------:|:------:|:---------:|:-------------:|:----:|:--------:|
@@ -140,7 +140,7 @@ Please checkout for [full list](https://llama-stack.readthedocs.io/en/latest/pro
 | NVIDIA NEMO | Hosted | | ✅ | ✅ | | | ✅ | ✅ | ✅ |
 | NVIDIA | Hosted | | | | | | ✅ | ✅ | ✅ |
-> **Note**: Additional providers are available through external packages. See [External Providers](https://llama-stack.readthedocs.io/en/latest/providers/external.html) documentation.
+> **Note**: Additional providers are available through external packages. See [External Providers](https://llamastack.github.io/latest/providers/external/index.html) documentation.
 ### Distributions
@@ -149,24 +149,24 @@ Here are some of the distributions we support:
 | **Distribution** | **Llama Stack Docker** | Start This Distribution |
 |:---------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------:|
-| Starter Distribution | [llamastack/distribution-starter](https://hub.docker.com/repository/docker/llamastack/distribution-starter/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/starter.html) |
-| Meta Reference | [llamastack/distribution-meta-reference-gpu](https://hub.docker.com/repository/docker/llamastack/distribution-meta-reference-gpu/general) | [Guide](https://llama-stack.readthedocs.io/en/latest/distributions/self_hosted_distro/meta-reference-gpu.html) |
+| Starter Distribution | [llamastack/distribution-starter](https://hub.docker.com/repository/docker/llamastack/distribution-starter/general) | [Guide](https://llamastack.github.io/latest/distributions/self_hosted_distro/starter.html) |
+| Meta Reference | [llamastack/distribution-meta-reference-gpu](https://hub.docker.com/repository/docker/llamastack/distribution-meta-reference-gpu/general) | [Guide](https://llamastack.github.io/latest/distributions/self_hosted_distro/meta-reference-gpu.html) |
 | PostgreSQL | [llamastack/distribution-postgres-demo](https://hub.docker.com/repository/docker/llamastack/distribution-postgres-demo/general) | |
 ### Documentation
-Please checkout our [Documentation](https://llama-stack.readthedocs.io/en/latest/index.html) page for more details.
+Please checkout our [Documentation](https://llamastack.github.io/latest/index.html) page for more details.
 * CLI references
-  * [llama (server-side) CLI Reference](https://llama-stack.readthedocs.io/en/latest/references/llama_cli_reference/index.html): Guide for using the `llama` CLI to work with Llama models (download, study prompts), and building/starting a Llama Stack distribution.
-  * [llama (client-side) CLI Reference](https://llama-stack.readthedocs.io/en/latest/references/llama_stack_client_cli_reference.html): Guide for using the `llama-stack-client` CLI, which allows you to query information about the distribution.
+  * [llama (server-side) CLI Reference](https://llamastack.github.io/latest/references/llama_cli_reference/index.html): Guide for using the `llama` CLI to work with Llama models (download, study prompts), and building/starting a Llama Stack distribution.
+  * [llama (client-side) CLI Reference](https://llamastack.github.io/latest/references/llama_stack_client_cli_reference.html): Guide for using the `llama-stack-client` CLI, which allows you to query information about the distribution.
 * Getting Started
-  * [Quick guide to start a Llama Stack server](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).
+  * [Quick guide to start a Llama Stack server](https://llamastack.github.io/latest/getting_started/index.html).
   * [Jupyter notebook](./docs/getting_started.ipynb) to walk-through how to use simple text and vision inference llama_stack_client APIs
   * The complete Llama Stack lesson [Colab notebook](https://colab.research.google.com/drive/1dtVmxotBsI4cGZQNsJRYPrLiDeT0Wnwt) of the new [Llama 3.2 course on Deeplearning.ai](https://learn.deeplearning.ai/courses/introducing-multimodal-llama-3-2/lesson/8/llama-stack).
   * A [Zero-to-Hero Guide](https://github.com/meta-llama/llama-stack/tree/main/docs/zero_to_hero_guide) that guide you through all the key components of llama stack with code samples.
 * [Contributing](CONTRIBUTING.md)
-  * [Adding a new API Provider](https://llama-stack.readthedocs.io/en/latest/contributing/new_api_provider.html) to walk-through how to add a new API provider.
+  * [Adding a new API Provider](https://llamastack.github.io/latest/contributing/new_api_provider.html) to walk-through how to add a new API provider.
 ### Llama Stack Client SDKs
@@ -193,4 +193,4 @@ Thanks to all of our amazing contributors!
 <a href="https://github.com/meta-llama/llama-stack/graphs/contributors">
   <img src="https://contrib.rocks/image?repo=meta-llama/llama-stack" />
 </a>

View file

@@ -1,6 +1,6 @@
 # Llama Stack Documentation
-Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our [ReadTheDocs page](https://llama-stack.readthedocs.io/en/latest/index.html).
+Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our [GitHub page](https://llamastack.github.io/latest/getting_started/index.html).
 ## Render locally

View file

@@ -11,11 +11,11 @@
 "\n",
 "# Llama Stack - Building AI Applications\n",
 "\n",
-"<img src=\"https://llama-stack.readthedocs.io/en/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
+"<img src=\"https://llamastack.github.io/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
 "\n",
 "[Llama Stack](https://github.com/meta-llama/llama-stack) defines and standardizes the set of core building blocks needed to bring generative AI applications to market. These building blocks are presented in the form of interoperable APIs with a broad set of Service Providers providing their implementations.\n",
 "\n",
-"Read more about the project here: https://llama-stack.readthedocs.io/en/latest/index.html\n",
+"Read more about the project here: https://llamastack.github.io/latest/getting_started/index.html\n",
 "\n",
 "In this guide, we will showcase how you can build LLM-powered agentic applications using Llama Stack.\n",
 "\n",
@@ -75,7 +75,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "id": "J2kGed0R5PSf",
 "metadata": {
 "colab": {
@@ -113,17 +113,17 @@
 }
 ],
 "source": [
-"import os \n",
+"import os\n",
 "import subprocess\n",
 "import time\n",
 "\n",
-"!pip install uv \n",
+"!pip install uv\n",
 "\n",
 "if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
 "# this command installs all the dependencies needed for the llama stack server with the together inference provider\n",
-"!uv run --with llama-stack llama stack build --distro together --image-type venv \n",
+"!uv run --with llama-stack llama stack build --distro together --image-type venv\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
@@ -134,7 +134,7 @@
 " stderr=log_file,\n",
 " text=True\n",
 " )\n",
-" \n",
+"\n",
 " print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
 " return process\n",
 "\n",
@@ -142,11 +142,11 @@
 " import requests\n",
 " from requests.exceptions import ConnectionError\n",
 " import time\n",
-" \n",
+"\n",
 " url = \"http://0.0.0.0:8321/v1/health\"\n",
 " max_retries = 30\n",
 " retry_interval = 1\n",
-" \n",
+"\n",
 " print(\"Waiting for server to start\", end=\"\")\n",
 " for _ in range(max_retries):\n",
 " try:\n",
@@ -157,12 +157,12 @@
 " except ConnectionError:\n",
 " print(\".\", end=\"\", flush=True)\n",
 " time.sleep(retry_interval)\n",
-" \n",
+"\n",
 " print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
 " return False\n",
 "\n",
 "\n",
-"# use this helper if needed to kill the server \n",
+"# use this helper if needed to kill the server\n",
 "def kill_llama_stack_server():\n",
 " # Kill any existing llama stack server processes\n",
 " os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n"
@@ -242,7 +242,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": null,
 "id": "E1UFuJC570Tk",
 "metadata": {
 "colab": {
@@ -407,9 +407,9 @@
 "from llama_stack_client import LlamaStackClient\n",
 "\n",
 "client = LlamaStackClient(\n",
-" base_url=\"http://0.0.0.0:8321\", \n",
+" base_url=\"http://0.0.0.0:8321\",\n",
 " provider_data = {\n",
-" \"tavily_search_api_key\": os.environ['TAVILY_SEARCH_API_KEY'], \n",
+" \"tavily_search_api_key\": os.environ['TAVILY_SEARCH_API_KEY'],\n",
 " \"together_api_key\": os.environ['TOGETHER_API_KEY']\n",
 " }\n",
 ")"
@@ -1177,7 +1177,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 13,
+"execution_count": null,
 "id": "WS8Gu5b0APHs",
 "metadata": {
 "colab": {
@@ -1207,7 +1207,7 @@
 "from termcolor import cprint\n",
 "\n",
 "agent = Agent(\n",
-" client, \n",
+" client,\n",
 " model=\"meta-llama/Llama-3.3-70B-Instruct\",\n",
 " instructions=\"You are a helpful assistant. Use websearch tool to help answer questions.\",\n",
 " tools=[\"builtin::websearch\"],\n",
@@ -1249,7 +1249,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 14,
+"execution_count": null,
 "id": "GvLWltzZCNkg",
 "metadata": {
 "colab": {
@@ -1367,7 +1367,7 @@
 " chunk_size_in_tokens=512,\n",
 ")\n",
 "rag_agent = Agent(\n",
-" client, \n",
+" client,\n",
 " model=model_id,\n",
 " instructions=\"You are a helpful assistant\",\n",
 " tools = [\n",
@@ -2154,7 +2154,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 21,
+"execution_count": null,
 "id": "vttLbj_YO01f",
 "metadata": {
 "colab": {
@@ -2217,7 +2217,7 @@
 "from termcolor import cprint\n",
 "\n",
 "agent = Agent(\n",
-" client, \n",
+" client,\n",
 " model=model_id,\n",
 " instructions=\"You are a helpful assistant\",\n",
 " tools=[\"mcp::filesystem\"],\n",
@@ -2283,7 +2283,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 22,
+"execution_count": null,
 "id": "4iCO59kP20Zs",
 "metadata": {
 "colab": {
@@ -2317,7 +2317,7 @@
 "from llama_stack_client import Agent, AgentEventLogger\n",
 "\n",
 "agent = Agent(\n",
-" client, \n",
+" client,\n",
 " model=\"meta-llama/Llama-3.3-70B-Instruct\",\n",
 " instructions=\"You are a helpful assistant. Use web_search tool to answer the questions.\",\n",
 " tools=[\"builtin::websearch\"],\n",
@@ -2846,7 +2846,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 29,
+"execution_count": null,
 "id": "44e05e16",
 "metadata": {},
 "outputs": [
@@ -2880,8 +2880,7 @@
 "!curl -O https://raw.githubusercontent.com/meta-llama/llama-models/refs/heads/main/Llama_Repo.jpeg\n",
 "\n",
 "from IPython.display import Image\n",
-"Image(\"Llama_Repo.jpeg\", width=256, height=256)\n",
-"\n"
+"Image(\"Llama_Repo.jpeg\", width=256, height=256)\n"
 ]
 },
 {

View file

@@ -11,11 +11,11 @@
 "\n",
 "# Getting Started with Llama 4 in Llama Stack\n",
 "\n",
-"<img src=\"https://llama-stack.readthedocs.io/en/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
+"<img src=\"https://llamastack.github.io/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
 "\n",
 "[Llama Stack](https://github.com/meta-llama/llama-stack) defines and standardizes the set of core building blocks needed to bring generative AI applications to market. These building blocks are presented in the form of interoperable APIs with a broad set of Service Providers providing their implementations.\n",
 "\n",
-"Read more about the project here: https://llama-stack.readthedocs.io/en/latest/index.html\n",
+"Read more about the project here: https://llamastack.github.io/latest/index.html\n",
 "\n",
 "In this guide, we will showcase how you can get started with using Llama 4 in Llama Stack.\n",
 "\n",
@@ -51,7 +51,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"!pip install uv \n",
+"!pip install uv\n",
 "\n",
 "MODEL=\"Llama-4-Scout-17B-16E-Instruct\"\n",
 "# get meta url from llama.com\n",
@@ -223,7 +223,7 @@
 }
 ],
 "source": [
-"import os \n",
+"import os\n",
 "import subprocess\n",
 "import time\n",
 "\n",
@@ -232,8 +232,8 @@
 "if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
 " del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
 "\n",
-"# this command installs all the dependencies needed for the llama stack server \n",
-"!uv run --with llama-stack llama stack build --distro meta-reference-gpu --image-type venv \n",
+"# this command installs all the dependencies needed for the llama stack server\n",
+"!uv run --with llama-stack llama stack build --distro meta-reference-gpu --image-type venv\n",
 "\n",
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
@@ -244,7 +244,7 @@
 " stderr=log_file,\n",
 " text=True\n",
 " )\n",
-" \n",
+"\n",
 " print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
 " return process\n",
 "\n",
@@ -252,11 +252,11 @@
 " import requests\n",
 " from requests.exceptions import ConnectionError\n",
 " import time\n",
-" \n",
+"\n",
 " url = \"http://0.0.0.0:8321/v1/health\"\n",
 " max_retries = 30\n",
 " retry_interval = 1\n",
-" \n",
+"\n",
 " print(\"Waiting for server to start\", end=\"\")\n",
 " for _ in range(max_retries):\n",
 " try:\n",
@@ -267,12 +267,12 @@
 " except ConnectionError:\n",
 " print(\".\", end=\"\", flush=True)\n",
 " time.sleep(retry_interval)\n",
-" \n",
+"\n",
 " print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
 " return False\n",
 "\n",
 "\n",
-"# use this helper if needed to kill the server \n",
+"# use this helper if needed to kill the server\n",
 "def kill_llama_stack_server():\n",
 " # Kill any existing llama stack server processes\n",
 " os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n"

File diff suppressed because one or more lines are too long

View file

@@ -14,7 +14,7 @@
 "We will also showcase how to leverage existing Llama stack [inference APIs](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/apis/inference/inference.py) (ollama as provider) to get the new model's output and the [eval APIs](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/apis/eval/eval.py) to help you better measure the new model performance. We hope the flywheel of post-training -> eval -> inference can greatly empower agentic apps development.\n",
 "\n",
 "\n",
-"- Read more about Llama Stack: https://llama-stack.readthedocs.io/en/latest/introduction/index.html\n",
+"- Read more about Llama Stack: https://llamastack.github.io/latest/index.html\n",
 "- Read more about post training APIs definition: https://github.com/meta-llama/llama-stack/blob/main/llama_stack/apis/post_training/post_training.py\n",
 "\n",
 "\n",
@@ -3632,7 +3632,7 @@
 },
 "source": [
 "#### 1.2. Kick-off eval job\n",
-"- More details on Llama-stack eval: https://llama-stack.readthedocs.io/en/latest/benchmark_evaluations/index.html\n",
+"- More details on Llama-stack eval: https://llamastack.github.io/latest/references/evals_reference/index.html\n",
 " - Define an EvalCandidate\n",
 " - Run evaluate on datasets (we choose brainstrust's answer-similarity as scoring function with OpenAI's model as judge model)\n",
 "\n",

View file

@@ -12,7 +12,7 @@
 "\n",
 "This notebook will walk you through the main sets of APIs we offer with Llama Stack for supporting running benchmark evaluations of your with working examples to explore the possibilities that Llama Stack opens up for you.\n",
 "\n",
-"Read more about Llama Stack: https://llama-stack.readthedocs.io/en/latest/index.html"
+"Read more about Llama Stack: https://llamastack.github.io/latest/index.html"
 ]
 },
 {

View file

@@ -11,7 +11,7 @@
 "\n",
 "# Llama Stack - Building AI Applications\n",
 "\n",
-"<img src=\"https://llama-stack.readthedocs.io/en/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
+"<img src=\"https://llamastack.github.io/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
 "\n",
 "Get started with Llama Stack in minutes!\n",
 "\n",
@@ -138,7 +138,7 @@
 },
 "outputs": [],
 "source": [
-"import os \n",
+"import os\n",
 "import subprocess\n",
 "\n",
 "if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
@@ -150,13 +150,13 @@
 "def run_llama_stack_server_background():\n",
 " log_file = open(\"llama_stack_server.log\", \"w\")\n",
 " process = subprocess.Popen(\n",
-" f\"OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter --image-type venv\",",
+" f\"OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter --image-type venv\",\n",
 " shell=True,\n",
 " stdout=log_file,\n",
 " stderr=log_file,\n",
 " text=True\n",
 " )\n",
-" \n",
+"\n",
 " print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
 " return process\n",
 "\n",
@@ -164,11 +164,11 @@
 " import requests\n",
 " from requests.exceptions import ConnectionError\n",
 " import time\n",
-" \n",
+"\n",
 " url = \"http://0.0.0.0:8321/v1/health\"\n",
 " max_retries = 30\n",
 " retry_interval = 1\n",
-" \n",
+"\n",
 " print(\"Waiting for server to start\", end=\"\")\n",
 " for _ in range(max_retries):\n",
 " try:\n",
@@ -179,12 +179,12 @@
 " except ConnectionError:\n",
 " print(\".\", end=\"\", flush=True)\n",
 " time.sleep(retry_interval)\n",
-" \n",
+"\n",
 " print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
 " return False\n",
 "\n",
 "\n",
-"# use this helper if needed to kill the server \n",
+"# use this helper if needed to kill the server\n",
 "def kill_llama_stack_server():\n",
 " # Kill any existing llama stack server processes\n",
 " os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n"

View file

@@ -9,7 +9,7 @@
 "\n",
 "This document provides instructions on how to use Llama Stack's `chat_completion` function for generating text using the `Llama3.2-3B-Instruct` model. \n",
 "\n",
-"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
+"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
 "\n",
 "\n",
 "### Table of Contents\n",

View file

@@ -10,7 +10,7 @@
 "This guide provides a streamlined setup to switch between local and cloud clients for text generation with Llama Stack's `chat_completion` API. This setup enables automatic fallback to a cloud instance if the local client is unavailable.\n",
 "\n",
 "### Prerequisites\n",
-"Before you begin, please ensure Llama Stack is installed and the distribution is set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/). You will need to run two distributions, a local and a cloud distribution, for this demo to work.\n",
+"Before you begin, please ensure Llama Stack is installed and the distribution is set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html). You will need to run two distributions, a local and a cloud distribution, for this demo to work.\n",
 "\n",
 "### Implementation"
 ]

View file

@@ -11,7 +11,7 @@
 "\n",
 "This interactive guide covers prompt engineering & best practices with Llama 3.2 and Llama Stack.\n",
 "\n",
-"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html)."
+"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html)."
 ]
 },
 {

View file

@@ -7,7 +7,7 @@
 "source": [
 "## Getting Started with LlamaStack Vision API\n",
 "\n",
-"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
+"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
 "\n",
 "Let's import the necessary packages"
 ]

View file

@@ -26,7 +26,7 @@
 "A running instance of the Llama Stack server (we'll use localhost in \n",
 "this tutorial)\n",
 "\n",
-"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
+"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
 "\n",
 "Let's start by installing the required packages:"
 ]
@@ -268,7 +268,7 @@
 " # Split document content into chunks of 512 characters\n",
 " content = doc.content\n",
 " chunk_size = 512\n",
-" \n",
+"\n",
 " # Create chunks of the specified size\n",
 " for i in range(0, len(content), chunk_size):\n",
 " chunk_content = content[i:i+chunk_size]\n",

View file

@@ -6,7 +6,7 @@
 "source": [
 "## Safety API 101\n",
 "\n",
-"This document talks about the Safety APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
+"This document talks about the Safety APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
 "\n",
 "As outlined in our [Responsible Use Guide](https://www.llama.com/docs/how-to-guides/responsible-use-guide-resources/), LLM apps should deploy appropriate system level safeguards to mitigate safety and security risks of LLM system, similar to the following diagram:\n",
 "\n",

View file

@@ -6,7 +6,7 @@
 "source": [
 "## Agentic API 101\n",
 "\n",
-"This document talks about the Agentic APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
+"This document talks about the Agentic APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
 "\n",
 "Starting Llama 3.1 you can build agentic applications capable of:\n",
 "\n",

View file

@@ -9,13 +9,18 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next
 > If you'd prefer not to set up a local server, explore our notebook on [tool calling with the Together API](Tool_Calling101_Using_Together_Llama_Stack_Server.ipynb). This notebook will show you how to leverage together.ai's Llama Stack Server API, allowing you to get started with Llama Stack without the need for a locally built and running server.
 ## Table of Contents
-1. [Setup and run ollama](#setup-ollama)
-2. [Install Dependencies and Set Up Environment](#install-dependencies-and-set-up-environment)
-3. [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
-4. [Test with llama-stack-client CLI](#test-with-llama-stack-client-cli)
-5. [Test with curl](#test-with-curl)
-6. [Test with Python](#test-with-python)
-7. [Next Steps](#next-steps)
+- [Llama Stack: from Zero to Hero](#llama-stack-from-zero-to-hero)
+- [Table of Contents](#table-of-contents)
+- [Setup ollama](#setup-ollama)
+- [Install Dependencies and Set Up Environment](#install-dependencies-and-set-up-environment)
+- [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
+- [Test with `llama-stack-client` CLI](#test-with-llama-stack-client-cli)
+- [Test with `curl`](#test-with-curl)
+- [Test with Python](#test-with-python)
+  - [1. Create Python Script (`test_llama_stack.py`)](#1-create-python-script-test_llama_stackpy)
+  - [2. Create a Chat Completion Request in Python](#2-create-a-chat-completion-request-in-python)
+  - [3. Run the Python Script](#3-run-the-python-script)
+- [Next Steps](#next-steps)
 ---
@@ -242,7 +247,7 @@ This command initializes the model to interact with your local Llama Stack insta
 ## Next Steps
 **Explore Other Guides**: Dive deeper into specific topics by following these guides:
-- [Understanding Distribution](https://llama-stack.readthedocs.io/en/latest/concepts/index.html#distributions)
+- [Understanding Distribution](https://llamastack.github.io/latest/concepts/index.html#distributions)
 - [Inference 101](00_Inference101.ipynb)
 - [Local and Cloud Model Toggling 101](01_Local_Cloud_Inference101.ipynb)
 - [Prompt Engineering](02_Prompt_Engineering101.ipynb)
@@ -259,7 +264,7 @@ This command initializes the model to interact with your local Llama Stack insta
 - [Swift SDK](https://github.com/meta-llama/llama-stack-client-swift)
 - [Kotlin SDK](https://github.com/meta-llama/llama-stack-client-kotlin)
-**Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](https://llama-stack.readthedocs.io/en/latest/distributions/building_distro.html) guide.
+**Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](https://llamastack.github.io/latest/distributions/building_distro.html) guide.
 **Explore Example Apps**: Check out [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) for example applications built using Llama Stack.
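Before running the `llama-stack-client`, `curl`, or Python tests listed in the table of contents above, it is worth confirming the server is actually listening. A minimal Python probe against the same `/v1/health` endpoint and default port (8321) used by the notebooks in this commit:

```python
import requests

# Probe the Llama Stack health endpoint; a 200 response means the
# server is ready for the CLI, curl, and Python tests in this guide.
response = requests.get("http://localhost:8321/v1/health", timeout=5)
print(response.status_code, response.json())
```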

View file

@@ -123,6 +123,6 @@ if [[ "$env_type" == "venv" ]]; then
     $other_args
 elif [[ "$env_type" == "container" ]]; then
     echo -e "${RED}Warning: Llama Stack no longer supports running Containers via the 'llama stack run' command.${NC}"
-    echo -e "Please refer to the documentation for more information: https://llama-stack.readthedocs.io/en/latest/distributions/building_distro.html#llama-stack-build"
+    echo -e "Please refer to the documentation for more information: https://llamastack.github.io/latest/distributions/building_distro.html#llama-stack-build"
     exit 1
 fi

View file

@@ -6,7 +6,7 @@
 ## Developer Setup
-1. Start up Llama Stack API server. More details [here](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).
+1. Start up Llama Stack API server. More details [here](https://llamastack.github.io/latest/getting_started/index.html).
 ```
 llama stack build --distro together --image-type venv

View file

@@ -92,11 +92,11 @@ Options:
   -h, --help          Show this help message
 For more information:
-  Documentation: https://llama-stack.readthedocs.io/
-  GitHub: https://github.com/meta-llama/llama-stack
+  Documentation: https://llamastack.github.io/latest/
+  GitHub: https://github.com/llamastack/llama-stack
 Report issues:
-  https://github.com/meta-llama/llama-stack/issues
+  https://github.com/llamastack/llama-stack/issues
 EOF
 }
@@ -241,8 +241,8 @@ fi
 log ""
 log "🎉 Llama Stack is ready!"
 log "👉 API endpoint: http://localhost:${PORT}"
-log "📖 Documentation: https://llama-stack.readthedocs.io/en/latest/references/index.html"
+log "📖 Documentation: https://llamastack.github.io/latest/references/api_reference/index.html"
 log "💻 To access the llama stack CLI, exec into the container:"
 log "   $ENGINE exec -ti llama-stack bash"
-log "🐛 Report an issue @ https://github.com/meta-llama/llama-stack/issues if you think it's a bug"
+log "🐛 Report an issue @ https://github.com/llamastack/llama-stack/issues if you think it's a bug"
 log ""