docs: update documentation links (#3459)

# What does this PR do?
* Updates documentation links from readthedocs to llamastack.github.io

## Test Plan
* Manual testing
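
A sweep like this is also easy to script; a minimal sketch of the idea (the URL mapping and file globs are assumptions for illustration, not the exact method used for this PR):

```python
# Hypothetical link-rewrite helper: map old ReadTheDocs URLs to the new
# GitHub Pages site across the docs tree. The mapping below is illustrative;
# some pages in this PR were remapped to different paths by hand.
import re
from pathlib import Path

OLD_URL = re.compile(r"https://llama-stack\.readthedocs\.io/en/latest/")
NEW_URL = "https://llamastack.github.io/latest/"

for path in Path(".").rglob("*"):
    if path.suffix not in {".md", ".ipynb"}:
        continue
    text = path.read_text(encoding="utf-8")
    updated = OLD_URL.sub(NEW_URL, text)
    if updated != text:
        path.write_text(updated, encoding="utf-8")
        print(f"rewrote links in {path}")
```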
Alexey Rybak 2025-09-17 10:37:35 -07:00 committed by GitHub
parent 9acf49753e
commit 9fe8097ca4
21 changed files with 997 additions and 993 deletions


@@ -1,6 +1,6 @@
# Llama Stack Documentation
Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our [ReadTheDocs page](https://llama-stack.readthedocs.io/en/latest/index.html).
Here's a collection of comprehensive guides, examples, and resources for building AI applications with Llama Stack. For the complete documentation, visit our [GitHub page](https://llamastack.github.io/latest/getting_started/index.html).
## Render locally


@@ -11,11 +11,11 @@
"\n",
"# Llama Stack - Building AI Applications\n",
"\n",
"<img src=\"https://llama-stack.readthedocs.io/en/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
"<img src=\"https://llamastack.github.io/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
"\n",
"[Llama Stack](https://github.com/meta-llama/llama-stack) defines and standardizes the set of core building blocks needed to bring generative AI applications to market. These building blocks are presented in the form of interoperable APIs with a broad set of Service Providers providing their implementations.\n",
"\n",
"Read more about the project here: https://llama-stack.readthedocs.io/en/latest/index.html\n",
"Read more about the project here: https://llamastack.github.io/latest/getting_started/index.html\n",
"\n",
"In this guide, we will showcase how you can build LLM-powered agentic applications using Llama Stack.\n",
"\n",
@@ -75,7 +75,7 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "J2kGed0R5PSf",
"metadata": {
"colab": {
@@ -113,17 +113,17 @@
}
],
"source": [
"import os \n",
"import os\n",
"import subprocess\n",
"import time\n",
"\n",
"!pip install uv \n",
"!pip install uv\n",
"\n",
"if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
" del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
"\n",
"# this command installs all the dependencies needed for the llama stack server with the together inference provider\n",
"!uv run --with llama-stack llama stack build --distro together --image-type venv \n",
"!uv run --with llama-stack llama stack build --distro together --image-type venv\n",
"\n",
"def run_llama_stack_server_background():\n",
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
@@ -134,7 +134,7 @@
" stderr=log_file,\n",
" text=True\n",
" )\n",
" \n",
"\n",
" print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
" return process\n",
"\n",
@@ -142,11 +142,11 @@
" import requests\n",
" from requests.exceptions import ConnectionError\n",
" import time\n",
" \n",
"\n",
" url = \"http://0.0.0.0:8321/v1/health\"\n",
" max_retries = 30\n",
" retry_interval = 1\n",
" \n",
"\n",
" print(\"Waiting for server to start\", end=\"\")\n",
" for _ in range(max_retries):\n",
" try:\n",
@@ -157,12 +157,12 @@
" except ConnectionError:\n",
" print(\".\", end=\"\", flush=True)\n",
" time.sleep(retry_interval)\n",
" \n",
"\n",
" print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
" return False\n",
"\n",
"\n",
"# use this helper if needed to kill the server \n",
"# use this helper if needed to kill the server\n",
"def kill_llama_stack_server():\n",
" # Kill any existing llama stack server processes\n",
" os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n"
@@ -242,7 +242,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "E1UFuJC570Tk",
"metadata": {
"colab": {
@@ -407,9 +407,9 @@
"from llama_stack_client import LlamaStackClient\n",
"\n",
"client = LlamaStackClient(\n",
" base_url=\"http://0.0.0.0:8321\", \n",
" base_url=\"http://0.0.0.0:8321\",\n",
" provider_data = {\n",
" \"tavily_search_api_key\": os.environ['TAVILY_SEARCH_API_KEY'], \n",
" \"tavily_search_api_key\": os.environ['TAVILY_SEARCH_API_KEY'],\n",
" \"together_api_key\": os.environ['TOGETHER_API_KEY']\n",
" }\n",
")"
@@ -1177,7 +1177,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": null,
"id": "WS8Gu5b0APHs",
"metadata": {
"colab": {
@@ -1207,7 +1207,7 @@
"from termcolor import cprint\n",
"\n",
"agent = Agent(\n",
" client, \n",
" client,\n",
" model=\"meta-llama/Llama-3.3-70B-Instruct\",\n",
" instructions=\"You are a helpful assistant. Use websearch tool to help answer questions.\",\n",
" tools=[\"builtin::websearch\"],\n",
@@ -1249,7 +1249,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": null,
"id": "GvLWltzZCNkg",
"metadata": {
"colab": {
@@ -1367,7 +1367,7 @@
" chunk_size_in_tokens=512,\n",
")\n",
"rag_agent = Agent(\n",
" client, \n",
" client,\n",
" model=model_id,\n",
" instructions=\"You are a helpful assistant\",\n",
" tools = [\n",
@@ -2154,7 +2154,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": null,
"id": "vttLbj_YO01f",
"metadata": {
"colab": {
@@ -2217,7 +2217,7 @@
"from termcolor import cprint\n",
"\n",
"agent = Agent(\n",
" client, \n",
" client,\n",
" model=model_id,\n",
" instructions=\"You are a helpful assistant\",\n",
" tools=[\"mcp::filesystem\"],\n",
@@ -2283,7 +2283,7 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": null,
"id": "4iCO59kP20Zs",
"metadata": {
"colab": {
@@ -2317,7 +2317,7 @@
"from llama_stack_client import Agent, AgentEventLogger\n",
"\n",
"agent = Agent(\n",
" client, \n",
" client,\n",
" model=\"meta-llama/Llama-3.3-70B-Instruct\",\n",
" instructions=\"You are a helpful assistant. Use web_search tool to answer the questions.\",\n",
" tools=[\"builtin::websearch\"],\n",
@@ -2846,7 +2846,7 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": null,
"id": "44e05e16",
"metadata": {},
"outputs": [
@@ -2880,8 +2880,7 @@
"!curl -O https://raw.githubusercontent.com/meta-llama/llama-models/refs/heads/main/Llama_Repo.jpeg\n",
"\n",
"from IPython.display import Image\n",
"Image(\"Llama_Repo.jpeg\", width=256, height=256)\n",
"\n"
"Image(\"Llama_Repo.jpeg\", width=256, height=256)\n"
]
},
{


@@ -11,11 +11,11 @@
"\n",
"# Getting Started with Llama 4 in Llama Stack\n",
"\n",
"<img src=\"https://llama-stack.readthedocs.io/en/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
"<img src=\"https://llamastack.github.io/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
"\n",
"[Llama Stack](https://github.com/meta-llama/llama-stack) defines and standardizes the set of core building blocks needed to bring generative AI applications to market. These building blocks are presented in the form of interoperable APIs with a broad set of Service Providers providing their implementations.\n",
"\n",
"Read more about the project here: https://llama-stack.readthedocs.io/en/latest/index.html\n",
"Read more about the project here: https://llamastack.github.io/latest/index.html\n",
"\n",
"In this guide, we will showcase how you can get started with using Llama 4 in Llama Stack.\n",
"\n",
@@ -51,7 +51,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install uv \n",
"!pip install uv\n",
"\n",
"MODEL=\"Llama-4-Scout-17B-16E-Instruct\"\n",
"# get meta url from llama.com\n",
@@ -223,7 +223,7 @@
}
],
"source": [
"import os \n",
"import os\n",
"import subprocess\n",
"import time\n",
"\n",
@@ -232,8 +232,8 @@
"if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
" del os.environ[\"UV_SYSTEM_PYTHON\"]\n",
"\n",
"# this command installs all the dependencies needed for the llama stack server \n",
"!uv run --with llama-stack llama stack build --distro meta-reference-gpu --image-type venv \n",
"# this command installs all the dependencies needed for the llama stack server\n",
"!uv run --with llama-stack llama stack build --distro meta-reference-gpu --image-type venv\n",
"\n",
"def run_llama_stack_server_background():\n",
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
@@ -244,7 +244,7 @@
" stderr=log_file,\n",
" text=True\n",
" )\n",
" \n",
"\n",
" print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
" return process\n",
"\n",
@@ -252,11 +252,11 @@
" import requests\n",
" from requests.exceptions import ConnectionError\n",
" import time\n",
" \n",
"\n",
" url = \"http://0.0.0.0:8321/v1/health\"\n",
" max_retries = 30\n",
" retry_interval = 1\n",
" \n",
"\n",
" print(\"Waiting for server to start\", end=\"\")\n",
" for _ in range(max_retries):\n",
" try:\n",
@@ -267,12 +267,12 @@
" except ConnectionError:\n",
" print(\".\", end=\"\", flush=True)\n",
" time.sleep(retry_interval)\n",
" \n",
"\n",
" print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
" return False\n",
"\n",
"\n",
"# use this helper if needed to kill the server \n",
"# use this helper if needed to kill the server\n",
"def kill_llama_stack_server():\n",
" # Kill any existing llama stack server processes\n",
" os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n"

File diff suppressed because one or more lines are too long


@@ -14,7 +14,7 @@
"We will also showcase how to leverage existing Llama stack [inference APIs](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/apis/inference/inference.py) (ollama as provider) to get the new model's output and the [eval APIs](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/apis/eval/eval.py) to help you better measure the new model performance. We hope the flywheel of post-training -> eval -> inference can greatly empower agentic apps development.\n",
"\n",
"\n",
"- Read more about Llama Stack: https://llama-stack.readthedocs.io/en/latest/introduction/index.html\n",
"- Read more about Llama Stack: https://llamastack.github.io/latest/index.html\n",
"- Read more about post training APIs definition: https://github.com/meta-llama/llama-stack/blob/main/llama_stack/apis/post_training/post_training.py\n",
"\n",
"\n",
@@ -3632,7 +3632,7 @@
},
"source": [
"#### 1.2. Kick-off eval job\n",
"- More details on Llama-stack eval: https://llama-stack.readthedocs.io/en/latest/benchmark_evaluations/index.html\n",
"- More details on Llama-stack eval: https://llamastack.github.io/latest/references/evals_reference/index.html\n",
" - Define an EvalCandidate\n",
" - Run evaluate on datasets (we choose brainstrust's answer-similarity as scoring function with OpenAI's model as judge model)\n",
"\n",


@@ -12,7 +12,7 @@
"\n",
"This notebook will walk you through the main sets of APIs we offer with Llama Stack for supporting running benchmark evaluations of your with working examples to explore the possibilities that Llama Stack opens up for you.\n",
"\n",
"Read more about Llama Stack: https://llama-stack.readthedocs.io/en/latest/index.html"
"Read more about Llama Stack: https://llamastack.github.io/latest/index.html"
]
},
{


@@ -11,7 +11,7 @@
"\n",
"# Llama Stack - Building AI Applications\n",
"\n",
"<img src=\"https://llama-stack.readthedocs.io/en/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
"<img src=\"https://llamastack.github.io/latest/_images/llama-stack.png\" alt=\"drawing\" width=\"500\"/>\n",
"\n",
"Get started with Llama Stack in minutes!\n",
"\n",
@@ -138,7 +138,7 @@
},
"outputs": [],
"source": [
"import os \n",
"import os\n",
"import subprocess\n",
"\n",
"if \"UV_SYSTEM_PYTHON\" in os.environ:\n",
@@ -150,13 +150,13 @@
"def run_llama_stack_server_background():\n",
" log_file = open(\"llama_stack_server.log\", \"w\")\n",
" process = subprocess.Popen(\n",
" f\"OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter --image-type venv",
" f\"OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run starter --image-type venv\n",
" shell=True,\n",
" stdout=log_file,\n",
" stderr=log_file,\n",
" text=True\n",
" )\n",
" \n",
"\n",
" print(f\"Starting Llama Stack server with PID: {process.pid}\")\n",
" return process\n",
"\n",
@@ -164,11 +164,11 @@
" import requests\n",
" from requests.exceptions import ConnectionError\n",
" import time\n",
" \n",
"\n",
" url = \"http://0.0.0.0:8321/v1/health\"\n",
" max_retries = 30\n",
" retry_interval = 1\n",
" \n",
"\n",
" print(\"Waiting for server to start\", end=\"\")\n",
" for _ in range(max_retries):\n",
" try:\n",
@@ -179,12 +179,12 @@
" except ConnectionError:\n",
" print(\".\", end=\"\", flush=True)\n",
" time.sleep(retry_interval)\n",
" \n",
"\n",
" print(\"\\nServer failed to start after\", max_retries * retry_interval, \"seconds\")\n",
" return False\n",
"\n",
"\n",
"# use this helper if needed to kill the server \n",
"# use this helper if needed to kill the server\n",
"def kill_llama_stack_server():\n",
" # Kill any existing llama stack server processes\n",
" os.system(\"ps aux | grep -v grep | grep llama_stack.core.server.server | awk '{print $2}' | xargs kill -9\")\n"


@@ -9,7 +9,7 @@
"\n",
"This document provides instructions on how to use Llama Stack's `chat_completion` function for generating text using the `Llama3.2-3B-Instruct` model. \n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
"\n",
"\n",
"### Table of Contents\n",


@@ -10,7 +10,7 @@
"This guide provides a streamlined setup to switch between local and cloud clients for text generation with Llama Stacks `chat_completion` API. This setup enables automatic fallback to a cloud instance if the local client is unavailable.\n",
"\n",
"### Prerequisites\n",
"Before you begin, please ensure Llama Stack is installed and the distribution is set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/). You will need to run two distributions, a local and a cloud distribution, for this demo to work.\n",
"Before you begin, please ensure Llama Stack is installed and the distribution is set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html). You will need to run two distributions, a local and a cloud distribution, for this demo to work.\n",
"\n",
"### Implementation"
]
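
The implementation that follows boils down to probing the local server and falling back to the cloud endpoint; a condensed sketch (both URLs are placeholders):

```python
import requests
from llama_stack_client import LlamaStackClient

LOCAL_URL = "http://localhost:8321"         # local distribution (placeholder)
CLOUD_URL = "https://my-hosted-stack:8321"  # cloud distribution (placeholder)

def select_client() -> LlamaStackClient:
    """Prefer the local server; fall back to the cloud instance if it is down."""
    try:
        requests.get(f"{LOCAL_URL}/v1/health", timeout=2).raise_for_status()
        return LlamaStackClient(base_url=LOCAL_URL)
    except requests.exceptions.RequestException:
        print("Local server unavailable, falling back to cloud.")
        return LlamaStackClient(base_url=CLOUD_URL)

client = select_client()
```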


@@ -11,7 +11,7 @@
"\n",
"This interactive guide covers prompt engineering & best practices with Llama 3.2 and Llama Stack.\n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html)."
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html)."
]
},
{


@@ -7,7 +7,7 @@
"source": [
"## Getting Started with LlamaStack Vision API\n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
"\n",
"Let's import the necessary packages"
]


@@ -26,7 +26,7 @@
"A running instance of the Llama Stack server (we'll use localhost in \n",
"this tutorial)\n",
"\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
"Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
"\n",
"Let's start by installing the required packages:"
]
@@ -268,7 +268,7 @@
" # Split document content into chunks of 512 characters\n",
" content = doc.content\n",
" chunk_size = 512\n",
" \n",
"\n",
" # Create chunks of the specified size\n",
" for i in range(0, len(content), chunk_size):\n",
" chunk_content = content[i:i+chunk_size]\n",


@@ -6,7 +6,7 @@
"source": [
"## Safety API 101\n",
"\n",
"This document talks about the Safety APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
"This document talks about the Safety APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
"\n",
"As outlined in our [Responsible Use Guide](https://www.llama.com/docs/how-to-guides/responsible-use-guide-resources/), LLM apps should deploy appropriate system level safeguards to mitigate safety and security risks of LLM system, similar to the following diagram:\n",
"\n",


@@ -6,7 +6,7 @@
"source": [
"## Agentic API 101\n",
"\n",
"This document talks about the Agentic APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/index.html).\n",
"This document talks about the Agentic APIs in Llama Stack. Before you begin, please ensure Llama Stack is installed and set up by following the [Getting Started Guide](https://llamastack.github.io/latest/getting_started/index.html).\n",
"\n",
"Starting Llama 3.1 you can build agentic applications capable of:\n",
"\n",


@@ -9,13 +9,18 @@ If you're looking for more specific topics, we have a [Zero to Hero Guide](#next-steps)
> If you'd prefer not to set up a local server, explore our notebook on [tool calling with the Together API](Tool_Calling101_Using_Together_Llama_Stack_Server.ipynb). This notebook will show you how to leverage together.ai's Llama Stack Server API, allowing you to get started with Llama Stack without the need for a locally built and running server.
## Table of Contents
1. [Setup and run ollama](#setup-ollama)
2. [Install Dependencies and Set Up Environment](#install-dependencies-and-set-up-environment)
3. [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
4. [Test with llama-stack-client CLI](#test-with-llama-stack-client-cli)
5. [Test with curl](#test-with-curl)
6. [Test with Python](#test-with-python)
7. [Next Steps](#next-steps)
- [Llama Stack: from Zero to Hero](#llama-stack-from-zero-to-hero)
- [Table of Contents](#table-of-contents)
- [Setup ollama](#setup-ollama)
- [Install Dependencies and Set Up Environment](#install-dependencies-and-set-up-environment)
- [Build, Configure, and Run Llama Stack](#build-configure-and-run-llama-stack)
- [Test with `llama-stack-client` CLI](#test-with-llama-stack-client-cli)
- [Test with `curl`](#test-with-curl)
- [Test with Python](#test-with-python)
- [1. Create Python Script (`test_llama_stack.py`)](#1-create-python-script-test_llama_stackpy)
- [2. Create a Chat Completion Request in Python](#2-create-a-chat-completion-request-in-python)
- [3. Run the Python Script](#3-run-the-python-script)
- [Next Steps](#next-steps)
---
@@ -242,7 +247,7 @@ This command initializes the model to interact with your local Llama Stack insta
## Next Steps
**Explore Other Guides**: Dive deeper into specific topics by following these guides:
- [Understanding Distribution](https://llama-stack.readthedocs.io/en/latest/concepts/index.html#distributions)
- [Understanding Distribution](https://llamastack.github.io/latest/concepts/index.html#distributions)
- [Inference 101](00_Inference101.ipynb)
- [Local and Cloud Model Toggling 101](01_Local_Cloud_Inference101.ipynb)
- [Prompt Engineering](02_Prompt_Engineering101.ipynb)
@@ -259,7 +264,7 @@ This command initializes the model to interact with your local Llama Stack insta
- [Swift SDK](https://github.com/meta-llama/llama-stack-client-swift)
- [Kotlin SDK](https://github.com/meta-llama/llama-stack-client-kotlin)
**Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](https://llama-stack.readthedocs.io/en/latest/distributions/building_distro.html) guide.
**Advanced Configuration**: Learn how to customize your Llama Stack distribution by referring to the [Building a Llama Stack Distribution](https://llamastack.github.io/latest/distributions/building_distro.html) guide.
**Explore Example Apps**: Check out [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) for example applications built using Llama Stack.