Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-12 04:00:42 +00:00)

Allow markdown and python files to be referenced directly and add it to the docs

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>

Parent: 7ab63068f8
Commit: 9ccd2e4e9b

11 changed files with 493 additions and 1145 deletions
**.gitignore** (vendored, 1 change)

````diff
@@ -31,3 +31,4 @@ CLAUDE.md
 .claude/
 docs/.docusaurus/
 docs/node_modules/
+docs/static/imported-files/
````
**docs/README.md**

````diff
@@ -15,16 +15,39 @@ You can open up the docs in your browser at http://localhost:3000
 
 ## File Import System
 
-This documentation uses a custom component to import files directly from the repository, eliminating copy-paste maintenance:
+This documentation uses `remark-code-import` to import files directly from the repository, eliminating copy-paste maintenance. Files are automatically embedded during build time.
 
-```jsx
-import CodeFromFile from '@site/src/components/CodeFromFile';
+### Importing Code Files
 
-<CodeFromFile src="path/to/file.py" />
-<CodeFromFile src="README.md" startLine={1} endLine={20} />
+To import Python code (or any code files) with syntax highlighting, use this syntax in `.mdx` files:
+
+```markdown
+```python file=./demo_script.py title="demo_script.py"
+```
 ```
 
-Files are automatically synced from the repo root when building. See the `CodeFromFile` component for syntax highlighting, line ranges, and multi-language support.
+This automatically imports the file content and displays it as a formatted code block with Python syntax highlighting.
+
+**Note:** Paths are relative to the current `.mdx` file location, not the repository root.
+
+### Importing Markdown Files as Content
+
+For importing and rendering markdown files (like CONTRIBUTING.md), use the raw-loader approach:
+
+```jsx
+import Contributing from '!!raw-loader!../../../CONTRIBUTING.md';
+import ReactMarkdown from 'react-markdown';
+
+<ReactMarkdown>{Contributing}</ReactMarkdown>
+```
+
+**Requirements:**
+- Install dependencies: `npm install --save-dev raw-loader react-markdown`
+
+**Path Resolution:**
+- For `remark-code-import`: Paths are relative to the current `.mdx` file location
+- For `raw-loader`: Paths are relative to the current `.mdx` file location
+- Use `../` to navigate up directories as needed
 
 ## Content
````
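`remark-code-import` also supports line ranges; a minimal sketch, assuming the plugin's `#Lx-Ly` suffix syntax (this is the replacement for the old `startLine`/`endLine` props of `CodeFromFile`):

````markdown
```python file=./demo_script.py#L1-L20 title="demo_script.py"
```
````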
**Contributing page (`.mdx`)**

````diff
@@ -1,232 +1,13 @@
-# Contributing to Llama Stack
-We want to make contributing to this project as easy and transparent as
-possible.
-
-## Set up your development environment
-
-We use [uv](https://github.com/astral-sh/uv) to manage python dependencies and virtual environments.
-You can install `uv` by following this [guide](https://docs.astral.sh/uv/getting-started/installation/).
-
-You can install the dependencies by running:
-
-```bash
-cd llama-stack
-uv sync --group dev
-uv pip install -e .
-source .venv/bin/activate
-```
-
-```{note}
-You can use a specific version of Python with `uv` by adding the `--python <version>` flag (e.g. `--python 3.12`).
-Otherwise, `uv` will automatically select a Python version according to the `requires-python` section of the `pyproject.toml`.
-For more info, see the [uv docs around Python versions](https://docs.astral.sh/uv/concepts/python-versions/).
-```
-
-Note that you can create a dotenv file `.env` that includes necessary environment variables:
-```
-LLAMA_STACK_BASE_URL=http://localhost:8321
-LLAMA_STACK_CLIENT_LOG=debug
-LLAMA_STACK_PORT=8321
-LLAMA_STACK_CONFIG=<provider-name>
-TAVILY_SEARCH_API_KEY=
-BRAVE_SEARCH_API_KEY=
-```
-
-And then use this dotenv file when running client SDK tests via the following:
-```bash
-uv run --env-file .env -- pytest -v tests/integration/inference/test_text_inference.py --text-model=meta-llama/Llama-3.1-8B-Instruct
-```
-
-### Pre-commit Hooks
-
-We use [pre-commit](https://pre-commit.com/) to run linting and formatting checks on your code. You can install the pre-commit hooks by running:
-
-```bash
-uv run pre-commit install
-```
-
-After that, pre-commit hooks will run automatically before each commit.
-
-Alternatively, if you don't want to install the pre-commit hooks, you can run the checks manually by running:
-
-```bash
-uv run pre-commit run --all-files
-```
-
-```{caution}
-Before pushing your changes, make sure that the pre-commit hooks have passed successfully.
-```
-
-## Discussions -> Issues -> Pull Requests
-
-We actively welcome your pull requests. However, please read the following. This is heavily inspired by [Ghostty](https://github.com/ghostty-org/ghostty/blob/main/CONTRIBUTING.md).
-
-If in doubt, please open a [discussion](https://github.com/meta-llama/llama-stack/discussions); we can always convert that to an issue later.
-
-### Issues
-We use GitHub issues to track public bugs. Please ensure your description is
-clear and has sufficient instructions to be able to reproduce the issue.
-
-Meta has a [bounty program](http://facebook.com/whitehat/info) for the safe
-disclosure of security bugs. In those cases, please go through the process
-outlined on that page and do not file a public issue.
-
-### Contributor License Agreement ("CLA")
-In order to accept your pull request, we need you to submit a CLA. You only need
-to do this once to work on any of Meta's open source projects.
-
-Complete your CLA here: [https://code.facebook.com/cla](https://code.facebook.com/cla)
-
-**I'd like to contribute!**
-
-If you are new to the project, start by looking at the issues tagged with "good first issue". If you're interested,
-leave a comment on the issue and a triager will assign it to you.
-
-Please avoid picking up too many issues at once. This helps you stay focused and ensures that others in the community also have opportunities to contribute.
-- Try to work on only 1–2 issues at a time, especially if you’re still getting familiar with the codebase.
-- Before taking an issue, check if it’s already assigned or being actively discussed.
-- If you’re blocked or can’t continue with an issue, feel free to unassign yourself or leave a comment so others can step in.
-
-**I have a bug!**
-
-1. Search the issue tracker and discussions for similar issues.
-2. If you don't have steps to reproduce, open a discussion.
-3. If you have steps to reproduce, open an issue.
-
-**I have an idea for a feature!**
-
-1. Open a discussion.
-
-**I've implemented a feature!**
-
-1. If there is an issue for the feature, open a pull request.
-2. If there is no issue, open a discussion and link to your branch.
-
-**I have a question!**
-
-1. Open a discussion or use [Discord](https://discord.gg/llama-stack).
-
-
-**Opening a Pull Request**
-
-1. Fork the repo and create your branch from `main`.
-2. If you've changed APIs, update the documentation.
-3. Ensure the test suite passes.
-4. Make sure your code lints using `pre-commit`.
-5. If you haven't already, complete the Contributor License Agreement ("CLA").
-6. Ensure your pull request follows the [conventional commits format](https://www.conventionalcommits.org/en/v1.0.0/).
-7. Ensure your pull request follows the [coding style](#coding-style).
-
-
-Please keep pull requests (PRs) small and focused. If you have a large set of changes, consider splitting them into logically grouped, smaller PRs to facilitate review and testing.
-
-```{tip}
-As a general guideline:
-- Experienced contributors should try to keep no more than 5 open PRs at a time.
-- New contributors are encouraged to have only one open PR at a time until they’re familiar with the codebase and process.
-```
-
-## Repository guidelines
-
-### Coding Style
-
-* Comments should provide meaningful insights into the code. Avoid filler comments that simply
-  describe the next step, as they create unnecessary clutter; the same goes for docstrings.
-* Prefer comments to clarify surprising behavior and/or relationships between parts of the code
-  rather than explain what the next line of code does.
-* When catching exceptions, prefer using a specific exception type rather than a broad catch-all like
-  `Exception`.
-* Error messages should be prefixed with "Failed to ..."
-* 4 spaces for indentation rather than tabs
-* When using `# noqa` to suppress a style or linter warning, include a comment explaining the
-  justification for bypassing the check.
-* When using `# type: ignore` to suppress a mypy warning, include a comment explaining the
-  justification for bypassing the check.
-* Don't use unicode characters in the codebase. ASCII-only is preferred for compatibility or
-  readability reasons.
-* Providers configuration class should be a Pydantic Field class. It should have a `description` field
-  that describes the configuration. These descriptions will be used to generate the provider
-  documentation.
-* When possible, use keyword arguments only when calling functions.
-* Llama Stack utilizes custom Exception classes for certain Resources that should be used where applicable.
-
-### License
-By contributing to Llama, you agree that your contributions will be licensed
-under the LICENSE file in the root directory of this source tree.
-
-## Common Tasks
-
-Some tips about common tasks you work on while contributing to Llama Stack:
-
-### Setup for development
-
-```bash
-git clone https://github.com/meta-llama/llama-stack.git
-cd llama-stack
-uv run llama stack list-deps <distro-name> | xargs -L1 uv pip install
-
-# (Optional) If you are developing the llama-stack-client-python package, you can add it as an editable package.
-git clone https://github.com/meta-llama/llama-stack-client-python.git
-uv add --editable ../llama-stack-client-python
-```
-
-### Updating distribution configurations
-
-If you have made changes to a provider's configuration in any form (introducing a new config key, or
-changing models, etc.), you should run `./scripts/distro_codegen.py` to re-generate various YAML
-files as well as the documentation. You should not change `docs/source/.../distributions/` files
-manually as they are auto-generated.
-
-### Updating the provider documentation
-
-If you have made changes to a provider's configuration, you should run `./scripts/provider_codegen.py`
-to re-generate the documentation. You should not change `docs/source/.../providers/` files manually
-as they are auto-generated.
-Note that the provider "description" field will be used to generate the provider documentation.
-
-### Building the Documentation
-
-If you are making changes to the documentation at [https://llamastack.github.io/](https://llamastack.github.io/), you can use the following command to build the documentation and preview your changes.
-
-```bash
-# This rebuilds the documentation pages and the OpenAPI spec.
-npm install
-npm run gen-api-docs all
-npm run build
-
-# This will start a local server (usually at http://127.0.0.1:3000).
-npm run serve
-```
-
-### Update API Documentation
-
-If you modify or add new API endpoints, update the API documentation accordingly. You can do this by running the following command:
-
-```bash
-uv run ./docs/openapi_generator/run_openapi_generator.sh
-```
-
-The generated API schema will be available in `docs/static/`. Make sure to review the changes before committing.
-
-## Adding a New Provider
-
-See:
-- [Adding a New API Provider Page](./new_api_provider.mdx) which describes how to add new API providers to the Stack.
-- [Vector Database Page](./new_vector_database.mdx) which describes how to add new vector databases with Llama Stack.
-- [External Provider Page](/docs/providers/external/) which describes how to add external providers to the Stack.
-
-
-## Testing
-
-
-See the [Testing README](https://github.com/meta-llama/llama-stack/blob/main/tests/README.md) for detailed testing information.
-
-## Advanced Topics
-
-For developers who need deeper understanding of the testing system internals:
-
-- [Record-Replay Testing](./testing/record-replay.mdx)
-
-### Benchmarking
-
-See the [Benchmarking README](https://github.com/meta-llama/llama-stack/blob/main/benchmarking/k8s-benchmark/README.md) for benchmarking information.
+---
+title: Contributing
+description: Contributing to Llama Stack
+sidebar_label: Contributing to Llama Stack
+sidebar_position: 3
+hide_title: true
+---
+
+import Contributing from '!!raw-loader!../../../CONTRIBUTING.md';
+import ReactMarkdown from 'react-markdown';
+
+<ReactMarkdown>{Contributing}</ReactMarkdown>
+
````
**Getting-started quickstart (`.mdx`)**

````diff
@@ -24,6 +24,9 @@ ollama run llama3.2:3b --keepalive 60m
 
 #### Step 2: Run the Llama Stack server
 
+```python file=./demo_script.py title="demo_script.py"
+```
+
 We will use `uv` to install dependencies and run the Llama Stack server.
 ```bash
 # Install dependencies for the starter distribution
@@ -35,9 +38,6 @@ OLLAMA_URL=http://localhost:11434 uv run --with llama-stack llama stack run star
 #### Step 3: Run the demo
 Now open up a new terminal and copy the following script into a file named `demo_script.py`.
 
-import CodeFromFile from '@site/src/components/CodeFromFile';
-
-<CodeFromFile src="demo_script.py" title="demo_script.py" />
 We will use `uv` to run the script
 ```
 uv run --with llama-stack-client,fire,requests demo_script.py
````
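Since `remark-code-import` resolves `file=` paths relative to the importing `.mdx` file, the directive above assumes `demo_script.py` sits next to the quickstart page; a file elsewhere in the tree would need a relative path (hypothetical location shown):

````markdown
```python file=../getting_started/demo_script.py title="demo_script.py"
```
````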
**docs/docusaurus.config.ts**

````diff
@@ -71,6 +71,11 @@ const config: Config = {
     docs: {
       sidebarPath: require.resolve("./sidebars.ts"),
       docItemComponent: "@theme/ApiItem", // Derived from docusaurus-theme-openapi
+      remarkPlugins: [
+        [require('remark-code-import'), {
+          rootDir: require('path').join(__dirname, '..') // Repository root
+        }]
+      ],
     },
     blog: false,
     theme: {
````
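A note on `rootDir`: in `remark-code-import`, a `file=` path that begins with `/` is resolved against `rootDir` rather than the current `.mdx` file, so pointing it one directory up should make the whole repository addressable. A sketch, assuming that documented behavior:

````markdown
```markdown file=/README.md title="README.md"
```
````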
**docs/package-lock.json** (generated, 1044 changes) — diff suppressed because it is too large.
**docs/package.json**

````diff
@@ -4,8 +4,8 @@
   "private": true,
   "scripts": {
     "docusaurus": "docusaurus",
-    "start": "npm run sync-files && docusaurus start",
-    "build": "npm run sync-files && docusaurus build",
+    "start": "docusaurus start",
+    "build": "docusaurus build",
     "swizzle": "docusaurus swizzle",
     "deploy": "docusaurus deploy",
     "clear": "docusaurus clear",
@@ -28,7 +28,8 @@
     "docusaurus-theme-openapi-docs": "4.3.7",
     "prism-react-renderer": "^2.3.0",
     "react": "^19.0.0",
-    "react-dom": "^19.0.0"
+    "react-dom": "^19.0.0",
+    "remark-code-import": "^1.2.0"
   },
   "browserslist": {
     "production": [
@@ -41,5 +42,9 @@
       "last 1 firefox version",
       "last 1 safari version"
     ]
-  }
+  },
+  "devDependencies": {
+    "raw-loader": "^4.0.2",
+    "react-markdown": "^10.1.0"
+  }
 }
````
**docs file-sync plugin**

````diff
@@ -47,6 +47,57 @@ function trackFileUsage(filePath) {
   }
 }
 
+// Filter content based on file type and options
+function filterContent(content, filePath) {
+  let lines = content.split('\n');
+
+  // Skip copyright header for Python files
+  if (filePath.endsWith('.py')) {
+    // Read the license header file
+    const licenseHeaderPath = path.join(repoRoot, 'docs', 'license_header.txt');
+    if (fs.existsSync(licenseHeaderPath)) {
+      try {
+        const licenseText = fs.readFileSync(licenseHeaderPath, 'utf8');
+        const licenseLines = licenseText.trim().split('\n');
+
+        // Check if file starts with the license header (accounting for # comments)
+        if (lines.length >= licenseLines.length) {
+          let matches = true;
+          for (let i = 0; i < licenseLines.length; i++) {
+            const codeLine = lines[i]?.replace(/^#\s*/, '').trim();
+            const licenseLine = licenseLines[i]?.trim();
+            if (codeLine !== licenseLine) {
+              matches = false;
+              break;
+            }
+          }
+
+          if (matches) {
+            // Skip the license header and any trailing empty lines
+            let skipTo = licenseLines.length;
+            while (skipTo < lines.length && lines[skipTo].trim() === '') {
+              skipTo++;
+            }
+            lines = lines.slice(skipTo);
+          }
+        }
+      } catch (error) {
+        console.warn(`Could not read license header, skipping filtering for ${filePath}`);
+      }
+    }
+  }
+
+  // Trim empty lines from start and end
+  while (lines.length > 0 && lines[0].trim() === '') {
+    lines.shift();
+  }
+  while (lines.length > 0 && lines[lines.length - 1].trim() === '') {
+    lines.pop();
+  }
+
+  return lines.join('\n');
+}
+
 // Sync a file from repo root to static directory
 function syncFile(filePath) {
   const sourcePath = path.join(repoRoot, filePath);
@@ -61,7 +112,8 @@ function syncFile(filePath) {
   try {
     if (fs.existsSync(sourcePath)) {
      const content = fs.readFileSync(sourcePath, 'utf8');
-      fs.writeFileSync(destPath, content);
+      const filteredContent = filterContent(content, filePath);
+      fs.writeFileSync(destPath, filteredContent);
       console.log(`✅ Synced ${filePath}`);
       trackFileUsage(filePath);
       return true;
````
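A rough usage sketch of the new `filterContent` (the input and path here are hypothetical): for a synced `.py` file whose first lines match the `#`-commented contents of `docs/license_header.txt`, the header and the blank lines after it are stripped before the copy is written.

```js
// Hypothetical synced file contents; assumes docs/license_header.txt
// holds the matching two lines of header text.
const raw = [
  '# Copyright (c) Meta Platforms, Inc. and affiliates.',
  '# All rights reserved.',
  '',
  'print("hello")',
].join('\n');

console.log(filterContent(raw, 'docs/getting_started/demo_script.py'));
// -> 'print("hello")'
```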
**`CodeFromFile` component (`docs/src/components/CodeFromFile`)**

````diff
@@ -15,33 +15,7 @@ export default function CodeFromFile({
   useEffect(() => {
     async function loadFile() {
       try {
-        // Register this file for syncing (build-time only)
-        if (typeof window === 'undefined') {
-          // This runs during build - register the file
-          const fs = require('fs');
-          const path = require('path');
-
-          const usageFile = path.join(process.cwd(), 'static', 'imported-files', 'usage.json');
-          const usageDir = path.dirname(usageFile);
-
-          if (!fs.existsSync(usageDir)) {
-            fs.mkdirSync(usageDir, { recursive: true });
-          }
-
-          let usage = { files: [] };
-          if (fs.existsSync(usageFile)) {
-            try {
-              usage = JSON.parse(fs.readFileSync(usageFile, 'utf8'));
-            } catch (error) {
-              console.warn('Could not read existing usage file');
-            }
-          }
-
-          if (!usage.files.includes(src)) {
-            usage.files.push(src);
-            fs.writeFileSync(usageFile, JSON.stringify(usage, null, 2));
-          }
-        }
 
+        // File registration is now handled by the file-sync-plugin during build
         // Load file from static/imported-files directory
         const response = await fetch(`/imported-files/${src}`);
@@ -50,7 +24,7 @@ export default function CodeFromFile({
       }
       let text = await response.text();
 
-      // Handle line range if specified
+      // Handle line range if specified (filtering is done at build time)
       if (startLine || endLine) {
         const lines = text.split('\n');
         const start = startLine ? Math.max(0, startLine - 1) : 0;
````
**docs/static/imported-files/README.md** (vendored) — entire file deleted (`@@ -1,207 +0,0 @@`); its contents were:

````markdown
# Llama Stack

[PyPI version](https://pypi.org/project/llama_stack/)
[PyPI downloads](https://pypi.org/project/llama-stack/)
[License](https://github.com/meta-llama/llama-stack/blob/main/LICENSE)
[Discord](https://discord.gg/llama-stack)
[Unit tests](https://github.com/meta-llama/llama-stack/actions/workflows/unit-tests.yml?query=branch%3Amain)
[Integration tests](https://github.com/meta-llama/llama-stack/actions/workflows/integration-tests.yml?query=branch%3Amain)

[**Quick Start**](https://llamastack.github.io/docs/getting_started/quickstart) | [**Documentation**](https://llamastack.github.io/docs) | [**Colab Notebook**](./docs/getting_started.ipynb) | [**Discord**](https://discord.gg/llama-stack)


### ✨🎉 Llama 4 Support 🎉✨
We released [Version 0.2.0](https://github.com/meta-llama/llama-stack/releases/tag/v0.2.0) with support for the Llama 4 herd of models released by Meta.

<details>

<summary>👋 Click here to see how to run Llama 4 models on Llama Stack </summary>

\
*Note you need 8xH100 GPU-host to run these models*

```bash
pip install -U llama_stack

MODEL="Llama-4-Scout-17B-16E-Instruct"
# get meta url from llama.com
huggingface-cli download meta-llama/$MODEL --local-dir ~/.llama/$MODEL

# start a llama stack server
INFERENCE_MODEL=meta-llama/$MODEL llama stack build --run --template meta-reference-gpu

# install client to interact with the server
pip install llama-stack-client
```
### CLI
```bash
# Run a chat completion
MODEL="Llama-4-Scout-17B-16E-Instruct"

llama-stack-client --endpoint http://localhost:8321 \
  inference chat-completion \
  --model-id meta-llama/$MODEL \
  --message "write a haiku for meta's llama 4 models"

OpenAIChatCompletion(
    ...
    choices=[
        OpenAIChatCompletionChoice(
            finish_reason='stop',
            index=0,
            message=OpenAIChatCompletionChoiceMessageOpenAIAssistantMessageParam(
                role='assistant',
                content='...**Silent minds awaken,** \n**Whispers of billions of words,** \n**Reasoning breaks the night.** \n\n— \n*This haiku blends the essence of LLaMA 4\'s capabilities with nature-inspired metaphor, evoking its vast training data and transformative potential.*',
                ...
            ),
            ...
        )
    ],
    ...
)
```
### Python SDK
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url=f"http://localhost:8321")

model_id = "meta-llama/Llama-4-Scout-17B-16E-Instruct"
prompt = "Write a haiku about coding"

print(f"User> {prompt}")
response = client.chat.completions.create(
    model=model_id,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt},
    ],
)
print(f"Assistant> {response.choices[0].message.content}")
```
As more providers start supporting Llama 4, you can use them in Llama Stack as well. We are adding to the list. Stay tuned!


</details>

### 🚀 One-Line Installer 🚀

To try Llama Stack locally, run:

```bash
curl -LsSf https://github.com/meta-llama/llama-stack/raw/main/scripts/install.sh | bash
```

### Overview

Llama Stack standardizes the core building blocks that simplify AI application development. It codifies best practices across the Llama ecosystem. More specifically, it provides

- **Unified API layer** for Inference, RAG, Agents, Tools, Safety, Evals, and Telemetry.
- **Plugin architecture** to support the rich ecosystem of different API implementations in various environments, including local development, on-premises, cloud, and mobile.
- **Prepackaged verified distributions** which offer a one-stop solution for developers to get started quickly and reliably in any environment.
- **Multiple developer interfaces** like CLI and SDKs for Python, Typescript, iOS, and Android.
- **Standalone applications** as examples for how to build production-grade AI applications with Llama Stack.

<div style="text-align: center;">
  <img
    src="https://github.com/user-attachments/assets/33d9576d-95ea-468d-95e2-8fa233205a50"
    width="480"
    title="Llama Stack"
    alt="Llama Stack"
  />
</div>

### Llama Stack Benefits
- **Flexible Options**: Developers can choose their preferred infrastructure without changing APIs and enjoy flexible deployment choices.
- **Consistent Experience**: With its unified APIs, Llama Stack makes it easier to build, test, and deploy AI applications with consistent application behavior.
- **Robust Ecosystem**: Llama Stack is already integrated with distribution partners (cloud providers, hardware vendors, and AI-focused companies) that offer tailored infrastructure, software, and services for deploying Llama models.

By reducing friction and complexity, Llama Stack empowers developers to focus on what they do best: building transformative generative AI applications.

### API Providers
Here is a list of the various API providers and available distributions that can help developers get started easily with Llama Stack.
Please checkout for [full list](https://llamastack.github.io/docs/providers)

| API Provider Builder | Environments | Agents | Inference | VectorIO | Safety | Telemetry | Post Training | Eval | DatasetIO |
|:--------------------:|:------------:|:------:|:---------:|:--------:|:------:|:---------:|:-------------:|:----:|:--------:|
| Meta Reference | Single Node | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| SambaNova | Hosted | | ✅ | | ✅ | | | | |
| Cerebras | Hosted | | ✅ | | | | | | |
| Fireworks | Hosted | ✅ | ✅ | ✅ | | | | | |
| AWS Bedrock | Hosted | | ✅ | | ✅ | | | | |
| Together | Hosted | ✅ | ✅ | | ✅ | | | | |
| Groq | Hosted | | ✅ | | | | | | |
| Ollama | Single Node | | ✅ | | | | | | |
| TGI | Hosted/Single Node | | ✅ | | | | | | |
| NVIDIA NIM | Hosted/Single Node | | ✅ | | ✅ | | | | |
| ChromaDB | Hosted/Single Node | | | ✅ | | | | | |
| Milvus | Hosted/Single Node | | | ✅ | | | | | |
| Qdrant | Hosted/Single Node | | | ✅ | | | | | |
| Weaviate | Hosted/Single Node | | | ✅ | | | | | |
| SQLite-vec | Single Node | | | ✅ | | | | | |
| PG Vector | Single Node | | | ✅ | | | | | |
| PyTorch ExecuTorch | On-device iOS | ✅ | ✅ | | | | | | |
| vLLM | Single Node | | ✅ | | | | | | |
| OpenAI | Hosted | | ✅ | | | | | | |
| Anthropic | Hosted | | ✅ | | | | | | |
| Gemini | Hosted | | ✅ | | | | | | |
| WatsonX | Hosted | | ✅ | | | | | | |
| HuggingFace | Single Node | | | | | | ✅ | | ✅ |
| TorchTune | Single Node | | | | | | ✅ | | |
| NVIDIA NEMO | Hosted | | ✅ | ✅ | | | ✅ | ✅ | ✅ |
| NVIDIA | Hosted | | | | | | ✅ | ✅ | ✅ |

> **Note**: Additional providers are available through external packages. See [External Providers](https://llamastack.github.io/docs/providers/external) documentation.

### Distributions

A Llama Stack Distribution (or "distro") is a pre-configured bundle of provider implementations for each API component. Distributions make it easy to get started with a specific deployment scenario - you can begin with a local development setup (eg. ollama) and seamlessly transition to production (eg. Fireworks) without changing your application code.
Here are some of the distributions we support:

| **Distribution** | **Llama Stack Docker** | Start This Distribution |
|:---:|:---:|:---:|
| Starter Distribution | [llamastack/distribution-starter](https://hub.docker.com/repository/docker/llamastack/distribution-starter/general) | [Guide](https://llamastack.github.io/latest/distributions/self_hosted_distro/starter.html) |
| Meta Reference | [llamastack/distribution-meta-reference-gpu](https://hub.docker.com/repository/docker/llamastack/distribution-meta-reference-gpu/general) | [Guide](https://llamastack.github.io/latest/distributions/self_hosted_distro/meta-reference-gpu.html) |
| PostgreSQL | [llamastack/distribution-postgres-demo](https://hub.docker.com/repository/docker/llamastack/distribution-postgres-demo/general) | |

### Documentation

Please checkout our [Documentation](https://llamastack.github.io/latest/index.html) page for more details.

* CLI references
    * [llama (server-side) CLI Reference](https://llamastack.github.io/latest/references/llama_cli_reference/index.html): Guide for using the `llama` CLI to work with Llama models (download, study prompts), and building/starting a Llama Stack distribution.
    * [llama (client-side) CLI Reference](https://llamastack.github.io/latest/references/llama_stack_client_cli_reference.html): Guide for using the `llama-stack-client` CLI, which allows you to query information about the distribution.
* Getting Started
    * [Quick guide to start a Llama Stack server](https://llamastack.github.io/latest/getting_started/index.html).
    * [Jupyter notebook](./docs/getting_started.ipynb) to walk-through how to use simple text and vision inference llama_stack_client APIs
    * The complete Llama Stack lesson [Colab notebook](https://colab.research.google.com/drive/1dtVmxotBsI4cGZQNsJRYPrLiDeT0Wnwt) of the new [Llama 3.2 course on Deeplearning.ai](https://learn.deeplearning.ai/courses/introducing-multimodal-llama-3-2/lesson/8/llama-stack).
    * A [Zero-to-Hero Guide](https://github.com/meta-llama/llama-stack/tree/main/docs/zero_to_hero_guide) that guides you through all the key components of llama stack with code samples.
* [Contributing](CONTRIBUTING.md)
    * [Adding a new API Provider](https://llamastack.github.io/latest/contributing/new_api_provider.html) to walk-through how to add a new API provider.

### Llama Stack Client SDKs

| **Language** | **Client SDK** | **Package** |
| :----: | :----: | :----: |
| Python | [llama-stack-client-python](https://github.com/meta-llama/llama-stack-client-python) | [PyPI](https://pypi.org/project/llama_stack_client/) |
| Swift | [llama-stack-client-swift](https://github.com/meta-llama/llama-stack-client-swift) | [Swift Package Index](https://swiftpackageindex.com/meta-llama/llama-stack-client-swift) |
| Typescript | [llama-stack-client-typescript](https://github.com/meta-llama/llama-stack-client-typescript) | [npm](https://npmjs.org/package/llama-stack-client) |
| Kotlin | [llama-stack-client-kotlin](https://github.com/meta-llama/llama-stack-client-kotlin) | [Maven Central](https://central.sonatype.com/artifact/com.llama.llamastack/llama-stack-client-kotlin) |

Check out our client SDKs for connecting to a Llama Stack server in your preferred language; you can choose from [python](https://github.com/meta-llama/llama-stack-client-python), [typescript](https://github.com/meta-llama/llama-stack-client-typescript), [swift](https://github.com/meta-llama/llama-stack-client-swift), and [kotlin](https://github.com/meta-llama/llama-stack-client-kotlin) programming languages to quickly build your applications.

You can find more example scripts with client SDKs to talk with the Llama Stack server in our [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) repo.


## 🌟 GitHub Star History
## Star History

[Star History Chart](https://www.star-history.com/#meta-llama/llama-stack&Date)

## ✨ Contributors

Thanks to all of our amazing contributors!

<a href="https://github.com/meta-llama/llama-stack/graphs/contributors">
  <img src="https://contrib.rocks/image?repo=meta-llama/llama-stack" />
</a>
````
**docs/static/imported-files/usage.json** (vendored) — entire file deleted:

````diff
@@ -1,6 +0,0 @@
-{
-  "files": [
-    "docs/getting_started/demo_script.py",
-    "README.md"
-  ]
-}
````