diff --git a/docs/source/getting_started/distributions/self_hosted_distro/index.md b/docs/source/getting_started/distributions/self_hosted_distro/index.md index ed6ab5d7f..502b95cb4 100644 --- a/docs/source/getting_started/distributions/self_hosted_distro/index.md +++ b/docs/source/getting_started/distributions/self_hosted_distro/index.md @@ -23,5 +23,6 @@ tgi dell-tgi together fireworks +remote-vllm bedrock ``` diff --git a/docs/source/getting_started/index.md b/docs/source/getting_started/index.md index 189bd6cb5..6400fb285 100644 --- a/docs/source/getting_started/index.md +++ b/docs/source/getting_started/index.md @@ -53,9 +53,9 @@ Please see our pages in detail for the types of distributions we offer: 3. [On-device Distribution](./distributions/ondevice_distro/index.md): If you want to run Llama Stack inference on your iOS / Android device. -### Quick Start Commands +### Table of Contents -Once you have decided on the inference provider and distribution to use, use the following quick start commands to get started. +Once you have decided on the inference provider and distribution to use, use the following guides to get started. ##### 1.0 Prerequisite @@ -109,421 +109,32 @@ Access to Single-Node CPU with Fireworks hosted endpoint via API_KEY from [firew ##### 1.1. Start the distribution -**(Option 1) Via Docker** -::::{tab-set} - :::{tab-item} meta-reference-gpu -``` -$ cd llama-stack/distributions/meta-reference-gpu && docker compose up -``` - -This will download and start running a pre-built Docker container. Alternatively, you may use the following commands: - -``` -docker run -it -p 5000:5000 -v ~/.llama:/root/.llama -v ./run.yaml:/root/my-run.yaml --gpus=all distribution-meta-reference-gpu --yaml_config /root/my-run.yaml -``` +[Start Meta Reference GPU Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/self_hosted_distro/meta-reference-gpu.html) ::: :::{tab-item} vLLM -``` -$ cd llama-stack/distributions/remote-vllm && docker compose up -``` - -The script will first start up vLLM server on port 8000, then start up Llama Stack distribution server hooking up to it for inference. You should see the following outputs -- -``` - -``` - -To kill the server -``` -docker compose down -``` +[Start vLLM Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/self_hosted_distro/remote-vllm.html) ::: :::{tab-item} tgi -``` -$ cd llama-stack/distributions/tgi && docker compose up -``` - -The script will first start up TGI server, then start up Llama Stack distribution server hooking up to the remote TGI provider for inference. You should see the following outputs -- -``` -[text-generation-inference] | 2024-10-15T18:56:33.810397Z INFO text_generation_router::server: router/src/server.rs:1813: Using config Some(Llama) -[text-generation-inference] | 2024-10-15T18:56:33.810448Z WARN text_generation_router::server: router/src/server.rs:1960: Invalid hostname, defaulting to 0.0.0.0 -[text-generation-inference] | 2024-10-15T18:56:33.864143Z INFO text_generation_router::server: router/src/server.rs:2353: Connected -INFO: Started server process [1] -INFO: Waiting for application startup. -INFO: Application startup complete. 
-INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit) -``` - -To kill the server -``` -docker compose down -``` -::: - - -:::{tab-item} ollama -``` -$ cd llama-stack/distributions/ollama && docker compose up - -# OR - -$ cd llama-stack/distributions/ollama-gpu && docker compose up -``` - -You will see outputs similar to following --- -``` -[ollama] | [GIN] 2024/10/18 - 21:19:41 | 200 | 226.841µs | ::1 | GET "/api/ps" -[ollama] | [GIN] 2024/10/18 - 21:19:42 | 200 | 60.908µs | ::1 | GET "/api/ps" -INFO: Started server process [1] -INFO: Waiting for application startup. -INFO: Application startup complete. -INFO: Uvicorn running on http://[::]:5000 (Press CTRL+C to quit) -[llamastack] | Resolved 12 providers -[llamastack] | inner-inference => ollama0 -[llamastack] | models => __routing_table__ -[llamastack] | inference => __autorouted__ -``` - -To kill the server -``` -docker compose down -``` -::: - -:::{tab-item} fireworks -``` -$ cd llama-stack/distributions/fireworks && docker compose up -``` - -Make sure your `run.yaml` file has the inference provider pointing to the correct Fireworks URL server endpoint. E.g. -``` -inference: - - provider_id: fireworks - provider_type: remote::fireworks - config: - url: https://api.fireworks.ai/inference - api_key: -``` -::: - -:::{tab-item} together -``` -$ cd distributions/together && docker compose up -``` - -Make sure your `run.yaml` file has the inference provider pointing to the correct Together URL server endpoint. E.g. -``` -inference: - - provider_id: together - provider_type: remote::together - config: - url: https://api.together.xyz/v1 - api_key: -``` -::: - - -:::: - -**(Option 2) Via Conda** - -::::{tab-set} - -:::{tab-item} meta-reference-gpu -1. Install the `llama` CLI. See [CLI Reference](https://llama-stack.readthedocs.io/en/latest/cli_reference/index.html) - -2. Build the `meta-reference-gpu` distribution - -``` -$ llama stack build --template meta-reference-gpu --image-type conda -``` - -3. Start running distribution -``` -$ llama stack run ~/.llama/distributions/llamastack-meta-reference-gpu/meta-reference-gpu-run.yaml -``` - -Note: If you wish to use pgvector or chromadb as memory provider. You may need to update generated `run.yaml` file to point to the desired memory provider. See [Memory Providers](https://llama-stack.readthedocs.io/en/latest/api_providers/memory_api.html) for more details. Or comment out the pgvector or chromadb memory provider in `run.yaml` file to use the default inline memory provider, keeping only the following section: -``` -memory: - - provider_id: faiss-0 - provider_type: faiss - config: - kvstore: - namespace: null - type: sqlite - db_path: ~/.llama/runtime/faiss_store.db -``` - -::: - -:::{tab-item} tgi -1. Install the `llama` CLI. See [CLI Reference](https://llama-stack.readthedocs.io/en/latest/cli_reference/index.html) - -2. Build the `tgi` distribution - -```bash -llama stack build --template tgi --image-type conda -``` - -3. Start a TGI server endpoint - -4. Make sure in your `run.yaml` file, your `conda_env` is pointing to the conda environment and inference provider is pointing to the correct TGI server endpoint. E.g. -``` -conda_env: llamastack-tgi -... -inference: - - provider_id: tgi0 - provider_type: remote::tgi - config: - url: http://127.0.0.1:5009 -``` - -5. Start Llama Stack server -```bash -$ llama stack run ~/.llama/distributions/llamastack-tgi/tgi-run.yaml -``` - -Note: If you wish to use pgvector or chromadb as memory provider. 
You may need to update generated `run.yaml` file to point to the desired memory provider. See [Memory Providers](https://llama-stack.readthedocs.io/en/latest/api_providers/memory_api.html) for more details. Or comment out the pgvector or chromadb memory provider in `run.yaml` file to use the default inline memory provider, keeping only the following section: -``` -memory: - - provider_id: faiss-0 - provider_type: faiss - config: - kvstore: - namespace: null - type: sqlite - db_path: ~/.llama/runtime/faiss_store.db -``` +[Start TGI Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/self_hosted_distro/tgi.html) ::: :::{tab-item} ollama - -If you wish to separately spin up a Ollama server, and connect with Llama Stack, you may use the following commands. - -#### Start Ollama server. -- Please check the [Ollama Documentations](https://github.com/ollama/ollama) for more details. - -**Via Docker** -``` -docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama -``` - -**Via CLI** -``` -ollama run -``` - -#### Start Llama Stack server pointing to Ollama server - -Make sure your `run.yaml` file has the inference provider pointing to the correct Ollama endpoint. E.g. -``` -conda_env: llamastack-ollama -... -inference: - - provider_id: ollama0 - provider_type: remote::ollama - config: - url: http://127.0.0.1:11434 -``` - -``` -llama stack build --template ollama --image-type conda -llama stack run ~/.llama/distributions/llamastack-ollama/ollama-run.yaml -``` - -Note: If you wish to use pgvector or chromadb as memory provider. You may need to update generated `run.yaml` file to point to the desired memory provider. See [Memory Providers](https://llama-stack.readthedocs.io/en/latest/api_providers/memory_api.html) for more details. Or comment out the pgvector or chromadb memory provider in `run.yaml` file to use the default inline memory provider, keeping only the following section: -``` -memory: - - provider_id: faiss-0 - provider_type: faiss - config: - kvstore: - namespace: null - type: sqlite - db_path: ~/.llama/runtime/faiss_store.db -``` - -::: - -:::{tab-item} fireworks - -```bash -llama stack build --template fireworks --image-type conda -# -- modify run.yaml to a valid Fireworks server endpoint -llama stack run ./run.yaml -``` - -Make sure your `run.yaml` file has the inference provider pointing to the correct Fireworks URL server endpoint. E.g. -``` -conda_env: llamastack-fireworks -... -inference: - - provider_id: fireworks - provider_type: remote::fireworks - config: - url: https://api.fireworks.ai/inference - api_key: -``` +[Start Ollama Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/self_hosted_distro/ollama.html) ::: :::{tab-item} together - -```bash -llama stack build --template together --image-type conda -# -- modify run.yaml to a valid Together server endpoint -llama stack run ~/.llama/distributions/llamastack-together/together-run.yaml -``` - -Make sure your `run.yaml` file has the inference provider pointing to the correct Together URL server endpoint. E.g. -``` -conda_env: llamastack-together -... -inference: - - provider_id: together - provider_type: remote::together - config: - url: https://api.together.xyz/v1 - api_key: -``` -::: - -:::: - -##### 1.2 (Optional) Update Model Serving Configuration -::::{tab-set} - -:::{tab-item} meta-reference-gpu -You may change the `config.model` in `run.yaml` to update the model currently being served by the distribution. 
Make sure you have the model checkpoint downloaded in your `~/.llama`. -``` -inference: - - provider_id: meta0 - provider_type: inline::meta-reference - config: - model: Llama3.2-11B-Vision-Instruct - quantization: null - torch_seed: null - max_seq_len: 4096 - max_batch_size: 1 -``` - -Run `llama model list` to see the available models to download, and `llama model download` to download the checkpoints. -::: - -:::{tab-item} tgi -To serve a new model with `tgi`, change the docker command flag `--model-id `. - -This can be done by edit the `command` args in `compose.yaml`. E.g. Replace "Llama-3.2-1B-Instruct" with the model you want to serve. - -``` -command: ["--dtype", "bfloat16", "--usage-stats", "on", "--sharded", "false", "--model-id", "meta-llama/Llama-3.2-1B-Instruct", "--port", "5009", "--cuda-memory-fraction", "0.3"] -``` - -or by changing the docker run command's `--model-id` flag -``` -docker run --rm -it -v $HOME/.cache/huggingface:/data -p 5009:5009 --gpus all ghcr.io/huggingface/text-generation-inference:latest --dtype bfloat16 --usage-stats on --sharded false --model-id meta-llama/Llama-3.2-1B-Instruct --port 5009 -``` - -Make sure your `run.yaml` file has the inference provider pointing to the TGI server endpoint serving your model. -``` -inference: - - provider_id: tgi0 - provider_type: remote::tgi - config: - url: http://127.0.0.1:5009 -``` -``` - -Run `llama model list` to see the available models to download, and `llama model download` to download the checkpoints. -::: - -:::{tab-item} ollama -You can use ollama for managing model downloads. - -``` -ollama pull llama3.1:8b-instruct-fp16 -ollama pull llama3.1:70b-instruct-fp16 -``` - -> Please check the [OLLAMA_SUPPORTED_MODELS](https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers.remote/inference/ollama/ollama.py) for the supported Ollama models. - - -To serve a new model with `ollama` -``` -ollama run -``` - -To make sure that the model is being served correctly, run `ollama ps` to get a list of models being served by ollama. -``` -$ ollama ps - -NAME ID SIZE PROCESSOR UNTIL -llama3.1:8b-instruct-fp16 4aacac419454 17 GB 100% GPU 4 minutes from now -``` - -To verify that the model served by ollama is correctly connected to Llama Stack server -``` -$ llama-stack-client models list -+----------------------+----------------------+---------------+-----------------------------------------------+ -| identifier | llama_model | provider_id | metadata | -+======================+======================+===============+===============================================+ -| Llama3.1-8B-Instruct | Llama3.1-8B-Instruct | ollama0 | {'ollama_model': 'llama3.1:8b-instruct-fp16'} | -+----------------------+----------------------+---------------+-----------------------------------------------+ -``` -::: - -:::{tab-item} together -Use `llama-stack-client models list` to check the available models served by together. 
-
-```
-$ llama-stack-client models list
-+------------------------------+------------------------------+---------------+------------+
-| identifier | llama_model | provider_id | metadata |
-+==============================+==============================+===============+============+
-| Llama3.1-8B-Instruct | Llama3.1-8B-Instruct | together0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-70B-Instruct | Llama3.1-70B-Instruct | together0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-405B-Instruct | Llama3.1-405B-Instruct | together0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-3B-Instruct | Llama3.2-3B-Instruct | together0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | together0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | together0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-```
+[Start Together Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/self_hosted_distro/together.html)
:::

:::{tab-item} fireworks
-Use `llama-stack-client models list` to check the available models served by Fireworks.
-```
-$ llama-stack-client models list
-+------------------------------+------------------------------+---------------+------------+
-| identifier | llama_model | provider_id | metadata |
-+==============================+==============================+===============+============+
-| Llama3.1-8B-Instruct | Llama3.1-8B-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-70B-Instruct | Llama3.1-70B-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-405B-Instruct | Llama3.1-405B-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-1B-Instruct | Llama3.2-1B-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-3B-Instruct | Llama3.2-3B-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | fireworks0 | {} |
-+------------------------------+------------------------------+---------------+------------+
-```
+[Start Fireworks Distribution](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/self_hosted_distro/fireworks.html)
:::

::::

-
##### Troubleshooting
- If you encounter any issues, search through our [GitHub Issues](https://github.com/meta-llama/llama-stack/issues), or file a new issue.
- Use the `--port <PORT>` flag to use a different port number. For docker run, update the `-p <PORT>:<PORT>` flag accordingly (see the example below).
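+
+For example, to serve on port 5001 instead of the default 5000 (an illustrative sketch: the run config path and distribution image below are placeholders, so substitute the ones for your distribution):
+```
+# Conda: pass --port to `llama stack run`
+llama stack run ./run.yaml --port 5001
+
+# Docker: pass --port to the server and publish the matching host port
+# (image name and mounted run.yaml are examples only)
+docker run -it -p 5001:5001 -v ~/.llama:/root/.llama -v ./run.yaml:/root/my-run.yaml distribution-meta-reference-gpu --yaml_config /root/my-run.yaml --port 5001
+```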