[docs] refactor remote-hosted distro (#402)

* move docs * docs
2025-12-04 02:03:44 +00:00 · 2024-11-07 19:16:38 -08:00 · 2024-11-07 19:16:38 -08:00 · 8350f2df4c
commit 8350f2df4c
parent 345ae07317
6 changed files with 44 additions and 9 deletions
--- a/docs/source/getting_started/distributions/remote_hosted_distro/bedrock.md
+++ b/docs/source/getting_started/distributions/remote_hosted_distro/bedrock.md
@ -1,58 +0,0 @@
-# Bedrock Distribution
-
-### Connect to a Llama Stack Bedrock Endpoint
- You may connect to Amazon Bedrock APIs for running LLM inference
-
-The `llamastack/distribution-bedrock` distribution consists of the following provider configurations.
-
-
-| **API**         	| **Inference** 	| **Agents**     	| **Memory**     	| **Safety**     	| **Telemetry**  	|
-|-----------------	|---------------	|----------------	|----------------	|----------------	|----------------	|
-| **Provider(s)** 	| remote::bedrock | meta-reference 	| meta-reference 	| remote::bedrock | meta-reference 	|
-
-
-### Docker: Start the Distribution (Single Node CPU)
-
-> [!NOTE]
-> This assumes you have valid AWS credentials configured with access to Amazon Bedrock.
-
-```
-$ cd distributions/bedrock && docker compose up
-```
-
-Make sure in your `run.yaml` file, your inference provider is pointing to the correct AWS configuration. E.g.
-```
-inference:
-  - provider_id: bedrock0
-    provider_type: remote::bedrock
-    config:
-      aws_access_key_id: <AWS_ACCESS_KEY_ID>
-      aws_secret_access_key: <AWS_SECRET_ACCESS_KEY>
-      aws_session_token: <AWS_SESSION_TOKEN>
-      region_name: <AWS_REGION>
-```
-
-### Conda llama stack run (Single Node CPU)
-
-```bash
-llama stack build --template bedrock --image-type conda
-# -- modify run.yaml with valid AWS credentials
-llama stack run ./run.yaml
-```
-
-### (Optional) Update Model Serving Configuration
-
-Use `llama-stack-client models list` to check the available models served by Amazon Bedrock.
-
-```
-$ llama-stack-client models list
-+------------------------------+------------------------------+---------------+------------+
-| identifier                   | llama_model                  | provider_id   | metadata   |
-+==============================+==============================+===============+============+
-| Llama3.1-8B-Instruct         | meta.llama3-1-8b-instruct-v1:0 | bedrock0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-70B-Instruct        | meta.llama3-1-70b-instruct-v1:0 | bedrock0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-405B-Instruct       | meta.llama3-1-405b-instruct-v1:0 | bedrock0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-```
--- a/docs/source/getting_started/distributions/remote_hosted_distro/fireworks.md
+++ b/docs/source/getting_started/distributions/remote_hosted_distro/fireworks.md
@ -1,64 +0,0 @@
-# Fireworks Distribution
-
-The `llamastack/distribution-fireworks` distribution consists of the following provider configurations.
-
-
-| **API**         	| **Inference** 	| **Agents**     	| **Memory**                                       	| **Safety**     	| **Telemetry**  	|
-|-----------------	|---------------	|----------------	|--------------------------------------------------	|----------------	|----------------	|
-| **Provider(s)** 	| remote::fireworks   	| meta-reference 	| meta-reference 	| meta-reference 	| meta-reference 	|
-
-### Step 0. Prerequisite
- Make sure you have access to a fireworks API Key. You can get one by visiting [fireworks.ai](https://fireworks.ai/)
-
-### Step 1. Start the Distribution (Single Node CPU)
-
-#### (Option 1) Start Distribution Via Docker
-> [!NOTE]
-> This assumes you have an hosted endpoint at Fireworks with API Key.
-
-```
-$ cd distributions/fireworks && docker compose up
-```
-
-Make sure in you `run.yaml` file, you inference provider is pointing to the correct Fireworks URL server endpoint. E.g.
-```
-inference:
-  - provider_id: fireworks
-    provider_type: remote::fireworks
-    config:
-      url: https://api.fireworks.ai/inference
-      api_key: <optional api key>
-```
-
-#### (Option 2) Start Distribution Via Conda
-
-```bash
-llama stack build --template fireworks --image-type conda
-# -- modify run.yaml to a valid Fireworks server endpoint
-llama stack run ./run.yaml
-```
-
-
-### (Optional) Model Serving
-
-Use `llama-stack-client models list` to check the available models served by Fireworks.
-```
-$ llama-stack-client models list
-+------------------------------+------------------------------+---------------+------------+
-| identifier                   | llama_model                  | provider_id   | metadata   |
-+==============================+==============================+===============+============+
-| Llama3.1-8B-Instruct         | Llama3.1-8B-Instruct         | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-70B-Instruct        | Llama3.1-70B-Instruct        | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-405B-Instruct       | Llama3.1-405B-Instruct       | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-1B-Instruct         | Llama3.2-1B-Instruct         | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-3B-Instruct         | Llama3.2-3B-Instruct         | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | fireworks0    | {}         |
-+------------------------------+------------------------------+---------------+------------+
-```
--- a/docs/source/getting_started/distributions/remote_hosted_distro/index.md
+++ b/docs/source/getting_started/distributions/remote_hosted_distro/index.md
@ -1,15 +1,42 @@
 # Remote-Hosted Distribution

-Remote Hosted distributions are distributions connecting to remote hosted services through Llama Stack server. Inference is done through remote providers. These are useful if you have an API key for a remote inference provider like Fireworks, Together, etc.
+Remote-Hosted distributions are available endpoints serving Llama Stack API that you can directly connect to.

-| **Distribution** 	|           **Llama Stack Docker**           	| Start This Distribution 	|    **Inference**   	|     **Agents**     	|     **Memory**     	|     **Safety**     	|    **Telemetry**   	|
-|:----------------:	|:------------------------------------------:	|:-----------------------:	|:------------------:	|:------------------:	|:------------------:	|:------------------:	|:------------------:	|
-|        Together       	|         [llamastack/distribution-together](https://hub.docker.com/repository/docker/llamastack/distribution-together/general)        	|       [Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/remote_hosted_distro/together.html)       	| remote::together 	| meta-reference | remote::weaviate | meta-reference 	| meta-reference  	|
-|        Fireworks       	|         [llamastack/distribution-fireworks](https://hub.docker.com/repository/docker/llamastack/distribution-fireworks/general)        	|       [Guide](https://llama-stack.readthedocs.io/en/latest/getting_started/distributions/remote_hosted_distro/fireworks.html)       	| remote::fireworks 	| meta-reference | remote::weaviate | meta-reference 	| meta-reference  	|
+| Distribution | Endpoint | Inference | Agents | Memory | Safety | Telemetry |
+|-------------|----------|-----------|---------|---------|---------|------------|
+| Together | [https://llama-stack.together.ai](https://llama-stack.together.ai) | remote::together | meta-reference | remote::weaviate | meta-reference | meta-reference |
+| Fireworks | [https://llamastack-preview.fireworks.ai](https://llamastack-preview.fireworks.ai) | remote::fireworks | meta-reference | remote::weaviate | meta-reference | meta-reference |

-```{toctree}
-:maxdepth: 1
+## Connecting to Remote-Hosted Distributions

-fireworks
-together
+You can use `llama-stack-client` to interact with these endpoints. For example, to list the available models served by the Fireworks endpoint:
+
+```bash
+$ pip install llama-stack-client
+$ llama-stack-client configure --endpoint https://llamastack-preview.fireworks.ai
+$ llama-stack-client models list
 ```
+
+You will see outputs:
+```
+$ llama-stack-client models list
+------------------------------+------------------------------+---------------+------------+
+| identifier                   | llama_model                  | provider_id   | metadata   |
+==============================+==============================+===============+============+
+| Llama3.1-8B-Instruct         | Llama3.1-8B-Instruct         | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+| Llama3.1-70B-Instruct        | Llama3.1-70B-Instruct        | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+| Llama3.1-405B-Instruct       | Llama3.1-405B-Instruct       | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+| Llama3.2-1B-Instruct         | Llama3.2-1B-Instruct         | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+| Llama3.2-3B-Instruct         | Llama3.2-3B-Instruct         | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | fireworks0    | {}         |
+------------------------------+------------------------------+---------------+------------+
+```
+
+Checkout the [llama-stack-client-python](https://github.com/meta-llama/llama-stack-client-python/blob/main/docs/cli_reference.md) repo for more details on how to use the `llama-stack-client` CLI. Checkout [llama-stack-app](https://github.com/meta-llama/llama-stack-apps/tree/main) for examples applications built on top of Llama Stack.
--- a/docs/source/getting_started/distributions/remote_hosted_distro/together.md
+++ b/docs/source/getting_started/distributions/remote_hosted_distro/together.md
@ -1,62 +0,0 @@
-# Together Distribution
-
-### Connect to a Llama Stack Together Endpoint
- You may connect to a hosted endpoint `https://llama-stack.together.ai`, serving a Llama Stack distribution
-
-The `llamastack/distribution-together` distribution consists of the following provider configurations.
-
-
-| **API**         	| **Inference** 	| **Agents**     	| **Memory**                                       	| **Safety**     	| **Telemetry**  	|
-|-----------------	|---------------	|----------------	|--------------------------------------------------	|----------------	|----------------	|
-| **Provider(s)** 	| remote::together   	| meta-reference 	| meta-reference, remote::weaviate 	| meta-reference 	| meta-reference 	|
-
-
-### Docker: Start the Distribution (Single Node CPU)
-
-> [!NOTE]
-> This assumes you have an hosted endpoint at Together with API Key.
-
-```
-$ cd distributions/together && docker compose up
-```
-
-Make sure in your `run.yaml` file, your inference provider is pointing to the correct Together URL server endpoint. E.g.
-```
-inference:
-  - provider_id: together
-    provider_type: remote::together
-    config:
-      url: https://api.together.xyz/v1
-      api_key: <optional api key>
-```
-
-### Conda llama stack run (Single Node CPU)
-
-```bash
-llama stack build --template together --image-type conda
-# -- modify run.yaml to a valid Together server endpoint
-llama stack run ./run.yaml
-```
-
-### (Optional) Update Model Serving Configuration
-
-Use `llama-stack-client models list` to check the available models served by together.
-
-```
-$ llama-stack-client models list
-+------------------------------+------------------------------+---------------+------------+
-| identifier                   | llama_model                  | provider_id   | metadata   |
-+==============================+==============================+===============+============+
-| Llama3.1-8B-Instruct         | Llama3.1-8B-Instruct         | together0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-70B-Instruct        | Llama3.1-70B-Instruct        | together0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.1-405B-Instruct       | Llama3.1-405B-Instruct       | together0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-3B-Instruct         | Llama3.2-3B-Instruct         | together0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | together0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | together0     | {}         |
-+------------------------------+------------------------------+---------------+------------+
-```