add more distro templates (#279)

* verify dockers
* together distro verified
* readme
* fireworks distro
* fireworks compose up
* fireworks verified
2024-10-21 18:15:08 -07:00
commit 4d2bd2d39e
parent cf27d19dd5
18 changed files with 265 additions and 42 deletions
--- a/distributions/together/README.md
+++ b/distributions/together/README.md
@ -0,0 +1,68 @@
+# Together Distribution
+
+### Connect to a Llama Stack Together Endpoint
+- You may connect to the hosted endpoint `https://llama-stack.together.ai`, which serves a Llama Stack distribution.
+
+The `llamastack/distribution-together` distribution consists of the following provider configurations.
+
+
+| **API**         | **Inference**    | **Agents**     | **Memory**       | **Safety**       | **Telemetry**  |
+|-----------------|------------------|----------------|------------------|------------------|----------------|
+| **Provider(s)** | remote::together | meta-reference | remote::weaviate | remote::together | meta-reference |
+
+
+### Start the Distribution (Single Node CPU)
+
+> [!NOTE]
+> This assumes you have a hosted endpoint at Together and an API key for it.
+
+```
+$ cd llama-stack/distributions/together
+$ ls
+compose.yaml  run.yaml
+$ docker compose up
+```
+
+Make sure that the inference provider in your `run.yaml` file points to the correct Together server endpoint URL. For example:
+```
+inference:
+  - provider_id: together
+    provider_type: remote::together
+    config:
+      url: https://api.together.xyz/v1
+      api_key: <optional api key>
+```
+
+### (Alternative) Together endpoint + llama stack run (Single Node GPU)
+
+```
+docker run --network host -it -p 5000:5000 -v ./run.yaml:/root/my-run.yaml --gpus=all llamastack/distribution-together --yaml_config /root/my-run.yaml
+```
+
+Make sure that the inference provider in your `run.yaml` file points to the correct Together server endpoint URL. For example:
+```
+inference:
+  - provider_id: together
+    provider_type: remote::together
+    config:
+      url: https://api.together.xyz/v1
+      api_key: <optional api key>
+```
+
+The Together distribution uses Weaviate as its Memory provider. To use the Memory API, you must also set the remote Weaviate API key and cluster URL in `run.yaml`:
+```
+memory:
+  - provider_id: meta0
+    provider_type: remote::weaviate
+    config:
+      weaviate_api_key: <ENTER_WEAVIATE_API_KEY>
+      weaviate_cluster_url: <ENTER_WEAVIATE_CLUSTER_URL>
+```
+
+**Via Conda**
+
+```bash
+llama stack build --config ./build.yaml
+# -- edit run.yaml so that it points to a valid Together server endpoint
+llama stack run ./run.yaml
+```
--- a/distributions/together/build.yaml
+++ b/distributions/together/build.yaml
@ -3,8 +3,8 @@ distribution_spec:
  description: Use Together.ai for running LLM inference
  providers:
    inference: remote::together
-    memory: meta-reference
+    memory: remote::weaviate
    safety: remote::together
    agents: meta-reference
    telemetry: meta-reference
-image_type: conda
+image_type: docker
--- a/distributions/together/compose.yaml
+++ b/distributions/together/compose.yaml
@ -0,0 +1,18 @@
+services:
+  llamastack:
+    image: llamastack/distribution-together
+    network_mode: "host"
+    volumes:
+      - ~/.llama:/root/.llama
+      # Mount the Together run.yaml file into the container
+      - ./run.yaml:/root/llamastack-run-together.yaml
+    ports:
+      - "5000:5000"
+    # Start the Llama Stack server with the mounted run.yaml
+    entrypoint: bash -c "python -m llama_stack.distribution.server.server --yaml_config /root/llamastack-run-together.yaml"
+    deploy:
+      restart_policy:
+        condition: on-failure
+        delay: 3s
+        max_attempts: 5
+        window: 60s
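
Reviewer note (not part of the commit): the compose file above maps port 5000 and restarts the container on failure, so a client may need to wait until the server accepts connections after `docker compose up`. The following is a minimal sketch of such a readiness check using only the Python standard library. The function name `wait_for_port` and its default timeouts are hypothetical, and it only confirms that a TCP connection succeeds, not that the Llama Stack APIs are healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 30.0,
                  interval_s: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds.

    Returns False if no connection could be made before timeout_s elapsed.
    Hypothetical helper: not part of llama-stack.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError if the port is not accepting yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)
    return False
```

Assuming the server listens on the port published in `compose.yaml`, a caller could run `wait_for_port("127.0.0.1", 5000)` before issuing the first request.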
--- a/distributions/together/run.yaml
+++ b/distributions/together/run.yaml
@ -0,0 +1,42 @@
+version: '2'
+built_at: '2024-10-08T17:40:45.325529'
+image_name: local
+docker_image: null
+conda_env: local
+apis:
+- shields
+- agents
+- models
+- memory
+- memory_banks
+- inference
+- safety
+providers:
+  inference:
+  - provider_id: together0
+    provider_type: remote::together
+    config:
+      url: https://api.together.xyz/v1
+  safety:
+  - provider_id: together0
+    provider_type: remote::together
+    config:
+      url: https://api.together.xyz/v1
+  memory:
+  - provider_id: meta0
+    provider_type: remote::weaviate
+    config:
+      weaviate_api_key: <ENTER_WEAVIATE_API_KEY>
+      weaviate_cluster_url: <ENTER_WEAVIATE_CLUSTER_URL>
+  agents:
+  - provider_id: meta0
+    provider_type: meta-reference
+    config:
+      persistence_store:
+        namespace: null
+        type: sqlite
+        db_path: ~/.llama/runtime/kvstore.db
+  telemetry:
+  - provider_id: meta0
+    provider_type: meta-reference
+    config: {}
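
Reviewer note (not part of the commit): the `run.yaml` above ships with `<ENTER_WEAVIATE_API_KEY>` and `<ENTER_WEAVIATE_CLUSTER_URL>` placeholders, and the README asks the reader to confirm the Together URL. The sketch below shows one way such a pre-flight check might look. It operates on an already parsed mapping (for example, the result of loading the file with a YAML library), and the function name `find_config_problems` as well as the rule that a URL must use `http` or `https` are assumptions made for illustration, not llama-stack behavior.

```python
from urllib.parse import urlparse


def find_config_problems(run_config: dict) -> list[str]:
    """Collect human-readable problems from a parsed run.yaml mapping.

    Hypothetical helper: checks only the two things the README calls out.
    """
    problems = []
    providers = run_config.get("providers", {})

    # The Together inference provider needs a well-formed endpoint URL.
    for entry in providers.get("inference", []):
        if entry.get("provider_type") == "remote::together":
            url = entry.get("config", {}).get("url", "")
            parsed = urlparse(url)
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append(
                    f"inference/{entry.get('provider_id')}: invalid url {url!r}"
                )

    # The Weaviate memory provider must not keep the shipped placeholders.
    for entry in providers.get("memory", []):
        if entry.get("provider_type") == "remote::weaviate":
            for key in ("weaviate_api_key", "weaviate_cluster_url"):
                value = str(entry.get("config", {}).get(key, ""))
                if not value or value.startswith("<ENTER_"):
                    problems.append(
                        f"memory/{entry.get('provider_id')}: {key} is not set"
                    )
    return problems


# Mirrors the relevant parts of the run.yaml in this commit.
example = {
    "providers": {
        "inference": [{
            "provider_id": "together0",
            "provider_type": "remote::together",
            "config": {"url": "https://api.together.xyz/v1"},
        }],
        "memory": [{
            "provider_id": "meta0",
            "provider_type": "remote::weaviate",
            "config": {
                "weaviate_api_key": "<ENTER_WEAVIATE_API_KEY>",
                "weaviate_cluster_url": "<ENTER_WEAVIATE_CLUSTER_URL>",
            },
        }],
    }
}
for problem in find_config_problems(example):
    print(problem)
```

With the file as committed, the check reports both Weaviate fields as unset and accepts the Together URL.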