fireworks distro

Xi Yan 2024-10-21 18:11:52 -07:00
parent 69d7011ed6
commit c52ba17d7b
5 changed files with 107 additions and 5 deletions

@@ -0,0 +1,55 @@
# Fireworks Distribution
The `llamastack/distribution-fireworks` distribution consists of the following provider configurations.
| **API** | **Inference** | **Agents** | **Memory** | **Safety** | **Telemetry** |
|----------------- |--------------- |---------------- |-------------------------------------------------- |---------------- |---------------- |
| **Provider(s)** | remote::fireworks | meta-reference | meta-reference | meta-reference | meta-reference |
### Start the Distribution (Single Node CPU)
> [!NOTE]
> This assumes you have a hosted endpoint at Fireworks with an API key.
```
$ cd llama-stack/distribution/fireworks
$ ls
compose.yaml run.yaml
$ docker compose up
```
Make sure that in your `run.yaml` file, the inference provider points to the correct Fireworks server URL endpoint. E.g.
```
inference:
  - provider_id: fireworks
    provider_type: remote::fireworks
    config:
      url: https://api.fireworks.ai/inference
      api_key: <optional api key>
```
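With the endpoint configured, you can sanity-check the running stack from the host. The following is a minimal sketch rather than part of this distribution's docs: it assumes the server listens on the default port 5000 and that this llama-stack version exposes a `/inference/chat_completion` route; the model identifier is illustrative and should be one your Fireworks endpoint actually serves.
```bash
# Sketch only: port, route, and model name are assumptions; adjust to your setup.
curl http://localhost:5000/inference/chat_completion \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Llama3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Write a two sentence poem about llamas."}]
  }'
```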
### (Alternative) llama stack run (Single Node GPU)
```
docker run --network host -it -p 5000:5000 -v ./run.yaml:/root/my-run.yaml --gpus=all llamastack/distribution-fireworks --yaml_config /root/my-run.yaml
```
Make sure that in your `run.yaml` file, the inference provider points to the correct Fireworks server URL endpoint. E.g.
```
inference:
  - provider_id: fireworks
    provider_type: remote::fireworks
    config:
      url: https://api.fireworks.ai/inference
      api_key: <optional api key>
```
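If you prefer to keep the server running in the background, the same command can be started detached with standard Docker flags. This is a sketch, not part of the original instructions; the container name is illustrative, and the remaining flags simply mirror the command above.
```bash
# Sketch: same flags as above plus standard -d/--name; the container name is illustrative.
docker run -d --name llamastack-fireworks --network host -p 5000:5000 \
  -v ./run.yaml:/root/my-run.yaml --gpus=all \
  llamastack/distribution-fireworks --yaml_config /root/my-run.yaml

# Follow the logs until the server reports it is listening.
docker logs -f llamastack-fireworks
```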
**Via Conda**
```bash
llama stack build --config ./build.yaml
# -- modify run.yaml to a valid Fireworks server endpoint
llama stack run ./run.yaml
```
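Once `llama stack run` reports that it is serving the APIs, the same curl check from the Docker Compose section above applies. If port 5000 is already in use, a different port can be requested; this assumes the `--port` flag is available in this version of the llama CLI.
```bash
# Assumption: --port is supported by this llama stack CLI version.
llama stack run ./run.yaml --port 5001
```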

@@ -7,4 +7,4 @@ distribution_spec:
safety: meta-reference
agents: meta-reference
telemetry: meta-reference
-image_type: conda
+image_type: docker

@@ -0,0 +1,46 @@
version: '2'
built_at: '2024-10-08T17:40:45.325529'
image_name: local
docker_image: null
conda_env: local
apis:
- shields
- agents
- models
- memory
- memory_banks
- inference
- safety
providers:
  inference:
  - provider_id: fireworks0
    provider_type: remote::fireworks
    config:
      url: https://api.fireworks.ai/inference
  safety:
  - provider_id: meta0
    provider_type: meta-reference
    config:
      llama_guard_shield:
        model: Llama-Guard-3-1B
        excluded_categories: []
        disable_input_check: false
        disable_output_check: false
      prompt_guard_shield:
        model: Prompt-Guard-86M
  memory:
  - provider_id: meta0
    provider_type: meta-reference
    config: {}
  agents:
  - provider_id: meta0
    provider_type: meta-reference
    config:
      persistence_store:
        namespace: null
        type: sqlite
        db_path: ~/.llama/runtime/kvstore.db
  telemetry:
  - provider_id: meta0
    provider_type: meta-reference
    config: {}
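Note that the `fireworks0` provider entry above does not set an `api_key`. Going by the README example earlier in this commit, a key can be supplied inline under `config` if your Fireworks endpoint requires one; a sketch:
```
inference:
  - provider_id: fireworks0
    provider_type: remote::fireworks
    config:
      url: https://api.fireworks.ai/inference
      api_key: <optional api key>  # assumption: inline key, as in the README example
```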

@@ -23,13 +23,14 @@ compose.yaml run.yaml
$ docker compose up
```
-Replace <ENTER_YOUR_TGI_HOSTED_ENDPOINT> in `run.yaml` file with your TGI endpoint.
+Make sure that in your `run.yaml` file, the inference provider points to the correct Together server URL endpoint. E.g.
```
inference:
-  - provider_id: tgi0
-    provider_type: remote::tgi
+  - provider_id: together
+    provider_type: remote::together
    config:
-      url: <ENTER_YOUR_TGI_HOSTED_ENDPOINT>
+      url: https://api.together.xyz/v1
+      api_key: <optional api key>
```
### (Alternative) TGI server + llama stack run (Single Node GPU)