distro readmes with model serving instructions (#339)

* readme updates

* quantized compose

* dell tgi

* config update

* readme

* update model serving readmes

* update

* update

* config
Xi Yan 2024-10-28 17:47:14 -07:00 committed by GitHub
parent a70a4706fc
commit ae671eaf7a
8 changed files with 136 additions and 4 deletions


@@ -43,7 +43,7 @@ inference:
provider_type: remote::fireworks
config:
url: https://api.fireworks.ai/inference
-api_key: <optional api key>
+api_key: <enter your api key>
```
**Via Conda**
@@ -53,3 +53,27 @@ llama stack build --template fireworks --image-type conda
# -- modify run.yaml to a valid Fireworks server endpoint
llama stack run ./run.yaml
```
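As noted in the comment above, "modify run.yaml" means pointing the inference provider at Fireworks and filling in your key. A minimal sketch of the relevant section, mirroring the docker snippet earlier in this README (the `fireworks0` provider id and exact indentation may differ in your generated file):
```
inference:
  - provider_id: fireworks0
    provider_type: remote::fireworks
    config:
      url: https://api.fireworks.ai/inference
      api_key: <enter your api key>
```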
### Model Serving
Use `llama-stack-client models list` to check the available models served by Fireworks.
```
$ llama-stack-client models list
+------------------------------+------------------------------+---------------+------------+
| identifier | llama_model | provider_id | metadata |
+==============================+==============================+===============+============+
| Llama3.1-8B-Instruct | Llama3.1-8B-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
| Llama3.1-70B-Instruct | Llama3.1-70B-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
| Llama3.1-405B-Instruct | Llama3.1-405B-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-1B-Instruct | Llama3.2-1B-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-3B-Instruct | Llama3.2-3B-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | fireworks0 | {} |
+------------------------------+------------------------------+---------------+------------+
```
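Any of the identifiers listed above can then be used for inference through the client SDK. Below is a minimal sketch, assuming the `llama-stack-client` Python package is installed and the stack is serving on `localhost:5000` (adjust `base_url` to wherever `llama stack run` reports it is listening); parameter and field names follow the client SDK of this era and may differ in later releases:
```
from llama_stack_client import LlamaStackClient

# Point the client at the running stack (port assumed here; use the one
# printed by `llama stack run`).
client = LlamaStackClient(base_url="http://localhost:5000")

# Use any identifier reported by `llama-stack-client models list`.
response = client.inference.chat_completion(
    model="Llama3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Write a two-sentence poem about llamas."}],
)
print(response.completion_message.content)
```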


@@ -17,6 +17,7 @@ providers:
provider_type: remote::fireworks
config:
url: https://api.fireworks.ai/inference
# api_key: <ENTER_YOUR_API_KEY>
safety:
- provider_id: meta0
provider_type: meta-reference
@@ -32,6 +33,10 @@ providers:
- provider_id: meta0
provider_type: meta-reference
config: {}
# Uncomment to use weaviate memory provider
# - provider_id: weaviate0
# provider_type: remote::weaviate
# config: {}
agents:
- provider_id: meta0
provider_type: meta-reference