Martin Hickey 73e33fb747 Fix conda env names in distribution example run template
The Self-Hosted Distribution documentation contains steps
to start the llama server via a conda environment. If the user
generates the conda environment using the llama build command and a template,
it generates an environment named after the distribution rather than
the default name of local used in the example run yaml file. The server
will therefore fail when the user tries to run it.

This PR fixes the conda env name in the run yaml file.

Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
2024-11-15 15:32:52 +00:00
build.yaml fix broken --list-templates with adding build.yaml files for packaging (#327) 2024-10-25 12:51:22 -07:00
compose.yaml update distributions/readmes 2024-10-28 15:10:40 -07:00
README.md [docs] update documentations (#356) 2024-11-04 16:52:38 -08:00
run.yaml Fix conda env names in distribution example run template 2024-11-15 15:32:52 +00:00

Together Distribution

Connect to a Llama Stack Together Endpoint

  • You may connect to a hosted endpoint https://llama-stack.together.ai, serving a Llama Stack distribution
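If you want to try the hosted endpoint before running anything locally, here is a minimal sketch, assuming the llama-stack-client CLI is installed and that its configure command accepts an --endpoint flag (as in client versions current at the time of writing):

$ pip install llama-stack-client
$ llama-stack-client configure --endpoint https://llama-stack.together.ai
$ llama-stack-client models list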

The llamastack/distribution-together distribution consists of the following provider configurations.

+-----------+----------------------------------+
| API       | Provider(s)                      |
+===========+==================================+
| Inference | remote::together                 |
+-----------+----------------------------------+
| Agents    | meta-reference                   |
+-----------+----------------------------------+
| Memory    | meta-reference, remote::weaviate |
+-----------+----------------------------------+
| Safety    | meta-reference                   |
+-----------+----------------------------------+
| Telemetry | meta-reference                   |
+-----------+----------------------------------+

Docker: Start the Distribution (Single Node CPU)

Note

This assumes you have an API key for the hosted Together endpoint.

$ cd distributions/together
$ ls
compose.yaml  run.yaml
$ docker compose up
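To confirm the server came up, you can tail the compose logs; a sketch, assuming the compose file maps the Llama Stack default port 5000 to localhost:

$ docker compose logs -f
# once the server reports it is listening, point a client at it:
$ llama-stack-client configure --endpoint http://localhost:5000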

Make sure the inference provider in your run.yaml file points to the correct Together server URL endpoint. E.g.

inference:
  - provider_id: together
    provider_type: remote::together
    config:
      url: https://api.together.xyz/v1
      api_key: <optional api key>
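A quick way to sanity-check the URL and API key before starting the stack is to query Together's OpenAI-compatible models route directly (this hits the Together API, not the Llama Stack server; substitute your own key):

$ export TOGETHER_API_KEY=<your api key>
$ curl -H "Authorization: Bearer $TOGETHER_API_KEY" \
    https://api.together.xyz/v1/models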

Conda llama stack run (Single Node CPU)

llama stack build --template together --image-type conda
# -- modify run.yaml to point at a valid Together server endpoint
llama stack run ./run.yaml
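If the default port is already in use, llama stack run also accepts a --port flag; a sketch with an arbitrary alternative port:

llama stack run ./run.yaml --port 5001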

(Optional) Update Model Serving Configuration

Use llama-stack-client models list to check the available models served by Together.

$ llama-stack-client models list
+------------------------------+------------------------------+---------------+------------+
| identifier                   | llama_model                  | provider_id   | metadata   |
+==============================+==============================+===============+============+
| Llama3.1-8B-Instruct         | Llama3.1-8B-Instruct         | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.1-70B-Instruct        | Llama3.1-70B-Instruct        | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.1-405B-Instruct       | Llama3.1-405B-Instruct       | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-3B-Instruct         | Llama3.2-3B-Instruct         | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
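
With the server running, an end-to-end check is to request a completion from one of the models above. A minimal sketch, assuming the inference chat_completion subcommand of llama-stack-client (flag names may differ between client versions):

$ llama-stack-client inference chat_completion --message "hello, what model are you?"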