Martin Hickey 73e33fb747 Fix conda env names in distribution example run template
The Self-Hosted Distribution documentation contains steps
to start the llama server via a conda environment. If the user
generates the conda environment using the llama build command and a template,
it generates an environment named after the distribution rather than
the default name of local used in the example run yaml file. The server
will therefore fail when the user tries to run it.

This PR fixes the conda env name in the run yaml file.

Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com>
2024-11-15 15:32:52 +00:00
build.yaml fix broken --list-templates with adding build.yaml files for packaging (#327) 2024-10-25 12:51:22 -07:00
compose.yaml update distributions/readmes 2024-10-28 15:10:40 -07:00
README.md [docs] update documentations (#356) 2024-11-04 16:52:38 -08:00
run.yaml Fix conda env names in distribution example run template 2024-11-15 15:32:52 +00:00

Together Distribution

Connect to a Llama Stack Together Endpoint

  • You may connect to a hosted endpoint https://llama-stack.together.ai, serving a Llama Stack distribution
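If you want to try the hosted endpoint before running anything locally, here is a minimal sketch, assuming the llama-stack-client CLI is installed and that its configure command accepts an --endpoint flag (as in client versions current at the time of writing):

$ pip install llama-stack-client
$ llama-stack-client configure --endpoint https://llama-stack.together.ai
$ llama-stack-client models list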

The llamastack/distribution-together distribution consists of the following provider configurations.

+-----------+----------------------------------+
| API       | Provider(s)                      |
+===========+==================================+
| Inference | remote::together                 |
+-----------+----------------------------------+
| Agents    | meta-reference                   |
+-----------+----------------------------------+
| Memory    | meta-reference, remote::weaviate |
+-----------+----------------------------------+
| Safety    | meta-reference                   |
+-----------+----------------------------------+
| Telemetry | meta-reference                   |
+-----------+----------------------------------+

Docker: Start the Distribution (Single Node CPU)

Note

This assumes you have an API key for the hosted Together endpoint.

$ cd distributions/together
$ ls
compose.yaml  run.yaml
$ docker compose up
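To confirm the server came up, you can tail the compose logs; a sketch, assuming the compose file maps the Llama Stack default port 5000 to localhost:

$ docker compose logs -f
# once the server reports it is listening, point a client at it:
$ llama-stack-client configure --endpoint http://localhost:5000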

Make sure the inference provider in your run.yaml file points to the correct Together server URL endpoint. E.g.

inference:
  - provider_id: together
    provider_type: remote::together
    config:
      url: https://api.together.xyz/v1
      api_key: <optional api key>
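A quick way to sanity-check the URL and API key before starting the stack is to query Together's OpenAI-compatible models route directly (this hits the Together API, not the Llama Stack server; substitute your own key):

$ export TOGETHER_API_KEY=<your api key>
$ curl -H "Authorization: Bearer $TOGETHER_API_KEY" \
    https://api.together.xyz/v1/models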

Conda llama stack run (Single Node CPU)

llama stack build --template together --image-type conda
# -- modify run.yaml to point at a valid Together server endpoint
llama stack run ./run.yaml
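If the default port is already in use, llama stack run also accepts a --port flag; a sketch with an arbitrary alternative port:

llama stack run ./run.yaml --port 5001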

(Optional) Update Model Serving Configuration

Use llama-stack-client models list to check the available models served by Together.

$ llama-stack-client models list
+------------------------------+------------------------------+---------------+------------+
| identifier                   | llama_model                  | provider_id   | metadata   |
+==============================+==============================+===============+============+
| Llama3.1-8B-Instruct         | Llama3.1-8B-Instruct         | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.1-70B-Instruct        | Llama3.1-70B-Instruct        | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.1-405B-Instruct       | Llama3.1-405B-Instruct       | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-3B-Instruct         | Llama3.2-3B-Instruct         | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-11B-Vision-Instruct | Llama3.2-11B-Vision-Instruct | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
| Llama3.2-90B-Vision-Instruct | Llama3.2-90B-Vision-Instruct | together0     | {}         |
+------------------------------+------------------------------+---------------+------------+
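
With the server running, an end-to-end check is to request a completion from one of the models above. A minimal sketch, assuming the inference chat_completion subcommand of llama-stack-client (flag names may differ between client versions):

$ llama-stack-client inference chat_completion --message "hello, what model are you?"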