mirror of https://github.com/meta-llama/llama-stack.git synced 2025-06-27 18:50:41 +00:00

Dalton Flanagan ec433448f2

* Add CLI reference doc

* touchups

* add helptext for download

2024-07-25 13:56:29 -07:00

llama-toolchain

This repo contains the API specifications for various components of the Llama Stack, as well as implementations for some of those APIs, such as model inference.

The Llama Stack consists of toolchain-apis and agentic-apis. This repo contains the toolchain-apis.

Installation

You can install this repository as a package with pip:

pip install llama-toolchain

If you want to install from source:

mkdir -p ~/local
cd ~/local
git clone git@github.com:meta-llama/llama-toolchain.git

conda create -n toolchain python=3.10
conda activate toolchain

cd llama-toolchain
pip install -e .

The Llama CLI

The llama CLI makes it easy to configure and run the Llama toolchain. Read the CLI reference for details.

Appendix: Running FP8

To run FP8, you need the fbgemm-gpu package, which requires torch >= 2.4.0 (currently available only in nightly builds, with a stable release expected shortly).

ENV=fp8_env
conda create -n $ENV python=3.10
conda activate $ENV

pip3 install -r fp8_requirements.txt
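
After installing the requirements, it can help to confirm that the environment actually picked up a new enough torch. The following is a minimal sketch, not part of this repo: the helper names (release_tuple, torch_meets_minimum, installed_torch_version) are hypothetical, and the check compares only the numeric release segment, so it assumes a nightly such as 2.4.0.dev... counts as meeting the 2.4.0 floor.

```python
import re
from importlib import metadata

# Minimum torch release taken from the requirement stated above.
MIN_TORCH = (2, 4, 0)


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release segment of a version string.

    Suffixes such as '.dev20240701' or '+cu121' are ignored (an assumption
    of this sketch; it is not a full PEP 440 comparison).
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0, so '2.4' becomes (2, 4, 0).
    return tuple(int(part) if part else 0 for part in match.groups())


def torch_meets_minimum(version: str, minimum: tuple = MIN_TORCH) -> bool:
    """True if the numeric release segment is at least the minimum."""
    return release_tuple(version) >= minimum


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_torch_version()
    if found is None:
        print("torch is not installed in this environment")
    elif torch_meets_minimum(found):
        print(f"torch {found} satisfies >= 2.4.0")
    else:
        print(f"torch {found} is older than 2.4.0")
```

Run it inside the activated FP8 environment. Reading the version through importlib.metadata avoids importing torch itself, so the check stays fast and also works when torch is missing.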
