Composable building blocks to build Llama Apps https://llama-stack.readthedocs.io

This repo contains the API specifications for various parts of the Llama Stack. The Stack consists of toolchain APIs and agentic APIs.

The toolchain APIs that are covered:

  • inference / batch inference
  • post training
  • reward model scoring
  • synthetic data generation
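As a purely illustrative sketch (the request shape below is hypothetical, not taken from this repo's specs), a batch-inference API call might carry a model identifier and a list of prompts:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical request shape for a batch-inference toolchain API.
# Field names are illustrative only; consult the generated OpenAPI
# specs for the actual schema.
@dataclass
class BatchInferenceRequest:
    model: str
    prompts: List[str] = field(default_factory=list)
    max_tokens: int = 256

req = BatchInferenceRequest(model="llama-3-8b", prompts=["Hello", "World"])
print(len(req.prompts))  # 2
```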

Running FP8

You need the fbgemm-gpu package, which requires torch >= 2.4.0 (currently only available in nightly builds, but releasing shortly).

ENV=fp8_env
conda create -n $ENV python=3.10
conda activate $ENV

pip3 install -r fp8_requirements.txt
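Since the torch >= 2.4.0 requirement is currently only satisfied by nightly builds, a quick sanity check on the installed version can save a confusing failure later. This is just a sketch; nightly version strings carry suffixes like `.dev20240720+cu121` that a naive string comparison would mishandle:

```python
# Sketch: check that an installed torch version string satisfies the
# >= 2.4.0 requirement, tolerating nightly/dev suffixes.
def meets_min_version(version: str, minimum: str = "2.4.0") -> bool:
    def parts(v):
        nums = []
        # Drop local-version suffixes such as "+cu121", then keep only
        # the leading numeric components of each dotted segment.
        for p in v.split("+")[0].split("."):
            digits = "".join(ch for ch in p if ch.isdigit())
            if not digits:
                break
            nums.append(int(digits))
        return tuple(nums)
    return parts(version) >= parts(minimum)

print(meets_min_version("2.5.0.dev20240720+cu121"))  # True
print(meets_min_version("2.3.1"))                    # False
```

In practice you would pass `torch.__version__` to this check after activating the conda environment.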

Generate OpenAPI specs

Set up virtual environment

python3 -m venv ~/.venv/toolchain/ 
source ~/.venv/toolchain/bin/activate

with-proxy pip3 install -r requirements.txt 

Run the generate.sh script

cd source && sh generate.sh
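After the script runs, a lightweight sanity check can confirm the output is a well-formed OpenAPI document. The check below is a sketch; the actual output path and filename depend on generate.sh, so the example validates an in-memory dict rather than assuming a filename:

```python
import json

# Sketch: minimal structural check that a generated spec looks like an
# OpenAPI document (version marker, info block, and a paths object).
def looks_like_openapi(spec: dict) -> bool:
    return "openapi" in spec and "info" in spec and "paths" in spec

# Illustrative document; load your real spec with json.load() instead.
example = {"openapi": "3.1.0", "info": {"title": "Llama Stack"}, "paths": {}}
print(looks_like_openapi(example))  # True
```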