mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-07-29 07:14:20 +00:00
Composable building blocks to build Llama Apps
https://llama-stack.readthedocs.io
This repo contains the API specifications for various parts of the Llama Stack. The Stack consists of toolchain-apis and agentic-apis.
The toolchain APIs covered are:
- inference / batch inference
- post training
- reward model scoring
- synthetic data generation
Running FP8
You need the fbgemm-gpu package, which requires torch >= 2.4.0 (currently only in nightly, but releasing shortly...).
ENV=fp8_env
conda create -n $ENV python=3.10
conda activate $ENV
pip3 install -r fp8_requirements.txt
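After installing, it can help to confirm that the environment actually meets the requirement stated above. The following is a minimal sketch, not part of this repo: the function names are hypothetical, the distribution names `torch` and `fbgemm-gpu` are assumed to match what `fp8_requirements.txt` installs, and nightly builds such as `2.4.0.dev...` are treated as satisfying `>= 2.4.0` by comparing only the numeric release segment.

```python
import re
from importlib import metadata
from typing import List, Optional, Tuple

# Minimum torch release taken from the requirement stated in this README.
MIN_TORCH = (2, 4, 0)


def release_tuple(version: str) -> Tuple[int, int, int]:
    """Return the leading numeric release of a version string as (major, minor, patch).

    Suffixes such as '.dev20240601' or '+cu121' are ignored, so a nightly
    build of 2.4.0 compares equal to 2.4.0 (an assumption of this sketch).
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts = (parts + [0, 0, 0])[:3]  # pad short versions like '2.4' to three fields
    return (parts[0], parts[1], parts[2])


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_fp8_env() -> List[str]:
    """Collect human-readable problems with the current environment."""
    problems = []
    torch_version = installed_version("torch")
    if torch_version is None:
        problems.append("torch is not installed")
    elif release_tuple(torch_version) < MIN_TORCH:
        problems.append(f"torch {torch_version} is older than 2.4.0")
    # 'fbgemm-gpu' is the assumed distribution name; a nightly wheel may differ.
    if installed_version("fbgemm-gpu") is None:
        problems.append("fbgemm-gpu is not installed")
    return problems


if __name__ == "__main__":
    issues = check_fp8_env()
    print("FP8 environment looks OK" if not issues else "\n".join(issues))
```

Run it inside the activated conda environment. An empty problem list only means the packages are present at a suitable version, not that FP8 inference itself works on the local GPU.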
Generate OpenAPI specs
Set up virtual environment
python3 -m venv ~/.venv/toolchain/
source ~/.venv/toolchain/bin/activate
with-proxy pip3 install -r requirements.txt
Run the generate.sh script
cd source && sh generate.sh
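The two steps above can also be scripted. This is a rough sketch under stated assumptions, not a script that ships with the repo: the helper names are hypothetical, it is meant to be run from the repository root, it omits the `with-proxy` wrapper shown above, and it emulates `source .../activate` by putting the environment's executable directory first on `PATH`.

```python
import os
import subprocess
import venv
from pathlib import Path
from typing import Dict, List, Optional, Tuple

# Same location as the manual steps above.
ENV_DIR = Path.home() / ".venv" / "toolchain"


def venv_bin_dir(env_dir: Path) -> Path:
    """Directory holding the environment's executables (differs on Windows)."""
    return env_dir / ("Scripts" if os.name == "nt" else "bin")


def build_commands(env_dir: Path, repo_root: Path) -> List[Tuple[List[str], Path]]:
    """Return (argv, working directory) pairs mirroring the manual steps."""
    python = venv_bin_dir(env_dir) / ("python.exe" if os.name == "nt" else "python")
    return [
        ([str(python), "-m", "pip", "install", "-r", "requirements.txt"], repo_root),
        (["sh", "generate.sh"], repo_root / "source"),
    ]


def activated_env(env_dir: Path) -> Dict[str, str]:
    """Copy of the current environment with the venv first on PATH."""
    env = dict(os.environ)
    env["VIRTUAL_ENV"] = str(env_dir)
    env["PATH"] = str(venv_bin_dir(env_dir)) + os.pathsep + env.get("PATH", "")
    return env


def main(repo_root: Optional[Path] = None) -> None:
    repo_root = repo_root or Path.cwd()
    venv.create(ENV_DIR, with_pip=True)  # equivalent of: python3 -m venv ~/.venv/toolchain/
    env = activated_env(ENV_DIR)
    for argv, cwd in build_commands(ENV_DIR, repo_root):
        subprocess.run(argv, cwd=cwd, env=env, check=True)  # stop on the first failure


if __name__ == "__main__":
    main()
```

Passing `check=True` makes a failed install abort the run before `generate.sh` is attempted, which matches the intent of chaining the shell commands with `&&`.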