llama-toolchain
This repo contains the API specifications for various components of the Llama Stack, as well as implementations for some of those APIs, such as model inference.
The Llama Stack consists of toolchain-apis and agentic-apis. This repo contains the toolchain-apis.
Installation
You can install this repository as a package with `pip install llama-toolchain`.
If you want to install from source:
```bash
mkdir -p ~/local
cd ~/local
git clone git@github.com:meta-llama/llama-toolchain.git

conda create -n toolchain python=3.10
conda activate toolchain

cd llama-toolchain
pip install -e .
```
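After either install path, a quick sanity check is to import the package; the module name `llama_toolchain` matches the package directory in this repo. This is a minimal sketch, not an official verification step:

```bash
# Import the package to confirm the install; prints nothing on success.
python -c "import llama_toolchain"
```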
The Llama CLI
The `llama` CLI makes it easy to configure and run the Llama toolchain. Read the CLI reference for details.
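Once the package is installed, the `llama` entry point should be on your PATH. A hedged first step, assuming the CLI follows the usual convention of a top-level help flag, is:

```bash
# List available subcommands and options; see the CLI reference for full docs.
llama --help
```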
Appendix: Running FP8
If you want to run FP8, you need the `fbgemm-gpu` package, which requires `torch >= 2.4.0` (currently only available in nightly builds, but releasing shortly).
```bash
ENV=fp8_env
conda create -n $ENV python=3.10
conda activate $ENV

pip3 install -r fp8_requirements.txt
```
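Since the FP8 path depends on a sufficiently new torch, it is worth confirming the installed version after the requirements install; torch exposes `torch.__version__`, and the 2.4.0 floor comes from the note above:

```bash
# Print the installed torch version; it should be >= 2.4.0 for fbgemm-gpu.
python -c "import torch; print(torch.__version__)"
```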