llama-stack-mirror

Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-12-03 18:00:36 +00:00

Author	SHA1	Message	Date
Hardik Shah	c64b8cba22	from models.llama3_1 --> from llama_models.llama3_1	2024-07-21 19:07:02 -07:00
rsm	67f0510edd	rename ModelInference to Inference	2024-07-21 12:20:32 -07:00
Hardik Shah	c9f33d8f68	cli updates	2024-07-21 01:51:54 -07:00
Ashwin Bharambe	d73fed5cc3	cleanup for fp8 and requirements etc	2024-07-20 23:21:55 -07:00
Ashwin Bharambe	0746a0f62b	fp8 inference	2024-07-20 23:13:47 -07:00
Ashwin Bharambe	ad62e2e1f3	make inference server load checkpoints for fp8 inference - introduce quantization related args for inference config - also kill GeneratorArgs	2024-07-20 22:54:48 -07:00
Ashwin Bharambe	7d2c0b14b8	Changes from the main repo	2024-07-20 22:52:29 -07:00
Hardik Shah	2ed2881a21	fixed imports models.llama3. --> models.llama3_1.api.	2024-07-19 17:42:14 -07:00
Ashwin Bharambe	95781ec85d	Add toolchain from agentic system here	2024-07-19 12:30:35 -07:00