Commit graph

6 commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Ashwin Bharambe | fef679bb34 | Don't load as bf16 on CPU unless fp8 is active | 2024-07-22 19:09:55 -07:00 |
| Ashwin Bharambe | 9b51b4edd8 | update batch completion endpoint | 2024-07-22 16:08:28 -07:00 |
| Ashwin Bharambe | acb2a91872 | Remove configurations | 2024-07-22 16:03:37 -07:00 |
| Ashwin Bharambe | bbfd8a587e | add EventLogger for inference | 2024-07-22 15:11:34 -07:00 |
| Ashwin Bharambe | 2e7978fa39 | update import for quantization format from models | 2024-07-22 00:04:03 -07:00 |
| Hardik Shah | f9111652ef | rename toolchain/ --> llama_toolchain/ | 2024-07-21 23:48:38 -07:00 |