Commit graph

3 commits

Author | SHA1 | Message | Date
Ashwin Bharambe | d73fed5cc3 | cleanup for fp8 and requirements etc | 2024-07-20 23:21:55 -07:00
Ashwin Bharambe | 0746a0f62b | fp8 inference | 2024-07-20 23:13:47 -07:00
Ashwin Bharambe | ad62e2e1f3 | make inference server load checkpoints for fp8 inference | 2024-07-20 22:54:48 -07:00
    - introduce quantization related args for inference config
    - also kill GeneratorArgs