| Author | Commit | Message | Date |
|---|---|---|---|
| Ashwin Bharambe | d73fed5cc3 | cleanup for fp8 and requirements etc | 2024-07-20 23:21:55 -07:00 |
| Hardik Shah | 2428701951 | download inside model_name directory | 2024-07-20 23:16:19 -07:00 |
| Ashwin Bharambe | 0746a0f62b | fp8 inference | 2024-07-20 23:13:47 -07:00 |
| Ashwin Bharambe | ad62e2e1f3 | make inference server load checkpoints for fp8 inference - introduce quantization related args for inference config - also kill GeneratorArgs | 2024-07-20 22:54:48 -07:00 |
| Ashwin Bharambe | 7d2c0b14b8 | Changes from the main repo | 2024-07-20 22:52:29 -07:00 |
| Hardik Shah | 9c9b834c0f | update prompt-shield to reflect latest changes in agentic | 2024-07-19 18:12:09 -07:00 |
| Hardik Shah | 2ed2881a21 | fixed imports models.llama3. --> models.llama3_1.api. | 2024-07-19 17:42:14 -07:00 |
| Ashwin Bharambe | 95781ec85d | Add toolchain from agentic system here | 2024-07-19 12:30:35 -07:00 |