| Author | Commit | Message | Date |
| --- | --- | --- | --- |
| Hardik Shah | c9f33d8f68 | cli updates | 2024-07-21 01:51:54 -07:00 |
| Hardik Shah | 23fe353e4a | cli -- llama inference configure | 2024-07-21 01:17:15 -07:00 |
| Ashwin Bharambe | 0df57c4447 | fix bad merge with injection shield? | 2024-07-20 23:54:44 -07:00 |
| Ashwin Bharambe | d73fed5cc3 | cleanup for fp8 and requirements etc | 2024-07-20 23:21:55 -07:00 |
| Hardik Shah | 2428701951 | download inside model_name directory | 2024-07-20 23:16:19 -07:00 |
| Ashwin Bharambe | 0746a0f62b | fp8 inference | 2024-07-20 23:13:47 -07:00 |
| Ashwin Bharambe | ad62e2e1f3 | make inference server load checkpoints for fp8 inference; introduce quantization related args for inference config; also kill GeneratorArgs | 2024-07-20 22:54:48 -07:00 |
| Ashwin Bharambe | 7d2c0b14b8 | Changes from the main repo | 2024-07-20 22:52:29 -07:00 |
| Hardik Shah | 9c9b834c0f | update prompt-shield to reflect latest changes in agentic | 2024-07-19 18:12:09 -07:00 |
| Hardik Shah | 2ed2881a21 | fixed imports models.llama3. --> models.llama3_1.api. | 2024-07-19 17:42:14 -07:00 |
| Ashwin Bharambe | 95781ec85d | Add toolchain from agentic system here | 2024-07-19 12:30:35 -07:00 |