Commit graph

21 commits

Author SHA1 Message Date
Ashwin Bharambe
0e2fc9966a Reduce loading time for non-fp8 2024-07-22 19:21:04 -07:00
Ashwin Bharambe
fef679bb34 Don't load as bf16 on CPU unless fp8 is active 2024-07-22 19:09:55 -07:00
Kate Plawiak
5228bdc0f3 Revert "Update llama guard file to latest version" 2024-07-22 17:27:19 -07:00
Kate Plawiak
dfe0173b58 Merge pull request #1 from meta-llama/fix_llama_guard (Update llama guard file to latest version) 2024-07-22 16:27:47 -07:00
Ashwin Bharambe
9b51b4edd8 update batch completion endpoint 2024-07-22 16:08:28 -07:00
Ashwin Bharambe
acb2a91872 Remove configurations 2024-07-22 16:03:37 -07:00
Ashwin Bharambe
bbfd8a587e add EventLogger for inference 2024-07-22 15:11:34 -07:00
Hardik Shah
7574ffb25f added __init__ 2024-07-22 14:49:26 -07:00
Hardik Shah
441e5da6ed no special casign for original 2024-07-22 14:42:38 -07:00
Hardik Shah
4d3b226275 check original folder 2024-07-22 14:35:09 -07:00
Kate Plawiak
91b43600f7 increase max_new_tokens 2024-07-22 13:58:51 -07:00
Kate Plawiak
cb5829901f redo and fix only specific lines 2024-07-22 13:46:43 -07:00
Kate Plawiak
d5019cf3b3 update llama guard file to latest version 2024-07-22 13:36:11 -07:00
Hardik Shah
74442e88b1 add yaml to manifest 2024-07-22 13:34:08 -07:00
Hardik Shah
6f0d348b1c add init for common 2024-07-22 11:51:10 -07:00
Ashwin Bharambe
c38d638340 sku -> family 2024-07-22 11:15:04 -07:00
Ashwin Bharambe
f0e0903270 add llama model subcommand 2024-07-22 11:07:11 -07:00
Hardik Shah
4417407652 agentic_system --> llama_agentic_system 2024-07-22 01:20:32 -07:00
Hardik Shah
1eac470045 add __init__ 2024-07-22 01:17:54 -07:00
Ashwin Bharambe
2e7978fa39 update import for quantization format from models 2024-07-22 00:04:03 -07:00
Hardik Shah
f9111652ef rename toolchain/ --> llama_toolchain/ 2024-07-21 23:48:38 -07:00