Commit graph

103 commits

Author SHA1 Message Date
Hardik Shah
aca6bfe0df drop custom classes to manage hydra 2024-07-22 20:41:03 -07:00
Ashwin Bharambe
86fff23a9e updating license for toolchain 2024-07-22 20:31:42 -07:00
Ashwin Bharambe
0e2fc9966a Reduce loading time for non-fp8 2024-07-22 19:21:04 -07:00
Ashwin Bharambe
fef679bb34 Don't load as bf16 on CPU unless fp8 is active 2024-07-22 19:09:55 -07:00
Kate Plawiak
8cd2e4164c Merge pull request #2 from meta-llama/revert-1-fix_llama_guard
Revert "Update llama guard file to latest version"
2024-07-22 17:28:13 -07:00
Kate Plawiak
5228bdc0f3 Revert "Update llama guard file to latest version" 2024-07-22 17:27:19 -07:00
Kate Plawiak
dfe0173b58 Merge pull request #1 from meta-llama/fix_llama_guard
Update llama guard file to latest version
2024-07-22 16:27:47 -07:00
Ashwin Bharambe
9b51b4edd8 update batch completion endpoint 2024-07-22 16:08:28 -07:00
Ashwin Bharambe
1e573843ce added pre-commit to toolchain 2024-07-22 16:04:31 -07:00
Ashwin Bharambe
acb2a91872 Remove configurations 2024-07-22 16:03:37 -07:00
Ashwin Bharambe
bbfd8a587e add EventLogger for inference 2024-07-22 15:11:34 -07:00
Hardik Shah
7574ffb25f added __init__ 2024-07-22 14:49:26 -07:00
Hardik Shah
441e5da6ed no special casign for original 2024-07-22 14:42:38 -07:00
Hardik Shah
4d3b226275 check original folder 2024-07-22 14:35:09 -07:00
Kate Plawiak
91b43600f7 increase max_new_tokens 2024-07-22 13:58:51 -07:00
Kate Plawiak
cb5829901f redo and fix only specific lines 2024-07-22 13:46:43 -07:00
Kate Plawiak
d5019cf3b3 update llama guard file to latest version 2024-07-22 13:36:11 -07:00
Hardik Shah
74442e88b1 add yaml to manifest 2024-07-22 13:34:08 -07:00
Hardik Shah
6f0d348b1c add init for common 2024-07-22 11:51:10 -07:00
Ashwin Bharambe
54a22e288a requirements 2024-07-22 11:39:42 -07:00
Ashwin Bharambe
c38d638340 sku -> family 2024-07-22 11:15:04 -07:00
Ashwin Bharambe
f0e0903270 add llama model subcommand 2024-07-22 11:07:11 -07:00
Hardik Shah
4417407652 agentic_system --> llama_agentic_system 2024-07-22 01:20:32 -07:00
Hardik Shah
1eac470045 add __init__ 2024-07-22 01:17:54 -07:00
Ashwin Bharambe
2e7978fa39 update import for quantization format from models 2024-07-22 00:04:03 -07:00
Hardik Shah
f9111652ef rename toolchain/ --> llama_toolchain/ 2024-07-21 23:48:38 -07:00
Hardik Shah
d95f5f863d use default_config file to configure inference 2024-07-21 19:26:11 -07:00
Hardik Shah
c64b8cba22 from models.llama3_1 --> from llama_models.llama3_1 2024-07-21 19:07:02 -07:00
Hardik Shah
c6ef16f6bd consol_scripts for toolchain 2024-07-21 17:39:47 -07:00
rsm
7c69675b79 added pypi package 2024-07-21 13:43:36 -07:00
Hardik Shah
b0f3406a08 deleting bash script as this is not done via cli 2024-07-21 12:55:49 -07:00
Hardik Shah
6bcd826b32 enable import of subcommands from llama-agentic-system 2024-07-21 12:54:48 -07:00
rsm
67f0510edd rename ModelInference to Inference 2024-07-21 12:20:32 -07:00
Ashwin Bharambe
245461620d make sure scripts always have pipefail 2024-07-21 12:18:49 -07:00
Hardik Shah
c9f33d8f68 cli updates 2024-07-21 01:51:54 -07:00
Hardik Shah
23fe353e4a cli -- llama inference configure 2024-07-21 01:17:15 -07:00
Ashwin Bharambe
0df57c4447 fix bad merge with injection shield? 2024-07-20 23:54:44 -07:00
Hardik Shah
2408bd81c8 easy script to create config 2024-07-20 23:51:46 -07:00
Ashwin Bharambe
7c9ed3e58e update README a bit 2024-07-20 23:26:50 -07:00
Ashwin Bharambe
d73fed5cc3 cleanup for fp8 and requirements etc 2024-07-20 23:21:55 -07:00
Hardik Shah
2428701951 download inside model_name directory 2024-07-20 23:16:19 -07:00
Ashwin Bharambe
0746a0f62b fp8 inference 2024-07-20 23:13:47 -07:00
Ashwin Bharambe
ad62e2e1f3 make inference server load checkpoints for fp8 inference
- introduce quantization related args for inference config
- also kill GeneratorArgs
2024-07-20 22:54:48 -07:00
Ashwin Bharambe
7d2c0b14b8 Changes from the main repo 2024-07-20 22:52:29 -07:00
Hardik Shah
9c9b834c0f update prompt-shield to reflect latest changes in agentic 2024-07-19 18:12:09 -07:00
Hardik Shah
ce0804556b update requirements for running standalone 2024-07-19 18:11:25 -07:00
Hardik Shah
2ed2881a21 fixed imports models.llama3. --> models.llama3_1.api. 2024-07-19 17:42:14 -07:00
Ashwin Bharambe
f94efcf2ee kill older junk 2024-07-19 12:32:22 -07:00
Ashwin Bharambe
95781ec85d Add toolchain from agentic system here 2024-07-19 12:30:35 -07:00
Ashwin Bharambe
f6b2b2fb39 cleanup 2024-07-11 10:04:56 -07:00