llama-stack-mirror

Mirror of https://github.com/meta-llama/llama-stack.git (synced 2025-12-03 09:53:45 +00:00)

Author	SHA1	Message	Date
Ashwin Bharambe	0e2fc9966a	Reduce loading time for non-fp8	2024-07-22 19:21:04 -07:00
Ashwin Bharambe	fef679bb34	Don't load as bf16 on CPU unless fp8 is active	2024-07-22 19:09:55 -07:00
Kate Plawiak	5228bdc0f3	Revert "Update llama guard file to latest version"	2024-07-22 17:27:19 -07:00
Kate Plawiak	dfe0173b58	Merge pull request #1 from meta-llama/fix_llama_guard: Update llama guard file to latest version	2024-07-22 16:27:47 -07:00
Ashwin Bharambe	9b51b4edd8	update batch completion endpoint	2024-07-22 16:08:28 -07:00
Ashwin Bharambe	acb2a91872	Remove configurations	2024-07-22 16:03:37 -07:00
Ashwin Bharambe	bbfd8a587e	add EventLogger for inference	2024-07-22 15:11:34 -07:00
Hardik Shah	7574ffb25f	added __init__	2024-07-22 14:49:26 -07:00
Hardik Shah	441e5da6ed	no special casign for original	2024-07-22 14:42:38 -07:00
Hardik Shah	4d3b226275	check original folder	2024-07-22 14:35:09 -07:00
Kate Plawiak	91b43600f7	increase max_new_tokens	2024-07-22 13:58:51 -07:00
Kate Plawiak	cb5829901f	redo and fix only specific lines	2024-07-22 13:46:43 -07:00
Kate Plawiak	d5019cf3b3	update llama guard file to latest version	2024-07-22 13:36:11 -07:00
Hardik Shah	74442e88b1	add yaml to manifest	2024-07-22 13:34:08 -07:00
Hardik Shah	6f0d348b1c	add init for common	2024-07-22 11:51:10 -07:00
Ashwin Bharambe	c38d638340	sku -> family	2024-07-22 11:15:04 -07:00
Ashwin Bharambe	f0e0903270	add llama model subcommand	2024-07-22 11:07:11 -07:00
Hardik Shah	4417407652	agentic_system --> llama_agentic_system	2024-07-22 01:20:32 -07:00
Hardik Shah	1eac470045	add __init__	2024-07-22 01:17:54 -07:00
Ashwin Bharambe	2e7978fa39	update import for quantization format from models	2024-07-22 00:04:03 -07:00
Hardik Shah	f9111652ef	rename toolchain/ --> llama_toolchain/	2024-07-21 23:48:38 -07:00

21 commits