llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Sachin Mehta c05fbf14b3 Added hadamard transform for spinquant (#326)	2024-10-25 12:58:48 -07:00
* Added hadamard transform for spinquant
* Changed from config to model_args
* Added an assertion for model args
* Use enum.value to check against str
* pre-commit
Co-authored-by: Sachin Mehta <sacmehta@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>

ios/inference	Update iOS inference instructions for new quantization	2024-10-24 14:47:27 -04:00
meta_reference	Added hadamard transform for spinquant (#326)	2024-10-25 12:58:48 -07:00
vllm	Make vllm inference better	2024-10-24 22:52:47 -07:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00