llama-stack-mirror/llama_stack/providers/impls
Sachin Mehta c05fbf14b3
Added hadamard transform for spinquant (#326)
* Added hadamard transform for spinquant

* Changed from config to model_args

* Added an assertion for model args

* Use enum.value to check against str

* pre-commit

---------

Co-authored-by: Sachin Mehta <sacmehta@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-10-25 12:58:48 -07:00
Name            Last commit                                             Date
ios/inference   Update iOS inference instructions for new quantization  2024-10-24 14:47:27 -04:00
meta_reference  Added hadamard transform for spinquant (#326)           2024-10-25 12:58:48 -07:00
vllm            Make vllm inference better                              2024-10-24 22:52:47 -07:00
__init__.py     API Updates (#73)                                       2024-09-17 19:51:35 -07:00