llama-stack/llama_stack/providers/impls/meta_reference/inference/quantization
Sachin Mehta c05fbf14b3
Added hadamard transform for spinquant (#326)
* Added hadamard transform for spinquant

* Changed from config to model_args

* Added an assertion for model args

* Use enum.value to check against str

* pre-commit

---------

Co-authored-by: Sachin Mehta <sacmehta@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
2024-10-25 12:58:48 -07:00
scripts               | fix broken --list-templates with adding build.yaml files for packaging (#327) | 2024-10-25 12:51:22 -07:00
__init__.py           | API Updates (#73) | 2024-09-17 19:51:35 -07:00
fp8_impls.py          | API Updates (#73) | 2024-09-17 19:51:35 -07:00
fp8_txest_disabled.py | Add a test runner and 2 very simple tests for agents | 2024-09-19 12:22:48 -07:00
hadamard_utils.py     | Added hadamard transform for spinquant (#326) | 2024-10-25 12:58:48 -07:00
loader.py             | Added hadamard transform for spinquant (#326) | 2024-10-25 12:58:48 -07:00