llama-stack-mirror/llama_stack/providers/impls/meta_reference/inference
Latest commit: c05fbf14b3 by Sachin Mehta, 2024-10-25 12:58:48 -07:00
Added hadamard transform for spinquant (#326)

* Added hadamard transform for spinquant
* Changed from config to model_args
* Added an assertion for model args
* Use enum.value to check against str
* pre-commit

Co-authored-by: Sachin Mehta <sacmehta@fb.com>
Co-authored-by: Ashwin Bharambe <ashwin.bharambe@gmail.com>
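For orientation, the commit above adds a Hadamard transform used for SpinQuant-style quantization, where weights and activations are rotated by an orthogonal matrix to spread outliers across channels before quantizing. The snippet below is a minimal illustrative sketch only, not the code added in #326: it assumes PyTorch, a power-of-two hidden size, and hypothetical helper names (hadamard_matrix, rotate) chosen here for clarity.

import torch


def hadamard_matrix(n: int, dtype: torch.dtype = torch.float32) -> torch.Tensor:
    """Sylvester construction of an n x n Hadamard matrix, normalized to be orthonormal.

    n must be a power of two. (Hypothetical helper, not from the repository.)
    """
    assert n > 0 and (n & (n - 1)) == 0, f"size must be a power of two, got {n}"
    H = torch.ones(1, 1, dtype=dtype)
    while H.shape[0] < n:
        # [[H, H], [H, -H]] doubles the matrix size at each step.
        H = torch.cat([torch.cat([H, H], dim=1), torch.cat([H, -H], dim=1)], dim=0)
    return H / (n ** 0.5)


def rotate(weight: torch.Tensor) -> torch.Tensor:
    """Rotate the last dimension of a weight tensor with the Hadamard matrix.

    Because the rotation is orthogonal, applying the matching inverse rotation to
    the adjacent layer preserves the model's outputs while making the rotated
    values easier to quantize.
    """
    H = hadamard_matrix(weight.shape[-1], dtype=weight.dtype)
    return weight @ H

The key property this sketch relies on is orthonormality (H @ H.T equals the identity), which is what allows such a rotation to be folded into neighboring linear layers without changing the network's function.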
File               Last commit                                                                    Date
quantization/      Added hadamard transform for spinquant (#326)                                  2024-10-25 12:58:48 -07:00
__init__.py        Split off meta-reference-quantized provider                                    2024-10-10 16:03:19 -07:00
config.py          Allow overridding checkpoint_dir via config                                    2024-10-18 14:28:06 -07:00
generation.py      Added hadamard transform for spinquant (#326)                                  2024-10-25 12:58:48 -07:00
inference.py       Add support for Structured Output / Guided decoding (#281)                     2024-10-22 12:53:34 -07:00
model_parallel.py  Make all methods async def again; add completion() for meta-reference (#270)   2024-10-18 20:50:59 -07:00
parallel_utils.py  Make all methods async def again; add completion() for meta-reference (#270)   2024-10-18 20:50:59 -07:00