llama-stack-mirror/llama_stack/models/llama/llama4
Latest commit: `c8a0b110c0` by jiawenliu64 (2025-04-09 13:35:11 -07:00)
fix: on-the-fly int4 quantize parameter

Before this PR:
```
[rank1]: TypeError: quantize_int4() got multiple values for argument 'output_device'
```
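The `TypeError` above is Python's standard complaint when a parameter is supplied both positionally and by keyword, which commonly happens when forwarded `*args` already fill a slot that the caller also passes explicitly. A minimal sketch; the `quantize_int4` signature here is hypothetical and not necessarily the one in this repo:

```python
# Hypothetical signature, for illustration only; the real quantize_int4
# in this repo may take different parameters.
def quantize_int4(weight, output_device="cuda"):
    return weight, output_device

# A caller forwards positional args that already occupy the
# output_device slot...
args = ("W", "cpu")

try:
    # ...and also passes output_device by keyword: the slot is filled
    # twice, so Python raises TypeError before the function body runs.
    quantize_int4(*args, output_device="cuda")
except TypeError as e:
    print(e)  # quantize_int4() got multiple values for argument 'output_device'
```

The usual fix, as in this commit, is to make the call sites agree on one convention: pass `output_device` either positionally or by keyword, but not both.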
| Name | Last commit | Date |
|---|---|---|
| quantization/ | fix: on-the-fly int4 quantize parameter | 2025-04-09 13:35:11 -07:00 |
| vision/ | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| __init__.py | feat: introduce llama4 support (#1877) | 2025-04-05 11:53:35 -07:00 |
| args.py | fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) | 2025-04-09 11:28:45 -07:00 |
| chat_format.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| datatypes.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| ffn.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| generation.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| model.py | fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) | 2025-04-09 11:28:45 -07:00 |
| moe.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| preprocess.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| prompt_format.md | feat: introduce llama4 support (#1877) | 2025-04-05 11:53:35 -07:00 |
| prompts.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| tokenizer.model | feat: introduce llama4 support (#1877) | 2025-04-05 11:53:35 -07:00 |
| tokenizer.py | fix: meta reference + llama4 tokenizer fix | 2025-04-09 00:46:32 -07:00 |