| Name | Last commit message | Last commit date |
| --- | --- | --- |
| prompt_templates | feat: new system prompt for llama4 (#2031) | 2025-04-25 11:29:08 -07:00 |
| quantization | fix: on-the-fly int4 quantize parameter (#1920) | 2025-04-09 15:00:12 -07:00 |
| vision | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| __init__.py | feat: introduce llama4 support (#1877) | 2025-04-05 11:53:35 -07:00 |
| args.py | fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) | 2025-04-09 11:28:45 -07:00 |
| chat_format.py | fix: tool call encoded twice (#2034) | 2025-04-25 13:16:16 -07:00 |
| datatypes.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| ffn.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| generation.py | feat: add batch inference API to llama stack inference (#1945) | 2025-04-12 11:41:12 -07:00 |
| model.py | fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) | 2025-04-09 11:28:45 -07:00 |
| moe.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| preprocess.py | refactor: move all llama code to models/llama out of meta reference (#1887) | 2025-04-07 15:03:58 -07:00 |
| prompt_format.md | docs: update prompt_format.md for llama4 (#2035) | 2025-04-25 15:52:15 -07:00 |
| prompts.py | docs: update prompt_format.md for llama4 (#2035) | 2025-04-25 15:52:15 -07:00 |
| tokenizer.model | feat: introduce llama4 support (#1877) | 2025-04-05 11:53:35 -07:00 |
| tokenizer.py | fix: unhide python_start, python_end | 2025-04-11 20:30:44 -07:00 |