llama-stack/llama_stack/models/llama/llama4
ehhuang 29072f40ab
feat: new system prompt for llama4 (#2031)
Tests:

LLAMA_STACK_CONFIG=http://localhost:5002 pytest -s -v \
  tests/integration/inference --safety-shield meta-llama/Llama-Guard-3-8B \
  --vision-model meta-llama/Llama-4-Scout-17B-16E-Instruct \
  --text-model meta-llama/Llama-4-Scout-17B-16E-Instruct

Co-authored-by: Eric Huang <erichuang@fb.com>
2025-04-25 11:29:08 -07:00
prompt_templates feat: new system prompt for llama4 (#2031) 2025-04-25 11:29:08 -07:00
quantization fix: on-the-fly int4 quantize parameter (#1920) 2025-04-09 15:00:12 -07:00
vision refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
__init__.py feat: introduce llama4 support (#1877) 2025-04-05 11:53:35 -07:00
args.py fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) 2025-04-09 11:28:45 -07:00
chat_format.py fix: OAI compat endpoint for meta reference inference provider (#1962) 2025-04-17 11:16:04 -07:00
datatypes.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
ffn.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
generation.py feat: add batch inference API to llama stack inference (#1945) 2025-04-12 11:41:12 -07:00
model.py fix: Mirror llama4 rope scaling fixes, small model simplify (#1917) 2025-04-09 11:28:45 -07:00
moe.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
preprocess.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
prompt_format.md feat: introduce llama4 support (#1877) 2025-04-05 11:53:35 -07:00
prompts.py refactor: move all llama code to models/llama out of meta reference (#1887) 2025-04-07 15:03:58 -07:00
tokenizer.model feat: introduce llama4 support (#1877) 2025-04-05 11:53:35 -07:00
tokenizer.py fix: unhide python_start, python_end 2025-04-11 20:30:44 -07:00