llama-stack-mirror

Mirror of https://github.com/meta-llama/llama-stack.git, synced 2025-12-31 07:00:02 +00:00

Ashwin Bharambe 14ff4c647c include content in the message even if you have parsed out a tool call		2025-04-12 11:27:36 -07:00
quantization	fix: on-the-fly int4 quantize parameter (#1920)	2025-04-09 15:00:12 -07:00
vision	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
__init__.py	feat: introduce llama4 support (#1877)	2025-04-05 11:53:35 -07:00
args.py	fix: Mirror llama4 rope scaling fixes, small model simplify (#1917)	2025-04-09 11:28:45 -07:00
chat_format.py	include content in the message even if you have parsed out a tool call	2025-04-12 11:27:36 -07:00
datatypes.py	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
ffn.py	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
generation.py	feat: add batch inference API to llama stack inference	2025-04-11 16:18:19 -07:00
model.py	fix: Mirror llama4 rope scaling fixes, small model simplify (#1917)	2025-04-09 11:28:45 -07:00
moe.py	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
preprocess.py	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
prompt_format.md	feat: introduce llama4 support (#1877)	2025-04-05 11:53:35 -07:00
prompts.py	refactor: move all llama code to models/llama out of meta reference (#1887)	2025-04-07 15:03:58 -07:00
tokenizer.model	feat: introduce llama4 support (#1877)	2025-04-05 11:53:35 -07:00
tokenizer.py	fix: meta reference + llama4 tokenizer fix	2025-04-09 00:46:32 -07:00