llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-04 02:03:44 +00:00


Botao Chen f369871083 feat: [New Eval Benchamark] IfEval (#1708)		2025-03-19 16:39:59 -07:00

# What does this PR do?
In this PR, we added a new open eval benchmark, IfEval, based on the paper https://arxiv.org/abs/2311.07911, to measure a model's instruction-following capability.

## Test Plan
Spin up a llama stack server with the open-benchmark template, then run `llama-stack-client --endpoint xxx eval run-benchmark "meta-reference-ifeval" --model-id "meta-llama/Llama-3.3-70B-Instruct" --output-dir "/home/markchen1015/" --num-examples 20` on the client side to get the aggregate eval results.
css	Several documentation fixes and fix link to API reference	2025-02-04 14:00:43 -08:00
llama-stack-logo.png	first version of readthedocs (#278)	2024-10-22 10:15:58 +05:30
llama-stack-spec.html	feat: [New Eval Benchamark] IfEval (#1708)	2025-03-19 16:39:59 -07:00
llama-stack-spec.yaml	feat: [New Eval Benchamark] IfEval (#1708)	2025-03-19 16:39:59 -07:00
llama-stack.png	Make a new llama stack image	2024-11-22 23:49:22 -08:00
remote_or_local.gif	[docs] update documentations (#356)	2024-11-04 16:52:38 -08:00
safety_system.webp	[Docs] Zero-to-Hero notebooks and quick start documentation (#368)	2024-11-08 17:16:44 -08:00