llama-stack

forked from phoenix-oss/llama-stack-mirror

yyymeta a626b7bce3 feat: [new open benchmark] BFCL_v3 (#1578)	2025-03-14 12:50:49 -07:00

# What does this PR do?

Create a new dataset, BFCL_v3, from https://gorilla.cs.berkeley.edu/blogs/13_bfcl_v3_multi_turn.html

Each question asks the model to perform a task described in natural language, and a set of available functions with their schemas is given for the model to choose from. The model is required to write the function call, including the function name and parameters, that achieves the stated purpose. The results are validated against the provided ground truth by checking their ASTs, to make sure that the generated function call and the ground-truth function call are syntactically and semantically equivalent.

## Test Plan

Start the server:

```
llama stack run ./llama_stack/templates/ollama/run.yaml
```

Then send traffic:

```
llama-stack-client eval run-benchmark "bfcl" --model-id meta-llama/Llama-3.2-3B-Instruct --output-dir /tmp/gpqa --num-examples 2
```
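
The AST-based validation described in the commit message can be illustrated with a minimal sketch. This is not the BFCL checker or the code added in this PR; it is a simplified stand-in built on Python's standard `ast` module, and it assumes that calls use only keyword arguments with literal values. The helper names `parse_call` and `calls_match`, and the `get_weather` example function, are hypothetical.

```python
import ast


def parse_call(source: str) -> tuple[str, dict]:
    """Parse one call such as `f(a=1, b="x")` into (function name, kwargs)."""
    node = ast.parse(source.strip(), mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError(f"not a function call: {source!r}")
    if node.args:
        # Assumption of this sketch: positional arguments are not supported.
        raise ValueError("only keyword arguments are handled")
    name = ast.unparse(node.func)  # also covers dotted names like `math.sqrt`
    kwargs = {}
    for kw in node.keywords:
        if kw.arg is None:  # `**mapping` unpacking is not supported here
            raise ValueError("argument unpacking is not handled")
        # literal_eval rejects anything that is not a plain literal value.
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return name, kwargs


def calls_match(generated: str, ground_truth: str) -> bool:
    """Return True when both strings parse to the same call.

    Comparing parsed structures instead of raw text makes the check
    insensitive to whitespace, quoting style, and keyword order.
    """
    try:
        return parse_call(generated) == parse_call(ground_truth)
    except (SyntaxError, ValueError, TypeError):
        # Unparseable or unsupported model output counts as a mismatch.
        return False


if __name__ == "__main__":
    truth = 'get_weather(city="Paris", unit="celsius")'
    print(calls_match("get_weather(unit='celsius', city='Paris')", truth))
    print(calls_match('get_weather(city="London", unit="celsius")', truth))
```

A real benchmark checker would also need to handle positional arguments, optional parameters, type coercion, and multiple acceptable ground-truth values, all of which this sketch leaves out.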
bedrock	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
cerebras	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
ci-tests	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
dell	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
dev	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
experimental-post-training	feat: [post training] support save hf safetensor format checkpoint (#845)	2025-02-25 23:29:08 -08:00
fireworks	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
groq	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
hf-endpoint	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
hf-serverless	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
meta-reference-gpu	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
meta-reference-quantized-gpu	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
nvidia	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
ollama	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
open-benchmark	feat: [new open benchmark] BFCL_v3 (#1578)	2025-03-14 12:50:49 -07:00
passthrough	fix: passthrough provider template + fix (#1612)	2025-03-13 09:44:26 -07:00
remote-vllm	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
sambanova	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
tgi	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
together	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
vllm-gpu	test: add unit test to ensure all config types are instantiable (#1601)	2025-03-12 22:29:58 -07:00
__init__.py	Auto-generate distro yamls + docs (#468)	2024-11-18 14:57:06 -08:00
template.py	fix: fix precommit (#1594)	2025-03-12 11:59:21 -07:00