llama-stack/llama_stack
Botao Chen b751f7003d
feat: add aggregation_functions to llm_as_judge_405b_simpleqa (#1164)
As the title says, this lets the scoring function llm_as_judge_405b_simpleqa
output aggregated_results.

We can leverage categorical_count to calculate the percentage of correct
answers as an eval benchmark metric.
2025-02-19 19:42:04 -08:00
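
The commit message above describes using a categorical count to derive a correctness percentage. A minimal sketch of that idea follows. It is not the provider's actual implementation: the helper names `categorical_count` and `percent_correct`, the grade labels "A", "B", "C", and the choice of "A" as the correct grade are all assumptions made for illustration.

```python
from collections import Counter


def categorical_count(labels):
    """Count how often each category label occurs (hypothetical helper)."""
    return dict(Counter(labels))


def percent_correct(counts, correct_label="A"):
    """Share of rows whose label equals correct_label, as a percentage.

    The default label "A" is an assumption, not taken from the source.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no scored rows, so avoid dividing by zero
    return 100.0 * counts.get(correct_label, 0) / total


# Hypothetical per-row judge grades for five evaluated answers.
judge_labels = ["A", "B", "A", "C", "A"]
counts = categorical_count(judge_labels)
accuracy = percent_correct(counts)
print(counts, accuracy)
```

Keeping the raw per-category counts alongside the percentage lets a benchmark report both the breakdown and the headline metric from a single aggregation pass.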
apis feat: support tool_choice = {required, none, <function>} (#1059) 2025-02-18 23:25:15 -05:00
cli style: remove prints in codebase (#1146) 2025-02-18 19:41:37 -08:00
distribution feat: support tool_choice = {required, none, <function>} (#1059) 2025-02-18 23:25:15 -05:00
models/llama chore: move all Llama Stack types from llama-models to llama-stack (#1098) 2025-02-14 09:10:59 -08:00
providers feat: add aggregation_functions to llm_as_judge_405b_simpleqa (#1164) 2025-02-19 19:42:04 -08:00
scripts fix: Get distro_codegen.py working with default deps and enabled in pre-commit hooks (#1123) 2025-02-19 18:39:20 -08:00
strong_typing Ensure that deprecations for fields follow through to OpenAPI 2025-02-19 13:54:04 -08:00
templates fix: Get distro_codegen.py working with default deps and enabled in pre-commit hooks (#1123) 2025-02-19 18:39:20 -08:00
__init__.py export LibraryClient 2024-12-13 12:08:00 -08:00
schema_utils.py chore: move all Llama Stack types from llama-models to llama-stack (#1098) 2025-02-14 09:10:59 -08:00