From dd990614f5d6bf106677dd1921a25c9f34c9ad0a Mon Sep 17 00:00:00 2001 From: Ikko Eltociear Ashimine Date: Sat, 21 Dec 2024 13:35:19 +0900 Subject: [PATCH] docs: update evals_reference/index.md minor fix --- docs/source/references/evals_reference/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/source/references/evals_reference/index.md b/docs/source/references/evals_reference/index.md index 9ba4f2848..f93b56e64 100644 --- a/docs/source/references/evals_reference/index.md +++ b/docs/source/references/evals_reference/index.md @@ -47,7 +47,7 @@ This first example walks you through how to evaluate a model candidate served by - [SimpleQA](https://openai.com/index/introducing-simpleqa/): Benchmark designed to access models to answer short, fact-seeking questions. #### 1.1 Running MMMU -- We will use a pre-processed MMMU dataset from [llamastack/mmmu](https://huggingface.co/datasets/llamastack/mmmu). The preprocessing code is shown in in this [Github Gist](https://gist.github.com/yanxi0830/118e9c560227d27132a7fd10e2c92840). The dataset is obtained by transforming the original [MMMU/MMMU](https://huggingface.co/datasets/MMMU/MMMU) dataset into correct format by `inference/chat-completion` API. +- We will use a pre-processed MMMU dataset from [llamastack/mmmu](https://huggingface.co/datasets/llamastack/mmmu). The preprocessing code is shown in this [GitHub Gist](https://gist.github.com/yanxi0830/118e9c560227d27132a7fd10e2c92840). The dataset is obtained by transforming the original [MMMU/MMMU](https://huggingface.co/datasets/MMMU/MMMU) dataset into correct format by `inference/chat-completion` API. ```python import datasets