forked from phoenix-oss/llama-stack-mirror
fix: test_datasets HF scenario in CI (#2090)
# What does this PR do?

**Fixes** #1959

Hugging Face provides several loading paths that the `datasets` library can use. My theory on why the test previously failed intermittently is that `load_dataset(...)` may try several options, such as the local cache, the Hugging Face Hub, or a dataset script. One of these options seems to work inconsistently in CI.

The Hugging Face `datasets` library relies on the `transformers` package to load certain datasets such as `llamastack/simpleqa`. With the package added, the dataset loads consistently via the Hugging Face Hub.

Please see the PR in my fork demonstrating over 7 consecutive passes: https://github.com/ChristianZaccaria/llama-stack/pull/1

**Some References:**
- https://github.com/huggingface/transformers/issues/8690
- https://huggingface.co/docs/datasets/en/loading

[//]: # (If resolving an issue, uncomment and update the line below)
[//]: # (Closes #[issue-number])

## Test Plan

[Describe the tests you ran to verify your changes with result summaries. *Provide clear instructions so the plan can be easily re-executed.*]

[//]: # (## Documentation)
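The dependency theory above can be checked locally before a test ever reaches the Hub. The following is a minimal sketch, not part of this commit: the helper name `missing_packages` is hypothetical, and the package list passed to it is an assumption based only on the description above.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be imported here.

    Hypothetical helper for illustration; it only checks importability
    and does not load any dataset or touch the network.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these are the packages the HF dataset scenario needs.
    missing = missing_packages(["datasets", "transformers"])
    if missing:
        print("missing packages:", ", ".join(missing))
    else:
        print("all required packages are importable")
```

In a pytest suite, a similar guard could be written with `pytest.importorskip("transformers")`, which skips the test with an explicit reason when the package is absent, so a missing dependency is not mistaken for a Hub failure.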
parent 2e807b38cc
commit 18d2312690
3 changed files with 71 additions and 1 deletions
@@ -31,7 +31,6 @@ def data_url_from_file(file_path: str) -> str:
     return data_url
 
 
-@pytest.mark.skip(reason="flaky. Couldn't find 'llamastack/simpleqa' on the Hugging Face Hub")
 @pytest.mark.parametrize(
     "purpose, source, provider_id, limit",
     [