llama-stack-mirror

mirror of https://github.com/meta-llama/llama-stack.git synced 2025-12-03 09:53:45 +00:00

Xi Yan 41487e6ed1 refactor scoring/eval pytests (#607)

# What does this PR do?

- remove model registration & parameterize model in scoring/eval pytests

## Test Plan

```
pytest -v -s -m meta_reference_eval_together_inference eval/test_eval.py
pytest -v -s -m meta_reference_eval_together_inference_huggingface_datasetio eval/test_eval.py
```

```
pytest -v -s -m llm_as_judge_scoring_together_inference scoring/test_scoring.py --judge-model meta-llama/Llama-3.2-3B-Instruct
pytest -v -s -m basic_scoring_together_inference scoring/test_scoring.py
```

<img width="860" alt="image" src="https://github.com/user-attachments/assets/d4b0badc-da34-4097-9b7c-9511f8261723" />

## Sources

Please link relevant resources if necessary.

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md), Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.

2024-12-11 10:47:37 -08:00
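
The test plan passes the judge model on the command line via `--judge-model`. The repository's actual conftest is not shown here, so the following is only a minimal sketch of one common way to wire such an option into pytest. The `judge_model` argument name and the default value are assumptions made for illustration; only the `--judge-model` flag and the model ID come from the commit message.

```python
# conftest.py (hypothetical sketch, not the repository's actual implementation)

# Assumed default for illustration; the commit message only shows this model
# being passed explicitly on the command line.
DEFAULT_JUDGE_MODEL = "meta-llama/Llama-3.2-3B-Instruct"


def pytest_addoption(parser):
    """Register the --judge-model command-line option with pytest."""
    parser.addoption(
        "--judge-model",
        action="store",
        default=DEFAULT_JUDGE_MODEL,
        help="Model ID to use as the LLM-as-judge in scoring tests",
    )


def pytest_generate_tests(metafunc):
    """Parameterize any test that requests a `judge_model` argument.

    `judge_model` is a hypothetical argument name chosen for this sketch.
    """
    if "judge_model" in metafunc.fixturenames:
        model = metafunc.config.getoption("--judge-model")
        metafunc.parametrize("judge_model", [model])
```

With a conftest along these lines, a test declared as `def test_scoring(judge_model): ...` would receive whatever model ID was passed on the command line, so the test does not need to register or hard-code a model itself.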
inline	[/scoring] add ability to define aggregation functions for scoring functions & refactors (#597)	2024-12-11 10:03:42 -08:00
registry	Add ability to query and export spans to dataset (#574)	2024-12-05 21:07:30 -08:00
remote	add completion api support to nvidia inference provider (#533)	2024-12-11 10:08:38 -08:00
tests	refactor scoring/eval pytests (#607)	2024-12-11 10:47:37 -08:00
utils	Revert "add model type to APIs" (#605)	2024-12-11 10:17:54 -08:00
__init__.py	API Updates (#73)	2024-09-17 19:51:35 -07:00
datatypes.py	unregister API for dataset (#507)	2024-12-03 21:18:30 -08:00