| 
								
								
									 Xi Yan | a4bcfb8bba | [/scoring] add ability to define aggregation functions for scoring functions & refactors (#597)
# What does this PR do?
- Add ability to define aggregation functions for scoring functions via `ScoringFnParams`
- Supported by `basic` / `regex_parser` / `llm_as_judge` scoring functions
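
As a rough illustration of the idea above, the sketch below shows how a params object could carry a list of aggregation names that are then applied to per-row scores. Everything here is hypothetical: `ParamsSketch`, `AGGREGATIONS`, `aggregate`, and the names `"average"` / `"median"` are stand-ins chosen for the example, not the actual `ScoringFnParams` fields or the aggregation types shipped in this PR.

```python
from dataclasses import dataclass, field
from statistics import mean, median
from typing import Callable, Dict, List

# Hypothetical registry mapping an aggregation name to a function.
# These names are illustrative only, not the project's real enum values.
AGGREGATIONS: Dict[str, Callable[[List[float]], float]] = {
    "average": mean,
    "median": median,
}


@dataclass
class ParamsSketch:
    """Stand-in for a params object that selects aggregations (not the real ScoringFnParams)."""

    aggregation_functions: List[str] = field(default_factory=lambda: ["average"])


def aggregate(scores: List[float], params: ParamsSketch) -> Dict[str, float]:
    """Apply every requested aggregation to the per-row scores."""
    return {name: AGGREGATIONS[name](scores) for name in params.aggregation_functions}


if __name__ == "__main__":
    row_scores = [1.0, 0.0, 1.0, 1.0]
    print(aggregate(row_scores, ParamsSketch(["average", "median"])))
```

Keeping the aggregation choice in the params object, rather than hard-coding it in each scoring function, lets one scoring function report different summaries per request.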
## Test Plan
```
pytest -v -s -m basic_scoring_together_inference scoring/test_scoring.py
```
<img width="855" alt="image"
src="https://github.com/user-attachments/assets/12db8e6e-2ad4-462e-b9b9-70ba6c050a6c">
```
pytest -v -s -m llm_as_judge_scoring_together_inference scoring/test_scoring.py
```
<img width="858" alt="image"
src="https://github.com/user-attachments/assets/bf806676-6f5e-456d-be9f-f81a26d1df19">
**Example Response** (`basic`)
<img width="863" alt="image"
src="https://github.com/user-attachments/assets/0e57a49c-8386-45cc-8fa9-3e61aaa9a3be">
**Example Response** (`llm-as-judge`)
<img width="854" alt="image"
src="https://github.com/user-attachments/assets/38065bc2-b724-47ed-9535-79b6099c4362">
| 2024-12-11 10:03:42 -08:00 |  | 
				
					
						| 
								
								
									 Xi Yan | 60cb7f64af | add missing __init__ | 2024-11-25 09:42:46 -08:00 |  | 
				
					
						| 
								
								
									 Xi Yan | 84c6fbbd93 | fix tests after registration migration & rename meta-reference -> basic / llm_as_judge provider (#424)
* rename meta-reference -> basic
* config rename
* impl rename
* rename llm_as_judge, fix test
* util
* rebase
* naming fix | 2024-11-12 10:35:44 -05:00 |  | 
				
					
						| 
								
								
									 Xi Yan | b4416b72fd | Folder restructure for evals/datasets/scoring (#419)
* rename evals related stuff
* fix datasetio
* fix scoring test
* localfs -> LocalFS
* refactor scoring
* refactor scoring
* remove 8b_correctness scoring_fn from tests
* tests w/ eval params
* scoring fn braintrust fixture
* import | 2024-11-11 17:35:40 -05:00 |  |