| Xi Yan | 84c6fbbd93 | fix tests after registration migration & rename meta-reference -> basic / llm_as_judge provider (#424) * rename meta-reference -> basic
* config rename
* impl rename
* rename llm_as_judge, fix test
* util
* rebase
* naming fix | 2024-11-12 10:35:44 -05:00 |  |
| Ashwin Bharambe | 3d7561e55c | Rename all inline providers with an inline:: prefix (#423) | 2024-11-11 22:19:16 -08:00 |  |
| Xi Yan | b4416b72fd | Folder restructure for evals/datasets/scoring (#419) * rename evals related stuff
* fix datasetio
* fix scoring test
* localfs -> LocalFS
* refactor scoring
* refactor scoring
* remove 8b_correctness scoring_fn from tests
* tests w/ eval params
* scoring fn braintrust fixture
* import | 2024-11-11 17:35:40 -05:00 |  |
| Ashwin Bharambe | 994732e2e0 | impls->inline,adapters->remote(#381) | 2024-11-06 14:54:05 -08:00 |  |
| Xi Yan | ed833bb758 | [Evals API][7/n] braintrust scoring provider (#333) * wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* braintrust skeleton
* datasetio test fix
* braintrust provider
* remove prints
* dependencies
* change json -> class
* json -> class
* remove initialize
* address nits
* check identifier prefix
* braintrust scoring identifier check, rebase
* update MANIFEST
* manifest
* remove braintrust scoring_fn
* remove comments
* tests
* imports fix | 2024-10-28 18:59:35 -07:00 |  |
| Xi Yan | 7b8748c53e | [Evals API][6/n] meta-reference llm as judge, registration for ScoringFnDefs (#330) * wip scoring refactor
* llm as judge, move folders
* test full generation + eval
* extract score regex to llm context
* remove prints, cleanup braintrust in this branch
* change json -> class
* remove initialize
* address nits
* check identifier prefix
* update MANIFEST | 2024-10-28 14:08:42 -07:00 |  |
| Xi Yan | cb84034567 | [Evals API][3/n] scoring_functions / scoring meta-reference implementations (#296) * wip
* dataset validation
* test_scoring
* cleanup
* clean up test
* comments
* error checking
* dataset client
* test client:
* datasetio client
* clean up
* basic scoring function works
* scorer wip
* equality scorer
* score batch impl
* score batch
* update scoring test
* refactor
* validate scorer input
* address comments
* add all rows scores to ScoringResult
* bugfix
* scoring function def rename | 2024-10-24 14:52:30 -07:00 |  |