# What does this PR do?

- Add the ability to register an llm-as-judge scoring function with custom judge prompts / params.
- Closes https://github.com/meta-llama/llama-stack/issues/1395

## Test Plan

**Via CLI**

```
llama-stack-client scoring_functions register \
  --scoring-fn-id "llm-as-judge::my-prompt" \
  --description "my custom judge" \
  --return-type '{"type": "string"}' \
  --provider-id "llm-as-judge" \
  --provider-scoring-fn-id "my-prompt" \
  --params '{"type": "llm_as_judge", "judge_model": "meta-llama/Llama-3.2-3B-Instruct", "prompt_template": "always output 1.0"}'
```

<img width="1373" alt="image" src="https://github.com/user-attachments/assets/7c6fc0ae-64fe-4581-8927-a9d8d746bd72" />

- Unit tests will be addressed with https://github.com/meta-llama/llama-stack/issues/1396
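For reference, a minimal sketch of the equivalent registration from Python. The `params` payload below is taken directly from the CLI command above; the commented-out `LlamaStackClient` / `scoring_functions.register` call is an assumption mirroring the CLI flags, not an API verified by this PR.

```python
import json

# LLMAsJudge scoring params, identical to the --params JSON passed to the CLI.
params = {
    "type": "llm_as_judge",
    "judge_model": "meta-llama/Llama-3.2-3B-Instruct",
    "prompt_template": "always output 1.0",
}

# Hypothetical Python-client equivalent of the CLI command (assumed method
# name and keyword arguments, mirroring the CLI flags one-to-one):
#
# from llama_stack_client import LlamaStackClient
# client = LlamaStackClient(base_url="http://localhost:8321")
# client.scoring_functions.register(
#     scoring_fn_id="llm-as-judge::my-prompt",
#     description="my custom judge",
#     return_type={"type": "string"},
#     provider_id="llm-as-judge",
#     provider_scoring_fn_id="my-prompt",
#     params=params,
# )

# The params dict round-trips cleanly through JSON, as required by the CLI.
print(json.dumps(params))
```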