forked from phoenix-oss/llama-stack-mirror
# What does this PR do?
- braintrust scoring provider requires OPENAI_API_KEY env variable to be
set
- move this to be able to be set as request headers (e.g. like together
/ fireworks api keys)
- fixes pytest with agents dependency
## Test Plan
**E2E**
```
llama stack run
```
```yaml
scoring:
- provider_id: braintrust-0
provider_type: inline::braintrust
config: {}
```
**Client**
```python
self.client = LlamaStackClient(
base_url=os.environ.get("LLAMA_STACK_ENDPOINT", "http://localhost:5000"),
provider_data={
"openai_api_key": os.environ.get("OPENAI_API_KEY", ""),
},
)
```
- run `llama-stack-client eval run_scoring`
**Unit Test**
```
pytest -v -s -m meta_reference_eval_together_inference eval/test_eval.py
```
```
pytest -v -s -m braintrust_scoring_together_inference scoring/test_scoring.py --env OPENAI_API_KEY=$OPENAI_API_KEY
```
<img width="745" alt="image"
src="https://github.com/user-attachments/assets/68f5cdda-f6c8-496d-8b4f-1b3dabeca9c2">
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the
other checks if that's the case).
- [ ] Ran pre-commit to handle lint / formatting issues.
- [ ] Read the [contributor
guideline](https://github.com/meta-llama/llama-stack/blob/main/CONTRIBUTING.md),
Pull Request section?
- [ ] Updated relevant documentation.
- [ ] Wrote necessary unit or integration tests.
49 lines
1.7 KiB
Python
49 lines
1.7 KiB
Python
# Copyright (c) Meta Platforms, Inc. and affiliates.
|
|
# All rights reserved.
|
|
#
|
|
# This source code is licensed under the terms described in the LICENSE file in
|
|
# the root directory of this source tree.
|
|
|
|
from typing import List
|
|
|
|
from llama_stack.distribution.datatypes import * # noqa: F403
|
|
|
|
|
|
def available_providers() -> List[ProviderSpec]:
|
|
return [
|
|
InlineProviderSpec(
|
|
api=Api.scoring,
|
|
provider_type="inline::basic",
|
|
pip_packages=[],
|
|
module="llama_stack.providers.inline.scoring.basic",
|
|
config_class="llama_stack.providers.inline.scoring.basic.BasicScoringConfig",
|
|
api_dependencies=[
|
|
Api.datasetio,
|
|
Api.datasets,
|
|
],
|
|
),
|
|
InlineProviderSpec(
|
|
api=Api.scoring,
|
|
provider_type="inline::llm-as-judge",
|
|
pip_packages=[],
|
|
module="llama_stack.providers.inline.scoring.llm_as_judge",
|
|
config_class="llama_stack.providers.inline.scoring.llm_as_judge.LlmAsJudgeScoringConfig",
|
|
api_dependencies=[
|
|
Api.datasetio,
|
|
Api.datasets,
|
|
Api.inference,
|
|
],
|
|
),
|
|
InlineProviderSpec(
|
|
api=Api.scoring,
|
|
provider_type="inline::braintrust",
|
|
pip_packages=["autoevals", "openai"],
|
|
module="llama_stack.providers.inline.scoring.braintrust",
|
|
config_class="llama_stack.providers.inline.scoring.braintrust.BraintrustScoringConfig",
|
|
api_dependencies=[
|
|
Api.datasetio,
|
|
Api.datasets,
|
|
],
|
|
provider_data_validator="llama_stack.providers.inline.scoring.braintrust.BraintrustProviderDataValidator",
|
|
),
|
|
]
|