docs(mcp): add a few lines for how to specify Auth headers in MCP tools (#2336)

Ashwin Bharambe 2025-06-02 14:28:38 -07:00 committed by GitHub
parent 6bb174bb05
commit 76dcf47320
7 changed files with 134 additions and 102 deletions


@@ -1,4 +1,4 @@
-# Evaluation Concepts
+## Evaluation Concepts
 The Llama Stack Evaluation flow allows you to run evaluations on your GenAI application datasets or pre-registered benchmarks.
@@ -10,11 +10,7 @@ We introduce a set of APIs in Llama Stack for supporting running evaluations of
 This guide goes over the sets of APIs and developer experience flow of using Llama Stack to run evaluations for different use cases. Check out our Colab notebook on working examples with evaluations [here](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing).
-## Evaluation Concepts
-The Evaluation APIs are associated with a set of Resources as shown in the following diagram. Please visit the Resources section in our [Core Concepts](../concepts/index.md) guide for a better high-level understanding.
-![Eval Concepts](../references/evals_reference/resources/eval-concept.png)
+The Evaluation APIs are associated with a set of Resources. Please visit the Resources section in our [Core Concepts](../concepts/index.md) guide for a better high-level understanding.
 - **DatasetIO**: defines interface with datasets and data loaders.
   - Associated with `Dataset` resource.
@@ -24,9 +20,9 @@ The Evaluation APIs are associated with a set of Resources as shown in the following diagram.
   - Associated with `Benchmark` resource.
-## Open-benchmark Eval
+### Open-benchmark Eval
-### List of open-benchmarks Llama Stack supports
+#### List of open-benchmarks Llama Stack supports
 Llama Stack pre-registers several popular open-benchmarks to easily evaluate model performance via the CLI.
@@ -39,7 +35,7 @@ The list of open-benchmarks we currently support:
 You can follow this [contributing guide](https://llama-stack.readthedocs.io/en/latest/references/evals_reference/index.html#open-benchmark-contributing-guide) to add more open-benchmarks to Llama Stack.
-### Run evaluation on open-benchmarks via CLI
+#### Run evaluation on open-benchmarks via CLI
 We have built-in functionality to run the supported open-benchmarks using the llama-stack-client CLI.
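A minimal sketch of such an invocation, assuming the `eval run-benchmark` subcommand and the `--model_id`/`--output_dir`/`--num_examples` flags described in the evals reference linked above; the benchmark and model IDs are placeholders.

```bash
# Minimal sketch: run a pre-registered open-benchmark against a served model.
# Benchmark ID, model ID, and flag spellings are assumptions drawn from the evals
# reference; verify them with `llama-stack-client eval run-benchmark --help`.
llama-stack-client eval run-benchmark "meta-reference-mmlu" \
  --model_id Llama3.2-3B-Instruct \
  --output_dir ./eval_results \
  --num_examples 10
```

Per-example scores are written under `--output_dir`, which is the directory the "evaluation results over there" sentence in the next hunk points at.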
@@ -74,7 +70,7 @@ evaluation results over there.
-## What's Next?
+#### What's Next?
 - Check out our Colab notebook on working examples with running benchmark evaluations [here](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb#scrollTo=mxLCsP4MvFqP).
 - Check out our [Building Applications - Evaluation](../building_applications/evals.md) guide for more details on how to use the Evaluation APIs to evaluate your applications.