docs(mcp): add a few lines for how to specify Auth headers in MCP tools

Ashwin Bharambe 2025-06-02 13:58:58 -07:00
parent 2603f10f95
commit 8dcdce317d
7 changed files with 134 additions and 102 deletions
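The diff excerpt below covers only the evaluation docs touched by this commit; the auth-header lines named in the title are not shown here. For orientation, a minimal sketch of the pattern, assuming Llama Stack's provider-data mechanism — the `provider_data` constructor argument and the `mcp_headers` map keyed by MCP server URI are assumptions, not quoted from this diff:

```python
# Minimal sketch: passing an Authorization header to an MCP tool server.
# Assumes llama-stack-client forwards `provider_data` to the server as the
# X-LlamaStack-Provider-Data header, and that MCP auth is expressed as an
# `mcp_headers` map keyed by server URI -- verify against your version.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(
    base_url="http://localhost:8321",  # assumed local Llama Stack server
    provider_data={
        "mcp_headers": {
            "http://localhost:8000/sse": {  # hypothetical MCP endpoint
                "Authorization": "Bearer <your-mcp-token>",
            },
        },
    },
)
```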

@@ -1,4 +1,4 @@
-# Evaluation Concepts
+## Evaluation Concepts
 The Llama Stack Evaluation flow allows you to run evaluations on your GenAI application datasets or pre-registered benchmarks.
@@ -10,11 +10,7 @@ We introduce a set of APIs in Llama Stack for supporting running evaluations of
 This guide goes over the sets of APIs and the developer experience of using Llama Stack to run evaluations for different use cases. Check out our Colab notebook of working examples with evaluations [here](https://colab.research.google.com/drive/10CHyykee9j2OigaIcRv47BKG9mrNm0tJ?usp=sharing).
-## Evaluation Concepts
-The Evaluation APIs are associated with a set of Resources as shown in the following diagram. Please visit the Resources section in our [Core Concepts](../concepts/index.md) guide for better high-level understanding.
-![Eval Concepts](../references/evals_reference/resources/eval-concept.png)
+The Evaluation APIs are associated with a set of Resources. Please visit the Resources section in our [Core Concepts](../concepts/index.md) guide for a better high-level understanding.
 - **DatasetIO**: defines the interface for datasets and data loaders.
 - Associated with `Dataset` resource.
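To make the `Dataset` pairing concrete, here is a minimal sketch of registering an evaluation dataset; the `register` field names (`purpose`, `source`) are assumptions drawn from recent llama-stack-client releases and vary across versions:

```python
# Minimal sketch: registering a Dataset resource that DatasetIO can serve.
# Field names (purpose, source/uri) are assumptions from recent client
# versions; check `client.datasets.register` in your installed release.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

client.datasets.register(
    dataset_id="my-eval-dataset",        # hypothetical ID
    purpose="eval/messages-answer",      # rows hold messages plus expected answer
    source={
        "type": "uri",
        "uri": "huggingface://datasets/llamastack/simpleqa?split=train",
    },
)
```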
@@ -24,9 +20,9 @@ The Evaluation APIs are associated with a set of Resources as shown in the follo
 - Associated with `Benchmark` resource.
-## Open-benchmark Eval
+### Open-benchmark Eval
-### List of open-benchmarks Llama Stack support
+#### List of open-benchmarks Llama Stack supports
 Llama Stack pre-registers several popular open-benchmarks so you can easily evaluate model performance via the CLI.
@@ -39,7 +35,7 @@ The list of open-benchmarks we currently support:
 You can follow this [contributing guide](https://llama-stack.readthedocs.io/en/latest/references/evals_reference/index.html#open-benchmark-contributing-guide) to add more open-benchmarks to Llama Stack.
-### Run evaluation on open-benchmarks via CLI
+#### Run evaluation on open-benchmarks via CLI
 We have built-in functionality to run the supported open-benchmarks using the llama-stack-client CLI.
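A typical invocation looks like the sketch below; the benchmark and model IDs are placeholders, and flag spellings have shifted between releases, so confirm with `llama-stack-client eval run-benchmark --help`:

```bash
# Minimal sketch: evaluate a model on a pre-registered open-benchmark.
# IDs are placeholders; flag names may differ by client version.
llama-stack-client eval run-benchmark mmlu \
  --model-id meta-llama/Llama-3.3-70B-Instruct \
  --output-dir ./eval-results \
  --num-examples 10   # small sample for a quick smoke test
```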
@@ -74,7 +70,7 @@ evaluation results over there.
-## What's Next?
+#### What's Next?
 - Check out our Colab notebook of working examples of running benchmark evaluations [here](https://colab.research.google.com/github/meta-llama/llama-stack/blob/main/docs/notebooks/Llama_Stack_Benchmark_Evals.ipynb#scrollTo=mxLCsP4MvFqP).
 - Check out our [Building Applications - Evaluation](../building_applications/evals.md) guide for more details on how to use the Evaluation APIs to evaluate your applications.
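For programmatic runs, a minimal sketch of driving the Eval API from Python; `run_eval` and the `benchmark_config` shape below follow recent llama-stack-client releases and should be treated as assumptions:

```python
# Minimal sketch: start an evaluation job over a registered benchmark.
# The benchmark_config structure is an assumption from recent client
# releases; confirm field names against your installed version.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

job = client.eval.run_eval(
    benchmark_id="mmlu",  # pre-registered open-benchmark
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "meta-llama/Llama-3.3-70B-Instruct",
            "sampling_params": {"temperature": 0.0, "max_tokens": 512},
        },
    },
)
print(job.job_id)  # poll with client.eval.jobs.status(...) until complete
```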