mirror of
https://github.com/meta-llama/llama-stack.git
synced 2025-12-28 03:01:59 +00:00
Merge branch 'main' into feat/litellm_sambanova_usage
This commit is contained in:
commit
5bd1bd30e2
76 changed files with 3534 additions and 2843 deletions
|
|
@ -6,7 +6,7 @@ This guide will walk you through the process of adding a new API provider to Lla
|
|||
- Begin by reviewing the [core concepts](../concepts/index.md) of Llama Stack and choose the API your provider belongs to (Inference, Safety, VectorIO, etc.)
|
||||
- Determine the provider type ({repopath}`Remote::llama_stack/providers/remote` or {repopath}`Inline::llama_stack/providers/inline`). Remote providers make requests to external services, while inline providers execute implementation locally.
|
||||
- Add your provider to the appropriate {repopath}`Registry::llama_stack/providers/registry/`. Specify pip dependencies necessary.
|
||||
- Update any distribution {repopath}`Templates::llama_stack/templates/` build.yaml and run.yaml files if they should include your provider by default. Run {repopath}`llama_stack/scripts/distro_codegen.py` if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.
|
||||
- Update any distribution {repopath}`Templates::llama_stack/templates/` build.yaml and run.yaml files if they should include your provider by default. Run {repopath}`./scripts/distro_codegen.py` if necessary. Note that `distro_codegen.py` will fail if the new provider causes any distribution template to attempt to import provider-specific dependencies. This usually means the distribution's `get_distribution_template()` code path should only import any necessary Config or model alias definitions from each provider and not the provider's actual implementation.
|
||||
|
||||
|
||||
Here are some example PRs to help you get started:
|
||||
|
|
|
|||
|
|
@ -185,8 +185,12 @@ llama stack build --config llama_stack/templates/ollama/build.yaml
|
|||
:::
|
||||
|
||||
:::{tab-item} Building Container
|
||||
> [!TIP]
|
||||
> Podman is supported as an alternative to Docker. Set `CONTAINER_BINARY` to `podman` in your environment to use Podman.
|
||||
|
||||
```{admonition} Podman Alternative
|
||||
:class: tip
|
||||
|
||||
Podman is supported as an alternative to Docker. Set `CONTAINER_BINARY` to `podman` in your environment to use Podman.
|
||||
```
|
||||
|
||||
To build a container image, you may start off from a template and use the `--image-type container` flag to specify `container` as the build image type.
|
||||
|
||||
|
|
|
|||
|
|
@ -6,13 +6,13 @@ The `llamastack/distribution-nvidia` distribution consists of the following prov
|
|||
| API | Provider(s) |
|
||||
|-----|-------------|
|
||||
| agents | `inline::meta-reference` |
|
||||
| datasetio | `remote::huggingface`, `inline::localfs` |
|
||||
| datasetio | `inline::localfs` |
|
||||
| eval | `inline::meta-reference` |
|
||||
| inference | `remote::nvidia` |
|
||||
| safety | `inline::llama-guard` |
|
||||
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
|
||||
| safety | `remote::nvidia` |
|
||||
| scoring | `inline::basic` |
|
||||
| telemetry | `inline::meta-reference` |
|
||||
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::code-interpreter`, `inline::rag-runtime`, `remote::model-context-protocol` |
|
||||
| tool_runtime | `inline::rag-runtime` |
|
||||
| vector_io | `inline::faiss` |
|
||||
|
||||
|
||||
|
|
@ -20,8 +20,10 @@ The `llamastack/distribution-nvidia` distribution consists of the following prov
|
|||
|
||||
The following environment variables can be configured:
|
||||
|
||||
- `LLAMASTACK_PORT`: Port for the Llama Stack distribution server (default: `5001`)
|
||||
- `NVIDIA_API_KEY`: NVIDIA API Key (default: ``)
|
||||
- `GUARDRAILS_SERVICE_URL`: URL for the NeMo Guardrails Service (default: `http://0.0.0.0:7331`)
|
||||
- `INFERENCE_MODEL`: Inference model (default: `Llama3.1-8B-Instruct`)
|
||||
- `SAFETY_MODEL`: Name of the model to use for safety (default: `meta/llama-3.1-8b-instruct`)
|
||||
|
||||
### Models
|
||||
|
||||
|
|
|
|||
|
|
@ -92,6 +92,8 @@ Interactive pages for users to play with and explore Llama Stack API capabilitie
|
|||
|
||||
## Starting the Llama Stack Playground
|
||||
|
||||
### Llama CLI
|
||||
|
||||
To start the Llama Stack Playground, run the following commands:
|
||||
|
||||
1. Start up the Llama Stack API server
|
||||
|
|
@ -107,3 +109,28 @@ cd llama_stack/distribution/ui
|
|||
pip install -r requirements.txt
|
||||
streamlit run app.py
|
||||
```
|
||||
|
||||
### Docker
|
||||
|
||||
Playground can also be started in a docker image:
|
||||
|
||||
```sh
|
||||
export LLAMA_STACK_URL=http://localhost:11434
|
||||
|
||||
docker run \
|
||||
-p 8501:8501 \
|
||||
-e LLAMA_STACK_ENDPOINT=$LLAMA_STACK_URL \
|
||||
quay.io/jland/llama-stack-playground
|
||||
```
|
||||
|
||||
## Configurable Environment Variables
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Environment Variable | Description | Default Value |
|
||||
|----------------------------|------------------------------------|---------------------------|
|
||||
| LLAMA_STACK_ENDPOINT | The endpoint for the Llama Stack | http://localhost:8321 |
|
||||
| FIREWORKS_API_KEY | API key for Fireworks provider | (empty string) |
|
||||
| TOGETHER_API_KEY | API key for Together provider | (empty string) |
|
||||
| SAMBANOVA_API_KEY | API key for SambaNova provider | (empty string) |
|
||||
| OPENAI_API_KEY | API key for OpenAI provider | (empty string) |
|
||||
|
|
|
|||
|
|
@ -114,23 +114,17 @@ pprint(response)
|
|||
simpleqa_dataset_id = "huggingface::simpleqa"
|
||||
|
||||
_ = client.datasets.register(
|
||||
purpose="eval/messages-answer",
|
||||
source={
|
||||
"type": "uri",
|
||||
"uri": "huggingface://datasets/llamastack/simpleqa?split=train",
|
||||
},
|
||||
dataset_id=simpleqa_dataset_id,
|
||||
provider_id="huggingface",
|
||||
url={"uri": "https://huggingface.co/datasets/llamastack/simpleqa"},
|
||||
metadata={
|
||||
"path": "llamastack/simpleqa",
|
||||
"split": "train",
|
||||
},
|
||||
dataset_schema={
|
||||
"input_query": {"type": "string"},
|
||||
"expected_answer": {"type": "string"},
|
||||
"chat_completion_input": {"type": "chat_completion_input"},
|
||||
},
|
||||
)
|
||||
|
||||
eval_rows = client.datasetio.get_rows_paginated(
|
||||
eval_rows = client.datasets.iterrows(
|
||||
dataset_id=simpleqa_dataset_id,
|
||||
rows_in_page=5,
|
||||
limit=5,
|
||||
)
|
||||
```
|
||||
|
||||
|
|
@ -143,7 +137,7 @@ client.benchmarks.register(
|
|||
|
||||
response = client.eval.evaluate_rows(
|
||||
benchmark_id="meta-reference::simpleqa",
|
||||
input_rows=eval_rows.rows,
|
||||
input_rows=eval_rows.data,
|
||||
scoring_functions=["llm-as-judge::405b-simpleqa"],
|
||||
benchmark_config={
|
||||
"eval_candidate": {
|
||||
|
|
@ -191,7 +185,7 @@ agent_config = {
|
|||
|
||||
response = client.eval.evaluate_rows(
|
||||
benchmark_id="meta-reference::simpleqa",
|
||||
input_rows=eval_rows.rows,
|
||||
input_rows=eval_rows.data,
|
||||
scoring_functions=["llm-as-judge::405b-simpleqa"],
|
||||
benchmark_config={
|
||||
"eval_candidate": {
|
||||
|
|
|
|||
|
|
@ -6,17 +6,32 @@ The `llama-stack-client` CLI allows you to query information about the distribut
|
|||
|
||||
### `llama-stack-client`
|
||||
```bash
|
||||
llama-stack-client -h
|
||||
llama-stack-client
|
||||
Usage: llama-stack-client [OPTIONS] COMMAND [ARGS]...
|
||||
|
||||
usage: llama-stack-client [-h] {models,memory_banks,shields} ...
|
||||
Welcome to the LlamaStackClient CLI
|
||||
|
||||
Welcome to the LlamaStackClient CLI
|
||||
Options:
|
||||
--version Show the version and exit.
|
||||
--endpoint TEXT Llama Stack distribution endpoint
|
||||
--api-key TEXT Llama Stack distribution API key
|
||||
--config TEXT Path to config file
|
||||
--help Show this message and exit.
|
||||
|
||||
options:
|
||||
-h, --help show this help message and exit
|
||||
|
||||
subcommands:
|
||||
{models,memory_banks,shields}
|
||||
Commands:
|
||||
configure Configure Llama Stack Client CLI.
|
||||
datasets Manage datasets.
|
||||
eval Run evaluation tasks.
|
||||
eval_tasks Manage evaluation tasks.
|
||||
inference Inference (chat).
|
||||
inspect Inspect server configuration.
|
||||
models Manage GenAI models.
|
||||
post_training Post-training.
|
||||
providers Manage API providers.
|
||||
scoring_functions Manage scoring functions.
|
||||
shields Manage safety shield services.
|
||||
toolgroups Manage available tool groups.
|
||||
vector_dbs Manage vector databases.
|
||||
```
|
||||
|
||||
### `llama-stack-client configure`
|
||||
|
|
@ -127,11 +142,11 @@ llama-stack-client vector_dbs list
|
|||
llama-stack-client vector_dbs register <vector-db-id> [--provider-id <provider-id>] [--provider-vector-db-id <provider-vector-db-id>] [--embedding-model <embedding-model>] [--embedding-dimension <embedding-dimension>]
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--provider-id`: Optional. Provider ID for the vector db
|
||||
- `--provider-vector-db-id`: Optional. Provider's vector db ID
|
||||
- `--embedding-model`: Optional. Embedding model to use. Default: "all-MiniLM-L6-v2"
|
||||
- `--embedding-dimension`: Optional. Dimension of embeddings. Default: 384
|
||||
Optional arguments:
|
||||
- `--provider-id`: Provider ID for the vector db
|
||||
- `--provider-vector-db-id`: Provider's vector db ID
|
||||
- `--embedding-model`: Embedding model to use. Default: "all-MiniLM-L6-v2"
|
||||
- `--embedding-dimension`: Dimension of embeddings. Default: 384
|
||||
|
||||
### `llama-stack-client vector_dbs unregister`
|
||||
```bash
|
||||
|
|
@ -157,11 +172,13 @@ llama-stack-client shields list
|
|||
llama-stack-client shields register --shield-id <shield-id> [--provider-id <provider-id>] [--provider-shield-id <provider-shield-id>] [--params <params>]
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--shield-id`: Required. ID of the shield
|
||||
- `--provider-id`: Optional. Provider ID for the shield
|
||||
- `--provider-shield-id`: Optional. Provider's shield ID
|
||||
- `--params`: Optional. JSON configuration parameters for the shield
|
||||
Required arguments:
|
||||
- `--shield-id`: ID of the shield
|
||||
|
||||
Optional arguments:
|
||||
- `--provider-id`: Provider ID for the shield
|
||||
- `--provider-shield-id`: Provider's shield ID
|
||||
- `--params`: JSON configuration parameters for the shield
|
||||
|
||||
## Eval Task Management
|
||||
|
||||
|
|
@ -175,13 +192,15 @@ llama-stack-client benchmarks list
|
|||
llama-stack-client benchmarks register --eval-task-id <eval-task-id> --dataset-id <dataset-id> --scoring-functions <function1> [<function2> ...] [--provider-id <provider-id>] [--provider-eval-task-id <provider-eval-task-id>] [--metadata <metadata>]
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--eval-task-id`: Required. ID of the eval task
|
||||
- `--dataset-id`: Required. ID of the dataset to evaluate
|
||||
- `--scoring-functions`: Required. One or more scoring functions to use for evaluation
|
||||
- `--provider-id`: Optional. Provider ID for the eval task
|
||||
- `--provider-eval-task-id`: Optional. Provider's eval task ID
|
||||
- `--metadata`: Optional. Metadata for the eval task in JSON format
|
||||
Required arguments:
|
||||
- `--eval-task-id`: ID of the eval task
|
||||
- `--dataset-id`: ID of the dataset to evaluate
|
||||
- `--scoring-functions`: One or more scoring functions to use for evaluation
|
||||
|
||||
Optional arguments:
|
||||
- `--provider-id`: Provider ID for the eval task
|
||||
- `--provider-eval-task-id`: Provider's eval task ID
|
||||
- `--metadata`: Metadata for the eval task in JSON format
|
||||
|
||||
## Eval execution
|
||||
### `llama-stack-client eval run-benchmark`
|
||||
|
|
@ -189,11 +208,13 @@ Options:
|
|||
llama-stack-client eval run-benchmark <eval-task-id1> [<eval-task-id2> ...] --eval-task-config <config-file> --output-dir <output-dir> [--num-examples <num>] [--visualize]
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--eval-task-config`: Required. Path to the eval task config file in JSON format
|
||||
- `--output-dir`: Required. Path to the directory where evaluation results will be saved
|
||||
- `--num-examples`: Optional. Number of examples to evaluate (useful for debugging)
|
||||
- `--visualize`: Optional flag. If set, visualizes evaluation results after completion
|
||||
Required arguments:
|
||||
- `--eval-task-config`: Path to the eval task config file in JSON format
|
||||
- `--output-dir`: Path to the directory where evaluation results will be saved
|
||||
|
||||
Optional arguments:
|
||||
- `--num-examples`: Number of examples to evaluate (useful for debugging)
|
||||
- `--visualize`: If set, visualizes evaluation results after completion
|
||||
|
||||
Example benchmark_config.json:
|
||||
```json
|
||||
|
|
@ -214,11 +235,13 @@ Example benchmark_config.json:
|
|||
llama-stack-client eval run-scoring <eval-task-id> --eval-task-config <config-file> --output-dir <output-dir> [--num-examples <num>] [--visualize]
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--eval-task-config`: Required. Path to the eval task config file in JSON format
|
||||
- `--output-dir`: Required. Path to the directory where scoring results will be saved
|
||||
- `--num-examples`: Optional. Number of examples to evaluate (useful for debugging)
|
||||
- `--visualize`: Optional flag. If set, visualizes scoring results after completion
|
||||
Required arguments:
|
||||
- `--eval-task-config`: Path to the eval task config file in JSON format
|
||||
- `--output-dir`: Path to the directory where scoring results will be saved
|
||||
|
||||
Optional arguments:
|
||||
- `--num-examples`: Number of examples to evaluate (useful for debugging)
|
||||
- `--visualize`: If set, visualizes scoring results after completion
|
||||
|
||||
## Tool Group Management
|
||||
|
||||
|
|
@ -230,11 +253,11 @@ llama-stack-client toolgroups list
|
|||
+---------------------------+------------------+------+---------------+
|
||||
| identifier | provider_id | args | mcp_endpoint |
|
||||
+===========================+==================+======+===============+
|
||||
| builtin::code_interpreter | code-interpreter | None | None |
|
||||
| builtin::code_interpreter | code-interpreter | None | None |
|
||||
+---------------------------+------------------+------+---------------+
|
||||
| builtin::rag | rag-runtime | None | None |
|
||||
| builtin::rag | rag-runtime | None | None |
|
||||
+---------------------------+------------------+------+---------------+
|
||||
| builtin::websearch | tavily-search | None | None |
|
||||
| builtin::websearch | tavily-search | None | None |
|
||||
+---------------------------+------------------+------+---------------+
|
||||
```
|
||||
|
||||
|
|
@ -250,11 +273,11 @@ Shows detailed information about a specific toolgroup. If the toolgroup is not f
|
|||
llama-stack-client toolgroups register <toolgroup_id> [--provider-id <provider-id>] [--provider-toolgroup-id <provider-toolgroup-id>] [--mcp-config <mcp-config>] [--args <args>]
|
||||
```
|
||||
|
||||
Options:
|
||||
- `--provider-id`: Optional. Provider ID for the toolgroup
|
||||
- `--provider-toolgroup-id`: Optional. Provider's toolgroup ID
|
||||
- `--mcp-config`: Optional. JSON configuration for the MCP endpoint
|
||||
- `--args`: Optional. JSON arguments for the toolgroup
|
||||
Optional arguments:
|
||||
- `--provider-id`: Provider ID for the toolgroup
|
||||
- `--provider-toolgroup-id`: Provider's toolgroup ID
|
||||
- `--mcp-config`: JSON configuration for the MCP endpoint
|
||||
- `--args`: JSON arguments for the toolgroup
|
||||
|
||||
### `llama-stack-client toolgroups unregister`
|
||||
```bash
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue