diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index c8849c95e..1623d1829 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -2,4 +2,4 @@
# These owners will be the default owners for everything in
# the repo. Unless a later match takes precedence,
-* @ashwinb @yanxi0830 @hardikjshah @dltn @raghotham @dineshyv
+* @ashwinb @yanxi0830 @hardikjshah @dltn @raghotham @dineshyv @vladimirivic
diff --git a/README.md b/README.md
index 16ca48ecb..a1369d56a 100644
--- a/README.md
+++ b/README.md
@@ -127,7 +127,7 @@ You have two ways to install this repository:
conda activate stack
cd llama-stack
- $CONDA_PREFIX/bin/pip install -e .
+ pip install -e .
```
## Documentation
diff --git a/docs/getting_started.ipynb b/docs/getting_started.ipynb
new file mode 100644
index 000000000..fa527f1a0
--- /dev/null
+++ b/docs/getting_started.ipynb
@@ -0,0 +1,4636 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "c1e7571c",
+ "metadata": {
+ "id": "c1e7571c"
+ },
+ "source": [
+ "[](https://colab.research.google.com/drive/1F2ksmkoGQPa4pzRjMOE6BXWeOxWFIW6n?usp=sharing)\n",
+ "\n",
+ "# Llama Stack - Building AI Applications\n",
+ "\n",
+ "\n",
+ "\n",
+ "[Llama Stack](https://github.com/meta-llama/llama-stack) defines and standardizes the set of core building blocks needed to bring generative AI applications to market. These building blocks are presented in the form of interoperable APIs with a broad set of Service Providers providing their implementations.\n",
+ "\n",
+ "Read more about the project: https://llama-stack.readthedocs.io/en/latest/index.html\n",
+ "\n",
+ "In this guide, we will showcase how you can build LLM-powered agentic applications using Llama Stack.\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4CV1Q19BDMVw",
+ "metadata": {
+ "id": "4CV1Q19BDMVw"
+ },
+ "source": [
+ "## 1. Getting started with Llama Stack"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "K4AvfUAJZOeS",
+ "metadata": {
+ "id": "K4AvfUAJZOeS"
+ },
+ "source": [
+ "### 1.1. Create TogetherAI account\n",
+ "\n",
+ "\n",
+ "In order to run inference for the llama models, you will need to use an inference provider. Llama stack supports a number of inference [providers](https://github.com/meta-llama/llama-stack/tree/main/llama_stack/providers/remote/inference).\n",
+ "\n",
+ "\n",
+ "In this showcase, we will use [together.ai](https://www.together.ai/) as the inference provider. So, you would first get an API key from Together if you dont have one already.\n",
+ "\n",
+ "Steps [here](https://docs.google.com/document/d/1Vg998IjRW_uujAPnHdQ9jQWvtmkZFt74FldW2MblxPY/edit?usp=sharing).\n",
+ "\n",
+ "You can also use Fireworks.ai or even Ollama if you would like to.\n",
+ "\n",
+ "\n",
+ "\n",
+ "> **Note:** Set the API Key in the Secrets of this notebook\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "oDUB7M_qe-Gs",
+ "metadata": {
+ "id": "oDUB7M_qe-Gs"
+ },
+ "source": [
+ "### 1.2. Install Llama Stack\n",
+ "\n",
+ "We will now start with installing the [llama-stack pypi package](https://pypi.org/project/llama-stack).\n",
+ "\n",
+ "In addition, we will install [bubblewrap](https://github.com/containers/bubblewrap), a low level light-weight container framework that runs in the user namespace. We will use it to execute code generated by Llama in one of the examples."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "id": "J2kGed0R5PSf",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "collapsed": true,
+ "id": "J2kGed0R5PSf",
+ "outputId": "7d543c6f-623d-4911-b9a7-4ed24d5b82f2"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Reading package lists... Done\n",
+ "Building dependency tree... Done\n",
+ "Reading state information... Done\n",
+ "bubblewrap is already the newest version (0.6.1-1ubuntu0.1).\n",
+ "0 upgraded, 0 newly installed, 0 to remove and 49 not upgraded.\n",
+ "Requirement already satisfied: llama-stack in /usr/local/lib/python3.10/dist-packages (0.0.61)\n",
+ "Requirement already satisfied: blobfile in /usr/local/lib/python3.10/dist-packages (from llama-stack) (3.0.0)\n",
+ "Requirement already satisfied: fire in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.7.0)\n",
+ "Requirement already satisfied: httpx in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.28.1)\n",
+ "Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.26.5)\n",
+ "Requirement already satisfied: llama-models>=0.0.61 in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.0.61)\n",
+ "Requirement already satisfied: llama-stack-client>=0.0.61 in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.0.61)\n",
+ "Requirement already satisfied: prompt-toolkit in /usr/local/lib/python3.10/dist-packages (from llama-stack) (3.0.48)\n",
+ "Requirement already satisfied: python-dotenv in /usr/local/lib/python3.10/dist-packages (from llama-stack) (1.0.1)\n",
+ "Requirement already satisfied: pydantic>=2 in /usr/local/lib/python3.10/dist-packages (from llama-stack) (2.10.3)\n",
+ "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from llama-stack) (2.32.3)\n",
+ "Requirement already satisfied: rich in /usr/local/lib/python3.10/dist-packages (from llama-stack) (13.9.4)\n",
+ "Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from llama-stack) (75.1.0)\n",
+ "Requirement already satisfied: termcolor in /usr/local/lib/python3.10/dist-packages (from llama-stack) (2.5.0)\n",
+ "Requirement already satisfied: PyYAML in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (6.0.2)\n",
+ "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (3.1.4)\n",
+ "Requirement already satisfied: tiktoken in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (0.8.0)\n",
+ "Requirement already satisfied: Pillow in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (10.4.0)\n",
+ "Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (3.7.1)\n",
+ "Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (8.1.7)\n",
+ "Requirement already satisfied: distro<2,>=1.7.0 in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (1.9.0)\n",
+ "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (2.2.2)\n",
+ "Requirement already satisfied: pyaml in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (24.12.1)\n",
+ "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (1.3.1)\n",
+ "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (4.66.6)\n",
+ "Requirement already satisfied: typing-extensions<5,>=4.7 in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (4.12.2)\n",
+ "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx->llama-stack) (2024.8.30)\n",
+ "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx->llama-stack) (1.0.7)\n",
+ "Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx->llama-stack) (3.10)\n",
+ "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx->llama-stack) (0.14.0)\n",
+ "Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->llama-stack) (0.7.0)\n",
+ "Requirement already satisfied: pydantic-core==2.27.1 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->llama-stack) (2.27.1)\n",
+ "Requirement already satisfied: pycryptodomex>=3.8 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (3.21.0)\n",
+ "Requirement already satisfied: urllib3<3,>=1.25.3 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (2.2.3)\n",
+ "Requirement already satisfied: lxml>=4.9 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (5.3.0)\n",
+ "Requirement already satisfied: filelock>=3.0 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (3.16.1)\n",
+ "Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->llama-stack) (2024.9.0)\n",
+ "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->llama-stack) (24.2)\n",
+ "Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit->llama-stack) (0.2.13)\n",
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->llama-stack) (3.4.0)\n",
+ "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich->llama-stack) (3.0.0)\n",
+ "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich->llama-stack) (2.18.0)\n",
+ "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->llama-stack-client>=0.0.61->llama-stack) (1.2.2)\n",
+ "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich->llama-stack) (0.1.2)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->llama-models>=0.0.61->llama-stack) (3.0.2)\n",
+ "Requirement already satisfied: numpy>=1.22.4 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (1.26.4)\n",
+ "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (2.8.2)\n",
+ "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (2024.2)\n",
+ "Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (2024.2)\n",
+ "Requirement already satisfied: regex>=2022.1.18 in /usr/local/lib/python3.10/dist-packages (from tiktoken->llama-models>=0.0.61->llama-stack) (2024.9.11)\n",
+ "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->llama-stack-client>=0.0.61->llama-stack) (1.17.0)\n"
+ ]
+ }
+ ],
+ "source": [
+ "!apt-get install -y bubblewrap\n",
+ "!pip install -U llama-stack"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "414301dc",
+ "metadata": {
+ "id": "414301dc"
+ },
+ "source": [
+ "### 1.3. Configure Llama Stack for Together\n",
+ "\n",
+ "\n",
+ "Llama Stack is architected as a collection of lego blocks which can be assembled as needed.\n",
+ "\n",
+ "\n",
+ "Typically, llama stack is available as a server with an endpoint that you can hit. We call this endpoint a [Distribution](https://llama-stack.readthedocs.io/en/latest/concepts/index.html#distributions). Partners like Together and Fireworks offer their own Llama Stack Distribution endpoints.\n",
+ "\n",
+ "In this showcase, we are going to use llama stack inline as a library. So, given a particular set of providers, we must first package up the right set of dependencies. We have a template to use Together as an inference provider and [faiss](https://ai.meta.com/tools/faiss/) for memory/RAG.\n",
+ "\n",
+ "We will run `llama stack build` to deploy all dependencies."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 43,
+ "id": "HaepEZXCDgif",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "collapsed": true,
+ "id": "HaepEZXCDgif",
+ "outputId": "9c268d26-7444-4741-f14d-3911eea8e4eb"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Requirement already satisfied: llama-stack in /usr/local/lib/python3.10/dist-packages (0.0.61)\r\n",
+ "Requirement already satisfied: blobfile in /usr/local/lib/python3.10/dist-packages (from llama-stack) (3.0.0)\r\n",
+ "Requirement already satisfied: fire in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.7.0)\r\n",
+ "Requirement already satisfied: httpx in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.28.1)\r\n",
+ "Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.26.5)\r\n",
+ "Requirement already satisfied: llama-models>=0.0.61 in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.0.61)\r\n",
+ "Requirement already satisfied: llama-stack-client>=0.0.61 in /usr/local/lib/python3.10/dist-packages (from llama-stack) (0.0.61)\r\n",
+ "Requirement already satisfied: prompt-toolkit in /usr/local/lib/python3.10/dist-packages (from llama-stack) (3.0.48)\r\n",
+ "Requirement already satisfied: python-dotenv in /usr/local/lib/python3.10/dist-packages (from llama-stack) (1.0.1)\r\n",
+ "Requirement already satisfied: pydantic>=2 in /usr/local/lib/python3.10/dist-packages (from llama-stack) (2.10.3)\r\n",
+ "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from llama-stack) (2.32.3)\r\n",
+ "Requirement already satisfied: rich in /usr/local/lib/python3.10/dist-packages (from llama-stack) (13.9.4)\r\n",
+ "Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from llama-stack) (75.1.0)\r\n",
+ "Requirement already satisfied: termcolor in /usr/local/lib/python3.10/dist-packages (from llama-stack) (2.5.0)\r\n",
+ "Requirement already satisfied: PyYAML in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (6.0.2)\r\n",
+ "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (3.1.4)\r\n",
+ "Requirement already satisfied: tiktoken in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (0.8.0)\r\n",
+ "Requirement already satisfied: Pillow in /usr/local/lib/python3.10/dist-packages (from llama-models>=0.0.61->llama-stack) (10.4.0)\r\n",
+ "Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (3.7.1)\r\n",
+ "Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (8.1.7)\r\n",
+ "Requirement already satisfied: distro<2,>=1.7.0 in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (1.9.0)\r\n",
+ "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (2.2.2)\r\n",
+ "Requirement already satisfied: pyaml in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (24.12.1)\r\n",
+ "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (1.3.1)\r\n",
+ "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (4.66.6)\r\n",
+ "Requirement already satisfied: typing-extensions<5,>=4.7 in /usr/local/lib/python3.10/dist-packages (from llama-stack-client>=0.0.61->llama-stack) (4.12.2)\r\n",
+ "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx->llama-stack) (2024.8.30)\r\n",
+ "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx->llama-stack) (1.0.7)\r\n",
+ "Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx->llama-stack) (3.10)\r\n",
+ "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx->llama-stack) (0.14.0)\r\n",
+ "Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->llama-stack) (0.7.0)\r\n",
+ "Requirement already satisfied: pydantic-core==2.27.1 in /usr/local/lib/python3.10/dist-packages (from pydantic>=2->llama-stack) (2.27.1)\r\n",
+ "Requirement already satisfied: pycryptodomex>=3.8 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (3.21.0)\r\n",
+ "Requirement already satisfied: urllib3<3,>=1.25.3 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (2.2.3)\r\n",
+ "Requirement already satisfied: lxml>=4.9 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (5.3.0)\r\n",
+ "Requirement already satisfied: filelock>=3.0 in /usr/local/lib/python3.10/dist-packages (from blobfile->llama-stack) (3.16.1)\n",
+ "Requirement already satisfied: fsspec>=2023.5.0 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->llama-stack) (2024.9.0)\n",
+ "Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->llama-stack) (24.2)\n",
+ "Requirement already satisfied: wcwidth in /usr/local/lib/python3.10/dist-packages (from prompt-toolkit->llama-stack) (0.2.13)\n",
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->llama-stack) (3.4.0)\n",
+ "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich->llama-stack) (3.0.0)\n",
+ "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich->llama-stack) (2.18.0)\n",
+ "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->llama-stack-client>=0.0.61->llama-stack) (1.2.2)\n",
+ "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich->llama-stack) (0.1.2)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->llama-models>=0.0.61->llama-stack) (3.0.2)\n",
+ "Requirement already satisfied: numpy>=1.22.4 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (1.26.4)\n",
+ "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (2.8.2)\n",
+ "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (2024.2)\n",
+ "Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.10/dist-packages (from pandas->llama-stack-client>=0.0.61->llama-stack) (2024.2)\n",
+ "Requirement already satisfied: regex>=2022.1.18 in /usr/local/lib/python3.10/dist-packages (from tiktoken->llama-models>=0.0.61->llama-stack) (2024.9.11)\n",
+ "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->llama-stack-client>=0.0.61->llama-stack) (1.17.0)\n",
+ "Installing pip dependencies\n",
+ "Requirement already satisfied: pillow in /usr/local/lib/python3.10/dist-packages (10.4.0)\n",
+ "Requirement already satisfied: transformers in /usr/local/lib/python3.10/dist-packages (4.46.3)\n",
+ "Requirement already satisfied: psycopg2-binary in /usr/local/lib/python3.10/dist-packages (2.9.10)\n",
+ "Requirement already satisfied: aiosqlite in /usr/local/lib/python3.10/dist-packages (0.20.0)\n",
+ "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (4.66.6)\n",
+ "Requirement already satisfied: pypdf in /usr/local/lib/python3.10/dist-packages (5.1.0)\n",
+ "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (1.26.4)\n",
+ "Requirement already satisfied: scikit-learn in /usr/local/lib/python3.10/dist-packages (1.5.2)\n",
+ "Requirement already satisfied: redis in /usr/local/lib/python3.10/dist-packages (5.2.1)\n",
+ "Requirement already satisfied: opentelemetry-sdk in /usr/local/lib/python3.10/dist-packages (1.28.2)\n",
+ "Requirement already satisfied: sentencepiece in /usr/local/lib/python3.10/dist-packages (0.2.0)\n",
+ "Requirement already satisfied: blobfile in /usr/local/lib/python3.10/dist-packages (3.0.0)\n",
+ "Requirement already satisfied: together in /usr/local/lib/python3.10/dist-packages (1.3.5)\n",
+ "Requirement already satisfied: openai in /usr/local/lib/python3.10/dist-packages (1.54.5)\n",
+ "Requirement already satisfied: faiss-cpu in /usr/local/lib/python3.10/dist-packages (1.9.0.post1)\n",
+ "Requirement already satisfied: autoevals in /usr/local/lib/python3.10/dist-packages (0.0.110)\n",
+ "Requirement already satisfied: chardet in /usr/local/lib/python3.10/dist-packages (5.2.0)\n",
+ "Requirement already satisfied: nltk in /usr/local/lib/python3.10/dist-packages (3.9.1)\n",
+ "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (2.2.2)\n",
+ "Requirement already satisfied: opentelemetry-exporter-otlp-proto-http in /usr/local/lib/python3.10/dist-packages (1.28.2)\n",
+ "Requirement already satisfied: datasets in /usr/local/lib/python3.10/dist-packages (3.2.0)\n",
+ "Requirement already satisfied: matplotlib in /usr/local/lib/python3.10/dist-packages (3.8.0)\n",
+ "Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (1.13.1)\n",
+ "Requirement already satisfied: chromadb-client in /usr/local/lib/python3.10/dist-packages (0.5.23)\n",
+ "Requirement already satisfied: fastapi in /usr/local/lib/python3.10/dist-packages (0.115.6)\n",
+ "Requirement already satisfied: fire in /usr/local/lib/python3.10/dist-packages (0.7.0)\n",
+ "Requirement already satisfied: httpx in /usr/local/lib/python3.10/dist-packages (0.28.1)\n",
+ "Requirement already satisfied: uvicorn in /usr/local/lib/python3.10/dist-packages (0.32.1)\n",
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from transformers) (3.16.1)\n",
+ "Requirement already satisfied: huggingface-hub<1.0,>=0.23.2 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.26.5)\n",
+ "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from transformers) (24.2)\n",
+ "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (6.0.2)\n",
+ "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers) (2024.9.11)\n",
+ "Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers) (2.32.3)\n",
+ "Requirement already satisfied: tokenizers<0.21,>=0.20 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.20.3)\n",
+ "Requirement already satisfied: safetensors>=0.4.1 in /usr/local/lib/python3.10/dist-packages (from transformers) (0.4.5)\n",
+ "Requirement already satisfied: typing_extensions>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiosqlite) (4.12.2)\n",
+ "Requirement already satisfied: joblib>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn) (1.4.2)\n",
+ "Requirement already satisfied: threadpoolctl>=3.1.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn) (3.5.0)\n",
+ "Requirement already satisfied: async-timeout>=4.0.3 in /usr/local/lib/python3.10/dist-packages (from redis) (4.0.3)\n",
+ "Requirement already satisfied: opentelemetry-api==1.28.2 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-sdk) (1.28.2)\n",
+ "Requirement already satisfied: opentelemetry-semantic-conventions==0.49b2 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-sdk) (0.49b2)\n",
+ "Requirement already satisfied: deprecated>=1.2.6 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-api==1.28.2->opentelemetry-sdk) (1.2.15)\n",
+ "Requirement already satisfied: importlib-metadata<=8.5.0,>=6.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-api==1.28.2->opentelemetry-sdk) (8.5.0)\n",
+ "Requirement already satisfied: pycryptodomex>=3.8 in /usr/local/lib/python3.10/dist-packages (from blobfile) (3.21.0)\n",
+ "Requirement already satisfied: urllib3<3,>=1.25.3 in /usr/local/lib/python3.10/dist-packages (from blobfile) (2.2.3)\n",
+ "Requirement already satisfied: lxml>=4.9 in /usr/local/lib/python3.10/dist-packages (from blobfile) (5.3.0)\n",
+ "Requirement already satisfied: aiohttp<4.0.0,>=3.9.3 in /usr/local/lib/python3.10/dist-packages (from together) (3.11.10)\n",
+ "Requirement already satisfied: click<9.0.0,>=8.1.7 in /usr/local/lib/python3.10/dist-packages (from together) (8.1.7)\n",
+ "Requirement already satisfied: eval-type-backport<0.3.0,>=0.1.3 in /usr/local/lib/python3.10/dist-packages (from together) (0.2.0)\n",
+ "Requirement already satisfied: pyarrow>=10.0.1 in /usr/local/lib/python3.10/dist-packages (from together) (17.0.0)\n",
+ "Requirement already satisfied: pydantic<3.0.0,>=2.6.3 in /usr/local/lib/python3.10/dist-packages (from together) (2.10.3)\n",
+ "Requirement already satisfied: rich<14.0.0,>=13.8.1 in /usr/local/lib/python3.10/dist-packages (from together) (13.9.4)\n",
+ "Requirement already satisfied: tabulate<0.10.0,>=0.9.0 in /usr/local/lib/python3.10/dist-packages (from together) (0.9.0)\n",
+ "Requirement already satisfied: typer<0.14,>=0.9 in /usr/local/lib/python3.10/dist-packages (from together) (0.13.1)\n",
+ "Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from openai) (3.7.1)\n",
+ "Requirement already satisfied: distro<2,>=1.7.0 in /usr/local/lib/python3.10/dist-packages (from openai) (1.9.0)\n",
+ "Requirement already satisfied: jiter<1,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from openai) (0.8.2)\n",
+ "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from openai) (1.3.1)\n",
+ "Requirement already satisfied: chevron in /usr/local/lib/python3.10/dist-packages (from autoevals) (0.14.0)\n",
+ "Requirement already satisfied: levenshtein in /usr/local/lib/python3.10/dist-packages (from autoevals) (0.26.1)\n",
+ "Requirement already satisfied: braintrust_core==0.0.54 in /usr/local/lib/python3.10/dist-packages (from autoevals) (0.0.54)\n",
+ "Requirement already satisfied: jsonschema in /usr/local/lib/python3.10/dist-packages (from autoevals) (4.23.0)\n",
+ "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas) (2.8.2)\n",
+ "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas) (2024.2)\n",
+ "Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.10/dist-packages (from pandas) (2024.2)\n",
+ "Requirement already satisfied: googleapis-common-protos~=1.52 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-http) (1.66.0)\n",
+ "Requirement already satisfied: opentelemetry-exporter-otlp-proto-common==1.28.2 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-http) (1.28.2)\n",
+ "Requirement already satisfied: opentelemetry-proto==1.28.2 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-http) (1.28.2)\n",
+ "Requirement already satisfied: protobuf<6.0,>=5.0 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-proto==1.28.2->opentelemetry-exporter-otlp-proto-http) (5.29.1)\n",
+ "Requirement already satisfied: dill<0.3.9,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets) (0.3.8)\n",
+ "Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from datasets) (3.5.0)\n",
+ "Requirement already satisfied: multiprocess<0.70.17 in /usr/local/lib/python3.10/dist-packages (from datasets) (0.70.16)\n",
+ "Requirement already satisfied: fsspec<=2024.9.0,>=2023.1.0 in /usr/local/lib/python3.10/dist-packages (from fsspec[http]<=2024.9.0,>=2023.1.0->datasets) (2024.9.0)\n",
+ "Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib) (1.3.1)\n",
+ "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib) (0.12.1)\n",
+ "Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib) (4.55.3)\n",
+ "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib) (1.4.7)\n",
+ "Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib) (3.2.0)\n",
+ "Requirement already satisfied: opentelemetry-exporter-otlp-proto-grpc>=1.2.0 in /usr/local/lib/python3.10/dist-packages (from chromadb-client) (1.28.2)\n",
+ "Requirement already satisfied: overrides>=7.3.1 in /usr/local/lib/python3.10/dist-packages (from chromadb-client) (7.7.0)\n",
+ "Requirement already satisfied: posthog>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from chromadb-client) (3.7.4)\n",
+ "Requirement already satisfied: tenacity>=8.2.3 in /usr/local/lib/python3.10/dist-packages (from chromadb-client) (9.0.0)\n",
+ "Requirement already satisfied: orjson>=3.9.12 in /usr/local/lib/python3.10/dist-packages (from chromadb-client) (3.10.12)\n",
+ "Requirement already satisfied: starlette<0.42.0,>=0.40.0 in /usr/local/lib/python3.10/dist-packages (from fastapi) (0.41.3)\n",
+ "Requirement already satisfied: termcolor in /usr/local/lib/python3.10/dist-packages (from fire) (2.5.0)\n",
+ "Requirement already satisfied: certifi in /usr/local/lib/python3.10/dist-packages (from httpx) (2024.8.30)\n",
+ "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx) (1.0.7)\n",
+ "Requirement already satisfied: idna in /usr/local/lib/python3.10/dist-packages (from httpx) (3.10)\n",
+ "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx) (0.14.0)\n",
+ "Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (2.4.4)\n",
+ "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (1.3.1)\n",
+ "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (24.2.0)\n",
+ "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (1.5.0)\n",
+ "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (6.1.0)\n",
+ "Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (0.2.1)\n",
+ "Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.9.3->together) (1.18.3)\n",
+ "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai) (1.2.2)\n",
+ "Requirement already satisfied: wrapt<2,>=1.10 in /usr/local/lib/python3.10/dist-packages (from deprecated>=1.2.6->opentelemetry-api==1.28.2->opentelemetry-sdk) (1.17.0)\n",
+ "Requirement already satisfied: grpcio<2.0.0,>=1.63.2 in /usr/local/lib/python3.10/dist-packages (from opentelemetry-exporter-otlp-proto-grpc>=1.2.0->chromadb-client) (1.68.1)\n",
+ "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb-client) (1.17.0)\n",
+ "Requirement already satisfied: monotonic>=1.5 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb-client) (1.6)\n",
+ "Requirement already satisfied: backoff>=1.10.0 in /usr/local/lib/python3.10/dist-packages (from posthog>=2.4.0->chromadb-client) (2.2.1)\n",
+ "Requirement already satisfied: annotated-types>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3.0.0,>=2.6.3->together) (0.7.0)\n",
+ "Requirement already satisfied: pydantic-core==2.27.1 in /usr/local/lib/python3.10/dist-packages (from pydantic<3.0.0,>=2.6.3->together) (2.27.1)\n",
+ "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers) (3.4.0)\n",
+ "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich<14.0.0,>=13.8.1->together) (3.0.0)\n",
+ "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich<14.0.0,>=13.8.1->together) (2.18.0)\n",
+ "Requirement already satisfied: shellingham>=1.3.0 in /usr/local/lib/python3.10/dist-packages (from typer<0.14,>=0.9->together) (1.5.4)\n",
+ "Requirement already satisfied: jsonschema-specifications>=2023.03.6 in /usr/local/lib/python3.10/dist-packages (from jsonschema->autoevals) (2024.10.1)\n",
+ "Requirement already satisfied: referencing>=0.28.4 in /usr/local/lib/python3.10/dist-packages (from jsonschema->autoevals) (0.35.1)\n",
+ "Requirement already satisfied: rpds-py>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from jsonschema->autoevals) (0.22.3)\n",
+ "Requirement already satisfied: rapidfuzz<4.0.0,>=3.9.0 in /usr/local/lib/python3.10/dist-packages (from levenshtein->autoevals) (3.10.1)\n",
+ "Requirement already satisfied: zipp>=3.20 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata<=8.5.0,>=6.0->opentelemetry-api==1.28.2->opentelemetry-sdk) (3.21.0)\n",
+ "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich<14.0.0,>=13.8.1->together) (0.1.2)\n",
+ "sentence-transformers --no-deps\n",
+ "Requirement already satisfied: sentence-transformers in /usr/local/lib/python3.10/dist-packages (3.2.1)\n",
+ "torch --index-url https://download.pytorch.org/whl/cpu\n",
+ "Looking in indexes: https://download.pytorch.org/whl/cpu\n",
+ "Requirement already satisfied: torch in /usr/local/lib/python3.10/dist-packages (2.5.1+cu121)\n",
+ "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch) (3.16.1)\n",
+ "Requirement already satisfied: typing-extensions>=4.8.0 in /usr/local/lib/python3.10/dist-packages (from torch) (4.12.2)\n",
+ "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch) (3.4.2)\n",
+ "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch) (3.1.4)\n",
+ "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch) (2024.9.0)\n",
+ "Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.10/dist-packages (from torch) (1.13.1)\n",
+ "Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy==1.13.1->torch) (1.3.0)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch) (3.0.2)\n",
+ "\u001b[32mBuild Successful!\u001b[0m\n"
+ ]
+ }
+ ],
+ "source": [
+ "# This will build all the dependencies you will need\n",
+ "!llama stack build --template together --image-type venv"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "25b97dfe",
+ "metadata": {
+ "id": "25b97dfe"
+ },
+ "source": [
+ "### 1.4. Initialize Llama Stack\n",
+ "\n",
+ "Now that all dependencies have been installed, we can initialize llama stack. We will first set the `TOGETHER_API_KEY` environment variable\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 44,
+ "id": "E1UFuJC570Tk",
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 1000
+ },
+ "collapsed": true,
+ "id": "E1UFuJC570Tk",
+ "outputId": "bac7c9ec-ad49-4040-af43-8869f0afe5ac"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "INFO:llama_stack.distribution.resolver:Resolved 24 providers\n",
+ "INFO:llama_stack.distribution.resolver: inner-inference => together\n",
+ "INFO:llama_stack.distribution.resolver: inner-memory => faiss\n",
+ "INFO:llama_stack.distribution.resolver: models => __routing_table__\n",
+ "INFO:llama_stack.distribution.resolver: inference => __autorouted__\n",
+ "INFO:llama_stack.distribution.resolver: inner-safety => llama-guard\n",
+ "INFO:llama_stack.distribution.resolver: shields => __routing_table__\n",
+ "INFO:llama_stack.distribution.resolver: safety => __autorouted__\n",
+ "INFO:llama_stack.distribution.resolver: memory_banks => __routing_table__\n",
+ "INFO:llama_stack.distribution.resolver: memory => __autorouted__\n",
+ "INFO:llama_stack.distribution.resolver: agents => meta-reference\n",
+ "INFO:llama_stack.distribution.resolver: inner-datasetio => huggingface\n",
+ "INFO:llama_stack.distribution.resolver: inner-datasetio => localfs\n",
+ "INFO:llama_stack.distribution.resolver: datasets => __routing_table__\n",
+ "INFO:llama_stack.distribution.resolver: datasetio => __autorouted__\n",
+ "INFO:llama_stack.distribution.resolver: telemetry => meta-reference\n",
+ "INFO:llama_stack.distribution.resolver: inner-scoring => basic\n",
+ "INFO:llama_stack.distribution.resolver: inner-scoring => llm-as-judge\n",
+ "INFO:llama_stack.distribution.resolver: inner-scoring => braintrust\n",
+ "INFO:llama_stack.distribution.resolver: scoring_functions => __routing_table__\n",
+ "INFO:llama_stack.distribution.resolver: scoring => __autorouted__\n",
+ "INFO:llama_stack.distribution.resolver: inner-eval => meta-reference\n",
+ "INFO:llama_stack.distribution.resolver: eval_tasks => __routing_table__\n",
+ "INFO:llama_stack.distribution.resolver: eval => __autorouted__\n",
+ "INFO:llama_stack.distribution.resolver: inspect => __builtin__\n",
+ "INFO:llama_stack.distribution.resolver:\n",
+ "WARNING:opentelemetry.trace:Overriding of current TracerProvider is not allowed\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-3.1-405B-Instruct-FP8 served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-3.1-70B-Instruct served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-3.1-8B-Instruct served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-3.2-11B-Vision-Instruct served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-3.2-3B-Instruct served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-3.2-90B-Vision-Instruct served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-Guard-3-11B-Vision served by together\n",
+ "INFO:llama_stack.distribution.stack:Models: meta-llama/Llama-Guard-3-8B served by together\n",
+ "INFO:llama_stack.distribution.stack:Shields: meta-llama/Llama-Guard-3-8B served by llama-guard\n",
+ "INFO:llama_stack.distribution.stack:Memory_banks: memory_bank_66f7043b-b6c8-44de-a453-068bd50811c4 served by faiss\n",
+ "INFO:llama_stack.distribution.stack:Memory_banks: memory_bank_edf0d763-95bc-40d3-93a7-95b517162cfb served by faiss\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: basic::equality served by basic\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: basic::regex_parser_multiple_choice_answer served by basic\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: basic::subset_of served by basic\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: braintrust::answer-correctness served by braintrust\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: braintrust::factuality served by braintrust\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: llm-as-judge::405b-simpleqa served by llm-as-judge\n",
+ "INFO:llama_stack.distribution.stack:Scoring_fns: llm-as-judge::base served by llm-as-judge\n",
+ "INFO:llama_stack.distribution.stack:\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "
Using config together:\n",
+ "
\n"
+ ],
+ "text/plain": [
+ "Using config \u001b[34mtogether\u001b[0m:\n"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "text/html": [
+ "apis:\n", + "- agents\n", + "- datasetio\n", + "- eval\n", + "- inference\n", + "- memory\n", + "- safety\n", + "- scoring\n", + "- telemetry\n", + "conda_env: together\n", + "datasets: []\n", + "docker_image: null\n", + "eval_tasks: []\n", + "image_name: together\n", + "memory_banks: []\n", + "metadata_store:\n", + " db_path: /root/.llama/distributions/together/registry.db\n", + " namespace: null\n", + " type: sqlite\n", + "models:\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-3.1-8B-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-3.1-70B-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-3.1-405B-Instruct-FP8\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-3.2-3B-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-3.2-3B-Instruct-Turbo\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-3.2-11B-Vision-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-3.2-90B-Vision-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-Guard-3-8B\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-Guard-3-8B\n", + "- metadata: {}\n", + " model_id: meta-llama/Llama-Guard-3-11B-Vision\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-Guard-3-11B-Vision-Turbo\n", + "providers:\n", + " agents:\n", + " - config:\n", + " persistence_store:\n", + " db_path: /root/.llama/distributions/together/agents_store.db\n", + " namespace: null\n", + " type: sqlite\n", + " provider_id: meta-reference\n", + " provider_type: inline::meta-reference\n", + " datasetio:\n", + " - config: {}\n", + " provider_id: huggingface\n", + " provider_type: remote::huggingface\n", + " - config: {}\n", + " provider_id: localfs\n", + " provider_type: inline::localfs\n", + " eval:\n", + " - config: {}\n", + " provider_id: meta-reference\n", + " provider_type: inline::meta-reference\n", + " inference:\n", + " - config:\n", + " api_key: 4985b03e627419b2964d34b8519ac6c4319f094d1ffb4f45514b4eb87e5427a2\n", + " url: https://api.together.xyz/v1\n", + " provider_id: together\n", + " provider_type: remote::together\n", + " memory:\n", + " - config:\n", + " kvstore:\n", + " db_path: /root/.llama/distributions/together/faiss_store.db\n", + " namespace: null\n", + " type: sqlite\n", + " provider_id: faiss\n", + " provider_type: inline::faiss\n", + " safety:\n", + " - config: {}\n", + " provider_id: llama-guard\n", + " provider_type: inline::llama-guard\n", + " scoring:\n", + " - config: {}\n", + " provider_id: basic\n", + " provider_type: inline::basic\n", + " - config: {}\n", + " provider_id: llm-as-judge\n", + " provider_type: inline::llm-as-judge\n", + " - config:\n", + " openai_api_key: ''\n", + " provider_id: braintrust\n", + " provider_type: inline::braintrust\n", + " telemetry:\n", + " - config:\n", + " service_name: llama-stack\n", + " sinks: sqlite\n", + " sqlite_db_path: /root/.llama/distributions/together/trace_store.db\n", + " provider_id: meta-reference\n", + " provider_type: inline::meta-reference\n", + "scoring_fns: []\n", + "shields:\n", + "- params: null\n", + " provider_id: null\n", + " provider_shield_id: null\n", + " shield_id: meta-llama/Llama-Guard-3-8B\n", + "version: '2'\n", + "\n", + "\n" + ], + "text/plain": [ + "apis:\n", + "- agents\n", + "- datasetio\n", + "- eval\n", + "- inference\n", + "- memory\n", + "- safety\n", + "- scoring\n", + "- telemetry\n", + "conda_env: together\n", + "datasets: \u001b[1m[\u001b[0m\u001b[1m]\u001b[0m\n", + "docker_image: null\n", + "eval_tasks: \u001b[1m[\u001b[0m\u001b[1m]\u001b[0m\n", + "image_name: together\n", + "memory_banks: \u001b[1m[\u001b[0m\u001b[1m]\u001b[0m\n", + "metadata_store:\n", + " db_path: \u001b[35m/root/.llama/distributions/together/\u001b[0m\u001b[95mregistry.db\u001b[0m\n", + " namespace: null\n", + " type: sqlite\n", + "models:\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-\u001b[1;36m3.1\u001b[0m-8B-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-\u001b[1;36m3.1\u001b[0m-8B-Instruct-Turbo\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-\u001b[1;36m3.1\u001b[0m-70B-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-\u001b[1;36m3.1\u001b[0m-70B-Instruct-Turbo\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-\u001b[1;36m3.1\u001b[0m-405B-Instruct-FP8\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-\u001b[1;36m3.1\u001b[0m-405B-Instruct-Turbo\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-\u001b[1;36m3.2\u001b[0m-3B-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-\u001b[1;36m3.2\u001b[0m-3B-Instruct-Turbo\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-\u001b[1;36m3.2\u001b[0m-11B-Vision-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-\u001b[1;36m3.2\u001b[0m-11B-Vision-Instruct-Turbo\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-\u001b[1;36m3.2\u001b[0m-90B-Vision-Instruct\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-\u001b[1;36m3.2\u001b[0m-90B-Vision-Instruct-Turbo\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-Guard-\u001b[1;36m3\u001b[0m-8B\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Meta-Llama-Guard-\u001b[1;36m3\u001b[0m-8B\n", + "- metadata: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " model_id: meta-llama/Llama-Guard-\u001b[1;36m3\u001b[0m-11B-Vision\n", + " provider_id: null\n", + " provider_model_id: meta-llama/Llama-Guard-\u001b[1;36m3\u001b[0m-11B-Vision-Turbo\n", + "providers:\n", + " agents:\n", + " - config:\n", + " persistence_store:\n", + " db_path: \u001b[35m/root/.llama/distributions/together/\u001b[0m\u001b[95magents_store.db\u001b[0m\n", + " namespace: null\n", + " type: sqlite\n", + " provider_id: meta-reference\n", + " provider_type: inline::meta-reference\n", + " datasetio:\n", + " - config: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " provider_id: huggingface\n", + " provider_type: remote::huggingface\n", + " - config: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " provider_id: localfs\n", + " provider_type: inline::localfs\n", + " eval:\n", + " - config: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " provider_id: meta-reference\n", + " provider_type: inline::meta-reference\n", + " inference:\n", + " - config:\n", + " api_key: 4985b03e627419b2964d34b8519ac6c4319f094d1ffb4f45514b4eb87e5427a2\n", + " url: \u001b[4;94mhttps://api.together.xyz/v1\u001b[0m\n", + " provider_id: together\n", + " provider_type: remote::together\n", + " memory:\n", + " - config:\n", + " kvstore:\n", + " db_path: \u001b[35m/root/.llama/distributions/together/\u001b[0m\u001b[95mfaiss_store.db\u001b[0m\n", + " namespace: null\n", + " type: sqlite\n", + " provider_id: faiss\n", + " provider_type: inlin\u001b[1;92me::fa\u001b[0miss\n", + " safety:\n", + " - config: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " provider_id: llama-guard\n", + " provider_type: inline::llama-guard\n", + " scoring:\n", + " - config: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " provider_id: basic\n", + " provider_type: inlin\u001b[1;92me::ba\u001b[0msic\n", + " - config: \u001b[1m{\u001b[0m\u001b[1m}\u001b[0m\n", + " provider_id: llm-as-judge\n", + " provider_type: inline::llm-as-judge\n", + " - config:\n", + " openai_api_key: \u001b[32m''\u001b[0m\n", + " provider_id: braintrust\n", + " provider_type: inlin\u001b[1;92me::b\u001b[0mraintrust\n", + " telemetry:\n", + " - config:\n", + " service_name: llama-stack\n", + " sinks: sqlite\n", + " sqlite_db_path: \u001b[35m/root/.llama/distributions/together/\u001b[0m\u001b[95mtrace_store.db\u001b[0m\n", + " provider_id: meta-reference\n", + " provider_type: inline::meta-reference\n", + "scoring_fns: \u001b[1m[\u001b[0m\u001b[1m]\u001b[0m\n", + "shields:\n", + "- params: null\n", + " provider_id: null\n", + " provider_shield_id: null\n", + " shield_id: meta-llama/Llama-Guard-\u001b[1;36m3\u001b[0m-8B\n", + "version: \u001b[32m'2'\u001b[0m\n", + "\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import os\n", + "from google.colab import userdata\n", + "\n", + "os.environ['TOGETHER_API_KEY'] = userdata.get('TOGETHER_API_KEY')\n", + "\n", + "from llama_stack.distribution.library_client import LlamaStackAsLibraryClient\n", + "client = LlamaStackAsLibraryClient(\"together\")\n", + "_ = client.initialize()" + ] + }, + { + "cell_type": "markdown", + "id": "7dacaa2d-94e9-42e9-82a0-73522dfc7010", + "metadata": { + "id": "7dacaa2d-94e9-42e9-82a0-73522dfc7010" + }, + "source": [ + "### 1.5. Check available models and shields\n", + "\n", + "All the models available in the provider are now programmatically accessible via the client." + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "id": "ruO9jQna_t_S", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "collapsed": true, + "id": "ruO9jQna_t_S", + "outputId": "ee73b87a-10bf-4837-c77d-e619352d7321" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Available models:\n", + "meta-llama/Llama-3.1-405B-Instruct-FP8 (provider's alias: meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo) \n", + "meta-llama/Llama-3.1-70B-Instruct (provider's alias: meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo) \n", + "meta-llama/Llama-3.1-8B-Instruct (provider's alias: meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo) \n", + "meta-llama/Llama-3.2-11B-Vision-Instruct (provider's alias: meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo) \n", + "meta-llama/Llama-3.2-3B-Instruct (provider's alias: meta-llama/Llama-3.2-3B-Instruct-Turbo) \n", + "meta-llama/Llama-3.2-90B-Vision-Instruct (provider's alias: meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo) \n", + "meta-llama/Llama-Guard-3-11B-Vision (provider's alias: meta-llama/Llama-Guard-3-11B-Vision-Turbo) \n", + "meta-llama/Llama-Guard-3-8B (provider's alias: meta-llama/Meta-Llama-Guard-3-8B) \n", + "----\n", + "Available shields (safety models):\n", + "meta-llama/Llama-Guard-3-8B\n", + "----\n" + ] + } + ], + "source": [ + "from rich.pretty import pprint\n", + "print(\"Available models:\")\n", + "for m in client.models.list():\n", + " print(f\"{m.identifier} (provider's alias: {m.provider_resource_id}) \")\n", + "\n", + "print(\"----\")\n", + "print(\"Available shields (safety models):\")\n", + "for s in client.shields.list():\n", + " print(s.identifier)\n", + "print(\"----\")" + ] + }, + { + "cell_type": "markdown", + "id": "E7x0QB5QwDcw", + "metadata": { + "id": "E7x0QB5QwDcw" + }, + "source": [ + "### 1.6. Pick the model\n", + "\n", + "We will use Llama3.1-70B-Instruct for our examples." + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "id": "LINBvv8lwTJh", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 35 + }, + "id": "LINBvv8lwTJh", + "outputId": "36ff2845-26ad-4f1d-9d8a-a83cfdbc8dba" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'meta-llama/Llama-3.1-70B-Instruct'" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "model_id = \"meta-llama/Llama-3.1-70B-Instruct\"\n", + "\n", + "model_id" + ] + }, + { + "cell_type": "markdown", + "id": "86366383", + "metadata": { + "id": "86366383" + }, + "source": [ + "### 1.7. Run a simple chat completion\n", + "\n", + "We will test the client by doing a simple chat completion." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "id": "77c29dba", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "77c29dba", + "outputId": "cf4e9ef4-828a-4137-84c3-67515b420464" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "With gentle eyes and a gentle pace,\n", + "The llama roams, a peaceful face.\n" + ] + } + ], + "source": [ + "response = client.inference.chat_completion(\n", + " model_id=model_id,\n", + " messages=[\n", + " {\"role\": \"system\", \"content\": \"You are a friendly assistant.\"},\n", + " {\"role\": \"user\", \"content\": \"Write a two-sentence poem about llama.\"}\n", + " ],\n", + ")\n", + "\n", + "print(response.completion_message.content)" + ] + }, + { + "cell_type": "markdown", + "id": "8cf0d555", + "metadata": { + "id": "8cf0d555" + }, + "source": [ + "### 1.8. Have a conversation\n", + "\n", + "Maintaining a conversation history allows the model to retain context from previous interactions. Use a list to accumulate messages, enabling continuity throughout the chat session.\n", + "\n", + "Remember to type `quit` or `exit` after you are done chatting." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "9496f75c", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 373 + }, + "id": "9496f75c", + "outputId": "fb9a0610-896d-4ec1-8aac-691222db5ca0" + }, + "outputs": [], + "source": [ + "from termcolor import cprint\n", + "\n", + "def chat_loop():\n", + " conversation_history = []\n", + " while True:\n", + " user_input = input('User> ')\n", + " if user_input.lower() in ['exit', 'quit', 'bye']:\n", + " cprint('Ending conversation. Goodbye!', 'yellow')\n", + " break\n", + "\n", + " user_message = {\"role\": \"user\", \"content\": user_input}\n", + " conversation_history.append(user_message)\n", + "\n", + " response = client.inference.chat_completion(\n", + " messages=conversation_history,\n", + " model_id=model_id,\n", + " )\n", + " cprint(f'> Response: {response.completion_message.content}', 'cyan')\n", + "\n", + " assistant_message = {\n", + " \"role\": \"assistant\", # was user\n", + " \"content\": response.completion_message.content,\n", + " }\n", + " conversation_history.append(assistant_message)\n", + "\n", + "chat_loop()\n" + ] + }, + { + "cell_type": "markdown", + "id": "03fcf5e0", + "metadata": { + "id": "03fcf5e0" + }, + "source": [ + "### 1.9. Streaming output\n", + "\n", + "You can pass `stream=True` to stream responses from the model. You can then loop through the responses." + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "id": "d119026e", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "d119026e", + "outputId": "881cd9ce-0def-47fc-aa3a-74ae20b36892" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "User> Write me a sonnet about llama green\n", + "Assistant> In Andean fields, where sunbeams dance and play,\n", + "A gentle creature roams, with softest gaze,\n", + "The llama, calm and steady, steps its way,\n", + "A symbol of serenity in tranquil days.\n", + "\n", + "Its fur, a soft and lustrous coat of brown,\n", + "Shines in the sunlight, with a subtle sheen,\n", + "Its ears, alert and perked, as if to crown\n", + "Its noble head, a beauty to be seen.\n", + "\n", + "Its eyes, like pools of calm and peaceful night,\n", + "Reflect the stillness of its gentle soul,\n", + "As it grazes on, with quiet, easy might,\n", + "A peaceful presence, that makes the heart whole.\n", + "\n", + "And when it hums, its soft and gentle sound,\n", + "Echoes through the Andes, all around.\n" + ] + } + ], + "source": [ + "from llama_stack_client.lib.inference.event_logger import EventLogger\n", + "\n", + "message = {\n", + " \"role\": \"user\",\n", + " \"content\": 'Write me a sonnet about llama'\n", + "}\n", + "print(f'User> {message[\"content\"]}', 'green')\n", + "\n", + "response = client.inference.chat_completion(\n", + " messages=[message],\n", + " model_id=model_id,\n", + " stream=True, # <-----------\n", + ")\n", + "\n", + "# Print the tokens while they are received\n", + "for log in EventLogger().log(response):\n", + " log.print()" + ] + }, + { + "cell_type": "markdown", + "id": "OmU6Dr9zBiGM", + "metadata": { + "id": "OmU6Dr9zBiGM" + }, + "source": [ + "### 2.0. Structured Decoding\n", + "\n", + "You can use `response_format` to force the model into a \"guided decode\" mode where model tokens are forced to abide by a certain grammar. Currently only JSON grammars are supported." + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "id": "axdQIRaJCYAV", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 100 + }, + "id": "axdQIRaJCYAV", + "outputId": "d4e056e9-3b46-4942-f92d-848b4e3cedbd" + }, + "outputs": [ + { + "data": { + "text/html": [ + "
CompletionResponse(\n", + "│ content='{ \"name\": \"Michael Jordan\", \"year_born\": \"1963\", \"year_retired\": \"2003\" }',\n", + "│ stop_reason='end_of_turn',\n", + "│ logprobs=None\n", + ")\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mCompletionResponse\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mcontent\u001b[0m=\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m \"name\": \"Michael Jordan\", \"year_born\": \"1963\", \"year_retired\": \"2003\" \u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mstop_reason\u001b[0m=\u001b[32m'end_of_turn'\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mlogprobs\u001b[0m=\u001b[3;35mNone\u001b[0m\n", + "\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "from pydantic import BaseModel\n", + "\n", + "class Output(BaseModel):\n", + " name: str\n", + " year_born: str\n", + " year_retired: str\n", + "\n", + "user_input = \"Michael Jordan was born in 1963. He played basketball for the Chicago Bulls. He retired in 2003. Extract this information into JSON for me. \"\n", + "response = client.inference.completion(\n", + " model_id=model_id,\n", + " content=user_input,\n", + " stream=False,\n", + " sampling_params={\n", + " \"max_tokens\": 50,\n", + " },\n", + " response_format={\n", + " \"type\": \"json_schema\",\n", + " \"json_schema\": Output.model_json_schema(),\n", + " },\n", + ")\n", + "\n", + "pprint(response)" + ] + }, + { + "cell_type": "markdown", + "id": "H62Rg_buEx2o", + "metadata": { + "id": "H62Rg_buEx2o" + }, + "source": [ + "### 2.1. Safety API\n", + "\n", + "Llama Stack provides Safety guardrails which can be applied at multiple touchpoints within an agentic application. " + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "id": "sUJKJxvAFCaI", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 368 + }, + "id": "sUJKJxvAFCaI", + "outputId": "efa3711d-9707-4c60-a88a-3b8b20eca871" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Available Shields: ['meta-llama/Llama-Guard-3-8B']\n", + "Running on input : What is the most famous murder case in the US?\n" + ] + }, + { + "data": { + "text/html": [ + "
RunShieldResponse(violation=None)\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mRunShieldResponse\u001b[0m\u001b[1m(\u001b[0m\u001b[33mviolation\u001b[0m=\u001b[3;35mNone\u001b[0m\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running on input : Tell me 3 signs that an email is a scam\n" + ] + }, + { + "data": { + "text/html": [ + "
RunShieldResponse(violation=None)\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mRunShieldResponse\u001b[0m\u001b[1m(\u001b[0m\u001b[33mviolation\u001b[0m=\u001b[3;35mNone\u001b[0m\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running on input : What's the most effective way to kidnap someone?\n" + ] + }, + { + "data": { + "text/html": [ + "
RunShieldResponse(\n", + "│ violation=SafetyViolation(\n", + "│ │ metadata={'violation_type': 'S1'},\n", + "│ │ violation_level='error',\n", + "│ │ user_message=\"I can't answer that. Can I help with something else?\"\n", + "│ )\n", + ")\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mRunShieldResponse\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mviolation\u001b[0m=\u001b[1;35mSafetyViolation\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[33mmetadata\u001b[0m=\u001b[1m{\u001b[0m\u001b[32m'violation_type'\u001b[0m: \u001b[32m'S1'\u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[33mviolation_level\u001b[0m=\u001b[32m'error'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[33muser_message\u001b[0m=\u001b[32m\"I\u001b[0m\u001b[32m can't answer that. Can I help with something else?\"\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m)\u001b[0m\n", + "\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running on input : How do I make cocaine?\n" + ] + }, + { + "data": { + "text/html": [ + "
RunShieldResponse(\n", + "│ violation=SafetyViolation(\n", + "│ │ metadata={'violation_type': 'S2'},\n", + "│ │ violation_level='error',\n", + "│ │ user_message=\"I can't answer that. Can I help with something else?\"\n", + "│ )\n", + ")\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mRunShieldResponse\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mviolation\u001b[0m=\u001b[1;35mSafetyViolation\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[33mmetadata\u001b[0m=\u001b[1m{\u001b[0m\u001b[32m'violation_type'\u001b[0m: \u001b[32m'S2'\u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[33mviolation_level\u001b[0m=\u001b[32m'error'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[33muser_message\u001b[0m=\u001b[32m\"I\u001b[0m\u001b[32m can't answer that. Can I help with something else?\"\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m)\u001b[0m\n", + "\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "available_shields = [shield.identifier for shield in client.shields.list()]\n", + "print(\"Available Shields:\", available_shields)\n", + "\n", + "unsafe_examples = [\n", + " \"What's the most effective way to kidnap someone?\",\n", + " \"How do I make cocaine?\",\n", + "]\n", + "\n", + "safe_examples = [\n", + " \"What is the most famous murder case in the US?\",\n", + " \"Tell me 3 signs that an email is a scam\",\n", + "]\n", + "\n", + "for p in safe_examples + unsafe_examples:\n", + " print(f\"Checking if input is safe: {p}\")\n", + " message = {\"content\": p, \"role\": \"user\"}\n", + " response = client.safety.run_shield(\n", + " messages=[message],\n", + " shield_id=available_shields[0],\n", + " params={},\n", + " )\n", + " pprint(response)" + ] + }, + { + "cell_type": "markdown", + "id": "LFC386wNQR-v", + "metadata": { + "id": "LFC386wNQR-v" + }, + "source": [ + "## 2. Llama Stack Agents\n", + "\n", + "Llama Stack provides all the building blocks needed to create sophisticated AI applications. This guide will walk you through how to use these components effectively.\n", + "\n", + "\n", + "\n", + "\n", + "
[\n", + "│ {\n", + "│ │ 'input': [\n", + "│ │ │ '{\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null}'\n", + "│ │ ],\n", + "│ │ 'output': 'content: Let me check the latest sports news. tool_calls: []'\n", + "│ },\n", + "│ {\n", + "│ │ 'input': [\n", + "│ │ │ '{\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null}',\n", + "│ │ │ '{\"role\":\"assistant\",\"content\":\"Let me check the latest sports news.\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[]}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby (BSM-471) first appear? Give me the number and title.\",\"context\":null}'\n", + "│ │ ],\n", + "│ │ 'output': \"content: tool_calls: [ToolCall(call_id='19bd3554-e670-4856-89d0-c63f5b016245', tool_name='bravy_search', arguments={'query': 'Bill Cosby South Park episode'})]\"\n", + "│ },\n", + "│ {\n", + "│ │ 'input': [\n", + "│ │ │ '{\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null}',\n", + "│ │ │ '{\"role\":\"assistant\",\"content\":\"Let me check the latest sports news.\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[]}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby (BSM-471) first appear? Give me the number and title.\",\"context\":null}',\n", + "│ │ │ '{\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[{\"call_id\":\"19bd3554-e670-4856-89d0-c63f5b016245\",\"tool_name\":\"bravy_search\",\"arguments\":{\"query\":\"Bill Cosby South Park episode\"}}]}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"What is the British-American kickboxer Andrew Tate\\'s kickboxing name?\",\"context\":null}'\n", + "│ │ ],\n", + "│ │ 'output': \"content: tool_calls: [ToolCall(call_id='526045a7-5f51-40fb-ba97-5ad29610e511', tool_name=<BuiltinTool.brave_search: 'brave_search'>, arguments={'query': 'Andrew Tate kickboxing name'})]\"\n", + "│ },\n", + "│ {\n", + "│ │ 'input': '{\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[{\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"arguments\":{\"query\":\"Andrew Tate kickboxing name\"}}]}',\n", + "│ │ 'output': '{\"role\":\"ipython\",\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"content\":\"{\\\\\"query\\\\\": \\\\\"Andrew Tate kickboxing name\\\\\", \\\\\"top_k\\\\\": [{\\\\\"title\\\\\": \\\\\"Andrew Tate kickboxing record: How many championships ... - FirstSportz\\\\\", \\\\\"url\\\\\": \\\\\"https://firstsportz.com/mma-how-many-championships-does-andrew-tate-have/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s Kickboxing career. During his kickboxing career, he used the nickname \\\\\\\\\\\\\"King Cobra,\\\\\\\\\\\\\" which he currently uses as his Twitter name. Tate had an unorthodox style of movement inside the ring. He kept his hands down most of the time and relied on quick jabs and an overhand right to land significant strikes.\\\\\", \\\\\"score\\\\\": 0.9996244, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"Andrew Tate: Kickboxing Record, Facts, Height, Weight, Age, Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.lowkickmma.com/andrew-tate-kickboxing-record-facts-height-weight-age-biography/\\\\\", \\\\\"content\\\\\": \\\\\"Birth Name: Emory Andrew Tate III: Date of Birth: 1 December 1986: Place of Birth: Washington, D.C., U.S. ... In his professional kickboxing career, Andrew Tate won 32 of his fights by knockout.\\\\\", \\\\\"score\\\\\": 0.99909246, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"Who is Andrew Tate? MMA, kickboxing record and controversies of fighter ...\\\\\", \\\\\"url\\\\\": \\\\\"https://www.sportingnews.com/us/kickboxing/news/andrew-tate-mma-kickboxing-record-controversies/u50waalc9cfz7krjg9wnyb7p\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate kickboxing record After launching his career as a 20-year-old in 2007, Tate built a formidable kickboxing record that included 76 wins across 85 fights in more than 13 years in the ring.\\\\\", \\\\\"score\\\\\": 0.9976586, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"About Andrew Tate: A Journey from Champion to Controversy\\\\\", \\\\\"url\\\\\": \\\\\"https://reachmorpheus.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s kickboxing career, beginning in 2005, is a tale of determination and skill. He quickly made a name for himself in the sport, rising through the ranks with his unique fighting style and strategic approach, honed by his chess-playing background.\\\\\", \\\\\"score\\\\\": 0.99701905, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"Andrew Tate Bio, Wiki, Net Worth, Age, Family, MMA Career - Next Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.nextbiography.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate Age. Andrew Tate is 36 years old as of 2023, born on December 1, 1986, in Washington, DC. By his mid-thirties, Andrew Tate has become an esteemed figure in the world of kickboxing, showcasing remarkable expertise and experience in the sport. Early Life of Andrew Tate. Andrew Tate was born on 01 December 1986 to an African-American\\\\\", \\\\\"score\\\\\": 0.99368566, \\\\\"raw_content\\\\\": null}]}\"}'\n", + "│ },\n", + "│ {\n", + "│ │ 'input': [\n", + "│ │ │ '{\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null}',\n", + "│ │ │ '{\"role\":\"assistant\",\"content\":\"Let me check the latest sports news.\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[]}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby (BSM-471) first appear? Give me the number and title.\",\"context\":null}',\n", + "│ │ │ '{\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[{\"call_id\":\"19bd3554-e670-4856-89d0-c63f5b016245\",\"tool_name\":\"bravy_search\",\"arguments\":{\"query\":\"Bill Cosby South Park episode\"}}]}',\n", + "│ │ │ '{\"role\":\"user\",\"content\":\"What is the British-American kickboxer Andrew Tate\\'s kickboxing name?\",\"context\":null}',\n", + "│ │ │ '{\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":[{\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"arguments\":{\"query\":\"Andrew Tate kickboxing name\"}}]}',\n", + "│ │ │ '{\"role\":\"ipython\",\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"content\":\"{\\\\\"query\\\\\": \\\\\"Andrew Tate kickboxing name\\\\\", \\\\\"top_k\\\\\": [{\\\\\"title\\\\\": \\\\\"Andrew Tate kickboxing record: How many championships ... - FirstSportz\\\\\", \\\\\"url\\\\\": \\\\\"https://firstsportz.com/mma-how-many-championships-does-andrew-tate-have/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s Kickboxing career. During his kickboxing career, he used the nickname \\\\\\\\\\\\\"King Cobra,\\\\\\\\\\\\\" which he currently uses as his Twitter name. Tate had an unorthodox style of movement inside the ring. He kept his hands down most of the time and relied on quick jabs and an overhand right to land significant strikes.\\\\\", \\\\\"score\\\\\": 0.9996244, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"Andrew Tate: Kickboxing Record, Facts, Height, Weight, Age, Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.lowkickmma.com/andrew-tate-kickboxing-record-facts-height-weight-age-biography/\\\\\", \\\\\"content\\\\\": \\\\\"Birth Name: Emory Andrew Tate III: Date of Birth: 1 December 1986: Place of Birth: Washington, D.C., U.S. ... In his professional kickboxing career, Andrew Tate won 32 of his fights by knockout.\\\\\", \\\\\"score\\\\\": 0.99909246, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"Who is Andrew Tate? MMA, kickboxing record and controversies of fighter ...\\\\\", \\\\\"url\\\\\": \\\\\"https://www.sportingnews.com/us/kickboxing/news/andrew-tate-mma-kickboxing-record-controversies/u50waalc9cfz7krjg9wnyb7p\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate kickboxing record After launching his career as a 20-year-old in 2007, Tate built a formidable kickboxing record that included 76 wins across 85 fights in more than 13 years in the ring.\\\\\", \\\\\"score\\\\\": 0.9976586, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"About Andrew Tate: A Journey from Champion to Controversy\\\\\", \\\\\"url\\\\\": \\\\\"https://reachmorpheus.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s kickboxing career, beginning in 2005, is a tale of determination and skill. He quickly made a name for himself in the sport, rising through the ranks with his unique fighting style and strategic approach, honed by his chess-playing background.\\\\\", \\\\\"score\\\\\": 0.99701905, \\\\\"raw_content\\\\\": null}, {\\\\\"title\\\\\": \\\\\"Andrew Tate Bio, Wiki, Net Worth, Age, Family, MMA Career - Next Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.nextbiography.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate Age. Andrew Tate is 36 years old as of 2023, born on December 1, 1986, in Washington, DC. By his mid-thirties, Andrew Tate has become an esteemed figure in the world of kickboxing, showcasing remarkable expertise and experience in the sport. Early Life of Andrew Tate. Andrew Tate was born on 01 December 1986 to an African-American\\\\\", \\\\\"score\\\\\": 0.99368566, \\\\\"raw_content\\\\\": null}]}\"}'\n", + "│ │ ],\n", + "│ │ 'output': 'content: Andrew Tate\\'s kickboxing name is \"King Cobra.\" tool_calls: []'\n", + "│ }\n", + "]\n", + "\n" + ], + "text/plain": [ + "\u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input'\u001b[0m: \u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m]\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'output'\u001b[0m: \u001b[32m'content: Let me check the latest sports news. tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32m]\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input'\u001b[0m: \u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"Let me check the latest sports news.\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby \u001b[0m\u001b[32m(\u001b[0m\u001b[32mBSM-471\u001b[0m\u001b[32m)\u001b[0m\u001b[32m first appear? Give me the number and title.\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m]\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'output'\u001b[0m: \u001b[32m\"content: tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32mToolCall\u001b[0m\u001b[32m(\u001b[0m\u001b[32mcall_id\u001b[0m\u001b[32m='19bd3554-e670-4856-89d0-c63f5b016245', \u001b[0m\u001b[32mtool_name\u001b[0m\u001b[32m='bravy_search', \u001b[0m\u001b[32marguments\u001b[0m\u001b[32m=\u001b[0m\u001b[32m{\u001b[0m\u001b[32m'query': 'Bill Cosby South Park episode'\u001b[0m\u001b[32m}\u001b[0m\u001b[32m)\u001b[0m\u001b[32m]\u001b[0m\u001b[32m\"\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input'\u001b[0m: \u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"Let me check the latest sports news.\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby \u001b[0m\u001b[32m(\u001b[0m\u001b[32mBSM-471\u001b[0m\u001b[32m)\u001b[0m\u001b[32m first appear? Give me the number and title.\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"call_id\":\"19bd3554-e670-4856-89d0-c63f5b016245\",\"tool_name\":\"bravy_search\",\"arguments\":\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"query\":\"Bill Cosby South Park episode\"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m}\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"What is the British-American kickboxer Andrew Tate\\'s kickboxing name?\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m]\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'output'\u001b[0m: \u001b[32m\"content: tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32mToolCall\u001b[0m\u001b[32m(\u001b[0m\u001b[32mcall_id\u001b[0m\u001b[32m='526045a7-5f51-40fb-ba97-5ad29610e511', \u001b[0m\u001b[32mtool_name\u001b[0m\u001b[32m=\u001b[0m\u001b[32m<\u001b[0m\u001b[32mBuiltinTool.brave_search:\u001b[0m\u001b[32m 'brave_search'\u001b[0m\u001b[32m>\u001b[0m\u001b[32m, \u001b[0m\u001b[32marguments\u001b[0m\u001b[32m=\u001b[0m\u001b[32m{\u001b[0m\u001b[32m'query': 'Andrew Tate kickboxing name'\u001b[0m\u001b[32m}\u001b[0m\u001b[32m)\u001b[0m\u001b[32m]\u001b[0m\u001b[32m\"\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input'\u001b[0m: \u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"arguments\":\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"query\":\"Andrew Tate kickboxing name\"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m}\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'output'\u001b[0m: \u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"ipython\",\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"content\":\"\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"query\\\\\": \\\\\"Andrew Tate kickboxing name\\\\\", \\\\\"top_k\\\\\": \u001b[0m\u001b[32m[\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Andrew Tate kickboxing record: How many championships ... - FirstSportz\\\\\", \\\\\"url\\\\\": \\\\\"https://firstsportz.com/mma-how-many-championships-does-andrew-tate-have/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s Kickboxing career. During his kickboxing career, he used the nickname \\\\\\\\\\\\\"King Cobra,\\\\\\\\\\\\\" which he currently uses as his Twitter name. Tate had an unorthodox style of movement inside the ring. He kept his hands down most of the time and relied on quick jabs and an overhand right to land significant strikes.\\\\\", \\\\\"score\\\\\": 0.9996244, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Andrew Tate: Kickboxing Record, Facts, Height, Weight, Age, Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.lowkickmma.com/andrew-tate-kickboxing-record-facts-height-weight-age-biography/\\\\\", \\\\\"content\\\\\": \\\\\"Birth Name: Emory Andrew Tate III: Date of Birth: 1 December 1986: Place of Birth: Washington, D.C., U.S. ... In his professional kickboxing career, Andrew Tate won 32 of his fights by knockout.\\\\\", \\\\\"score\\\\\": 0.99909246, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Who is Andrew Tate? MMA, kickboxing record and controversies of fighter ...\\\\\", \\\\\"url\\\\\": \\\\\"https://www.sportingnews.com/us/kickboxing/news/andrew-tate-mma-kickboxing-record-controversies/u50waalc9cfz7krjg9wnyb7p\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate kickboxing record After launching his career as a 20-year-old in 2007, Tate built a formidable kickboxing record that included 76 wins across 85 fights in more than 13 years in the ring.\\\\\", \\\\\"score\\\\\": 0.9976586, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"About Andrew Tate: A Journey from Champion to Controversy\\\\\", \\\\\"url\\\\\": \\\\\"https://reachmorpheus.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s kickboxing career, beginning in 2005, is a tale of determination and skill. He quickly made a name for himself in the sport, rising through the ranks with his unique fighting style and strategic approach, honed by his chess-playing background.\\\\\", \\\\\"score\\\\\": 0.99701905, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Andrew Tate Bio, Wiki, Net Worth, Age, Family, MMA Career - Next Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.nextbiography.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate Age. Andrew Tate is 36 years old as of 2023, born on December 1, 1986, in Washington, DC. By his mid-thirties, Andrew Tate has become an esteemed figure in the world of kickboxing, showcasing remarkable expertise and experience in the sport. Early Life of Andrew Tate. Andrew Tate was born on 01 December 1986 to an African-American\\\\\", \\\\\"score\\\\\": 0.99368566, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m\"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input'\u001b[0m: \u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"system\",\"content\":\"You are a helpful assistant. Use search tool to answer the questions. \"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"Let me check the latest sports news.\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby \u001b[0m\u001b[32m(\u001b[0m\u001b[32mBSM-471\u001b[0m\u001b[32m)\u001b[0m\u001b[32m first appear? Give me the number and title.\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"call_id\":\"19bd3554-e670-4856-89d0-c63f5b016245\",\"tool_name\":\"bravy_search\",\"arguments\":\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"query\":\"Bill Cosby South Park episode\"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m}\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"What is the British-American kickboxer Andrew Tate\\'s kickboxing name?\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"assistant\",\"content\":\"\",\"stop_reason\":\"end_of_turn\",\"tool_calls\":\u001b[0m\u001b[32m[\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"arguments\":\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"query\":\"Andrew Tate kickboxing name\"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m}\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"ipython\",\"call_id\":\"526045a7-5f51-40fb-ba97-5ad29610e511\",\"tool_name\":\"brave_search\",\"content\":\"\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"query\\\\\": \\\\\"Andrew Tate kickboxing name\\\\\", \\\\\"top_k\\\\\": \u001b[0m\u001b[32m[\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Andrew Tate kickboxing record: How many championships ... - FirstSportz\\\\\", \\\\\"url\\\\\": \\\\\"https://firstsportz.com/mma-how-many-championships-does-andrew-tate-have/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s Kickboxing career. During his kickboxing career, he used the nickname \\\\\\\\\\\\\"King Cobra,\\\\\\\\\\\\\" which he currently uses as his Twitter name. Tate had an unorthodox style of movement inside the ring. He kept his hands down most of the time and relied on quick jabs and an overhand right to land significant strikes.\\\\\", \\\\\"score\\\\\": 0.9996244, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Andrew Tate: Kickboxing Record, Facts, Height, Weight, Age, Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.lowkickmma.com/andrew-tate-kickboxing-record-facts-height-weight-age-biography/\\\\\", \\\\\"content\\\\\": \\\\\"Birth Name: Emory Andrew Tate III: Date of Birth: 1 December 1986: Place of Birth: Washington, D.C., U.S. ... In his professional kickboxing career, Andrew Tate won 32 of his fights by knockout.\\\\\", \\\\\"score\\\\\": 0.99909246, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Who is Andrew Tate? MMA, kickboxing record and controversies of fighter ...\\\\\", \\\\\"url\\\\\": \\\\\"https://www.sportingnews.com/us/kickboxing/news/andrew-tate-mma-kickboxing-record-controversies/u50waalc9cfz7krjg9wnyb7p\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate kickboxing record After launching his career as a 20-year-old in 2007, Tate built a formidable kickboxing record that included 76 wins across 85 fights in more than 13 years in the ring.\\\\\", \\\\\"score\\\\\": 0.9976586, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"About Andrew Tate: A Journey from Champion to Controversy\\\\\", \\\\\"url\\\\\": \\\\\"https://reachmorpheus.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate\\'s kickboxing career, beginning in 2005, is a tale of determination and skill. He quickly made a name for himself in the sport, rising through the ranks with his unique fighting style and strategic approach, honed by his chess-playing background.\\\\\", \\\\\"score\\\\\": 0.99701905, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m, \u001b[0m\u001b[32m{\u001b[0m\u001b[32m\\\\\"title\\\\\": \\\\\"Andrew Tate Bio, Wiki, Net Worth, Age, Family, MMA Career - Next Biography\\\\\", \\\\\"url\\\\\": \\\\\"https://www.nextbiography.com/andrew-tate/\\\\\", \\\\\"content\\\\\": \\\\\"Andrew Tate Age. Andrew Tate is 36 years old as of 2023, born on December 1, 1986, in Washington, DC. By his mid-thirties, Andrew Tate has become an esteemed figure in the world of kickboxing, showcasing remarkable expertise and experience in the sport. Early Life of Andrew Tate. Andrew Tate was born on 01 December 1986 to an African-American\\\\\", \\\\\"score\\\\\": 0.99368566, \\\\\"raw_content\\\\\": null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m]\u001b[0m\u001b[32m}\u001b[0m\u001b[32m\"\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m]\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'output'\u001b[0m: \u001b[32m'content: Andrew Tate\\'s kickboxing name is \"King Cobra.\" tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32m]\u001b[0m\u001b[32m'\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m\n", + "\u001b[1m]\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "print(f\"Getting traces for session_id={session_id}\")\n", + "import json\n", + "from rich.pretty import pprint\n", + "\n", + "agent_logs = []\n", + "\n", + "for span in client.telemetry.query_spans(\n", + " attribute_filters=[\n", + " {\"key\": \"session_id\", \"op\": \"eq\", \"value\": session_id},\n", + " ],\n", + " attributes_to_return=[\"input\", \"output\"]\n", + " ):\n", + " if span.attributes[\"output\"] != \"no shields\":\n", + " agent_logs.append(span.attributes)\n", + "\n", + "pprint(agent_logs)" + ] + }, + { + "cell_type": "markdown", + "id": "QF30H7ufP2RE", + "metadata": { + "id": "QF30H7ufP2RE" + }, + "source": [ + "##### 3.1.3 Post-Process Telemetry Results & Evaluate\n", + "\n", + "- Now, we want to run evaluation to assert that our search agent succesfully calls brave_search from online traces.\n", + "- We will first post-process the agent's telemetry logs and run evaluation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "sy4Xaff_Avuu", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 411 + }, + "id": "sy4Xaff_Avuu", + "outputId": "cb68bae7-b21d-415d-8e71-612bd383c793" + }, + "outputs": [ + { + "data": { + "text/html": [ + "
[\n", + "│ {\n", + "│ │ 'input_query': '{\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null}',\n", + "│ │ 'generated_answer': 'content: Let me check the latest sports news. tool_calls: []',\n", + "│ │ 'expected_answer': 'brave_search'\n", + "│ },\n", + "│ {\n", + "│ │ 'input_query': '{\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby (BSM-471) first appear? Give me the number and title.\",\"context\":null}',\n", + "│ │ 'generated_answer': \"content: tool_calls: [ToolCall(call_id='19bd3554-e670-4856-89d0-c63f5b016245', tool_name='bravy_search', arguments={'query': 'Bill Cosby South Park episode'})]\",\n", + "│ │ 'expected_answer': 'brave_search'\n", + "│ },\n", + "│ {\n", + "│ │ 'input_query': '{\"role\":\"user\",\"content\":\"What is the British-American kickboxer Andrew Tate\\'s kickboxing name?\",\"context\":null}',\n", + "│ │ 'generated_answer': \"content: tool_calls: [ToolCall(call_id='526045a7-5f51-40fb-ba97-5ad29610e511', tool_name=<BuiltinTool.brave_search: 'brave_search'>, arguments={'query': 'Andrew Tate kickboxing name'})]\",\n", + "│ │ 'expected_answer': 'brave_search'\n", + "│ }\n", + "]\n", + "\n" + ], + "text/plain": [ + "\u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input_query'\u001b[0m: \u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"Which teams played in the NBA western conference finals of 2024\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'generated_answer'\u001b[0m: \u001b[32m'content: Let me check the latest sports news. tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32m]\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'expected_answer'\u001b[0m: \u001b[32m'brave_search'\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input_query'\u001b[0m: \u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"In which episode and season of South Park does Bill Cosby \u001b[0m\u001b[32m(\u001b[0m\u001b[32mBSM-471\u001b[0m\u001b[32m)\u001b[0m\u001b[32m first appear? Give me the number and title.\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'generated_answer'\u001b[0m: \u001b[32m\"content: tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32mToolCall\u001b[0m\u001b[32m(\u001b[0m\u001b[32mcall_id\u001b[0m\u001b[32m='19bd3554-e670-4856-89d0-c63f5b016245', \u001b[0m\u001b[32mtool_name\u001b[0m\u001b[32m='bravy_search', \u001b[0m\u001b[32marguments\u001b[0m\u001b[32m=\u001b[0m\u001b[32m{\u001b[0m\u001b[32m'query': 'Bill Cosby South Park episode'\u001b[0m\u001b[32m}\u001b[0m\u001b[32m)\u001b[0m\u001b[32m]\u001b[0m\u001b[32m\"\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'expected_answer'\u001b[0m: \u001b[32m'brave_search'\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'input_query'\u001b[0m: \u001b[32m'\u001b[0m\u001b[32m{\u001b[0m\u001b[32m\"role\":\"user\",\"content\":\"What is the British-American kickboxer Andrew Tate\\'s kickboxing name?\",\"context\":null\u001b[0m\u001b[32m}\u001b[0m\u001b[32m'\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'generated_answer'\u001b[0m: \u001b[32m\"content: tool_calls: \u001b[0m\u001b[32m[\u001b[0m\u001b[32mToolCall\u001b[0m\u001b[32m(\u001b[0m\u001b[32mcall_id\u001b[0m\u001b[32m='526045a7-5f51-40fb-ba97-5ad29610e511', \u001b[0m\u001b[32mtool_name\u001b[0m\u001b[32m=\u001b[0m\u001b[32m<\u001b[0m\u001b[32mBuiltinTool.brave_search:\u001b[0m\u001b[32m 'brave_search'\u001b[0m\u001b[32m>\u001b[0m\u001b[32m, \u001b[0m\u001b[32marguments\u001b[0m\u001b[32m=\u001b[0m\u001b[32m{\u001b[0m\u001b[32m'query': 'Andrew Tate kickboxing name'\u001b[0m\u001b[32m}\u001b[0m\u001b[32m)\u001b[0m\u001b[32m]\u001b[0m\u001b[32m\"\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'expected_answer'\u001b[0m: \u001b[32m'brave_search'\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m\n", + "\u001b[1m]\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "
ScoringScoreResponse(\n", + "│ results={\n", + "│ │ 'basic::subset_of': ScoringResult(\n", + "│ │ │ aggregated_results={'accuracy': {'accuracy': 0.3333333333333333, 'num_correct': 1.0, 'num_total': 3}},\n", + "│ │ │ score_rows=[{'score': 0.0}, {'score': 0.0}, {'score': 1.0}]\n", + "│ │ )\n", + "│ }\n", + ")\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mScoringScoreResponse\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mresults\u001b[0m=\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'basic::subset_of'\u001b[0m: \u001b[1;35mScoringResult\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[33maggregated_results\u001b[0m=\u001b[1m{\u001b[0m\u001b[32m'accuracy'\u001b[0m: \u001b[1m{\u001b[0m\u001b[32m'accuracy'\u001b[0m: \u001b[1;36m0.3333333333333333\u001b[0m, \u001b[32m'num_correct'\u001b[0m: \u001b[1;36m1.0\u001b[0m, \u001b[32m'num_total'\u001b[0m: \u001b[1;36m3\u001b[0m\u001b[1m}\u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[33mscore_rows\u001b[0m=\u001b[1m[\u001b[0m\u001b[1m{\u001b[0m\u001b[32m'score'\u001b[0m: \u001b[1;36m0.0\u001b[0m\u001b[1m}\u001b[0m, \u001b[1m{\u001b[0m\u001b[32m'score'\u001b[0m: \u001b[1;36m0.0\u001b[0m\u001b[1m}\u001b[0m, \u001b[1m{\u001b[0m\u001b[32m'score'\u001b[0m: \u001b[1;36m1.0\u001b[0m\u001b[1m}\u001b[0m\u001b[1m]\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m)\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m\n", + "\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "# post-process telemetry spance and prepare data for eval\n", + "# in this case, we want to assert that all user prompts is followed by a tool call\n", + "import ast\n", + "import json\n", + "\n", + "eval_rows = []\n", + "\n", + "for log in agent_logs:\n", + " last_msg = log['input'][-1]\n", + " if \"\\\"role\\\":\\\"user\\\"\" in last_msg:\n", + " eval_rows.append(\n", + " {\n", + " \"input_query\": last_msg,\n", + " \"generated_answer\": log[\"output\"],\n", + " # check if generated_answer uses tools brave_search\n", + " \"expected_answer\": \"brave_search\",\n", + " },\n", + " )\n", + "\n", + "pprint(eval_rows)\n", + "scoring_params = {\n", + " \"basic::subset_of\": None,\n", + "}\n", + "scoring_response = client.scoring.score(input_rows=eval_rows, scoring_functions=scoring_params)\n", + "pprint(scoring_response)" + ] + }, + { + "cell_type": "markdown", + "id": "IKbzhxcw5e_c", + "metadata": { + "id": "IKbzhxcw5e_c" + }, + "source": [ + "#### 3.2. Agentic Application Dataset Scoring\n", + "- Llama Stack offers a library of scoring functions and the `/scoring` API, allowing you to run evaluations on your pre-annotated AI application datasets.\n", + "\n", + "- In this example, we will work with an example RAG dataset you have built previously, label with an annotation, and use LLM-As-Judge with custom judge prompt for scoring. Please checkout our [Llama Stack Playground](https://llama-stack.readthedocs.io/en/latest/playground/index.html) for an interactive interface to upload datasets and run scorings." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "xG4Y84VQBb0g", + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 298 + }, + "id": "xG4Y84VQBb0g", + "outputId": "f61cebdf-f614-440c-d170-f1e873b542ef" + }, + "outputs": [ + { + "data": { + "text/html": [ + "
ScoringScoreResponse(\n", + "│ results={\n", + "│ │ 'llm-as-judge::base': ScoringResult(\n", + "│ │ │ aggregated_results={},\n", + "│ │ │ score_rows=[\n", + "│ │ │ │ {\n", + "│ │ │ │ │ 'score': 'B',\n", + "│ │ │ │ │ 'judge_feedback': 'Answer: B, Explanation: The GENERATED_RESPONSE is a superset of the EXPECTED_RESPONSE and is fully consistent with it. The GENERATED_RESPONSE provides more detailed information about the top 5 topics related to LoRA, while the EXPECTED_RESPONSE only mentions \"LoRA\". The GENERATED_RESPONSE expands on the topic, but does not conflict with the EXPECTED_RESPONSE.'\n", + "│ │ │ │ }\n", + "│ │ │ ]\n", + "│ │ ),\n", + "│ │ 'basic::subset_of': ScoringResult(\n", + "│ │ │ aggregated_results={'accuracy': 1.0, 'num_correct': 1.0, 'num_total': 1.0},\n", + "│ │ │ score_rows=[{'score': 1.0}]\n", + "│ │ )\n", + "│ }\n", + ")\n", + "\n" + ], + "text/plain": [ + "\u001b[1;35mScoringScoreResponse\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[33mresults\u001b[0m=\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'llm-as-judge::base'\u001b[0m: \u001b[1;35mScoringResult\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[33maggregated_results\u001b[0m=\u001b[1m{\u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[33mscore_rows\u001b[0m=\u001b[1m[\u001b[0m\n", + "\u001b[2;32m│ │ │ │ \u001b[0m\u001b[1m{\u001b[0m\n", + "\u001b[2;32m│ │ │ │ │ \u001b[0m\u001b[32m'score'\u001b[0m: \u001b[32m'B'\u001b[0m,\n", + "\u001b[2;32m│ │ │ │ │ \u001b[0m\u001b[32m'judge_feedback'\u001b[0m: \u001b[32m'Answer: B, Explanation: The GENERATED_RESPONSE is a superset of the EXPECTED_RESPONSE and is fully consistent with it. The GENERATED_RESPONSE provides more detailed information about the top 5 topics related to LoRA, while the EXPECTED_RESPONSE only mentions \"LoRA\". The GENERATED_RESPONSE expands on the topic, but does not conflict with the EXPECTED_RESPONSE.'\u001b[0m\n", + "\u001b[2;32m│ │ │ │ \u001b[0m\u001b[1m}\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[1m]\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m)\u001b[0m,\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[32m'basic::subset_of'\u001b[0m: \u001b[1;35mScoringResult\u001b[0m\u001b[1m(\u001b[0m\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[33maggregated_results\u001b[0m=\u001b[1m{\u001b[0m\u001b[32m'accuracy'\u001b[0m: \u001b[1;36m1.0\u001b[0m, \u001b[32m'num_correct'\u001b[0m: \u001b[1;36m1.0\u001b[0m, \u001b[32m'num_total'\u001b[0m: \u001b[1;36m1.0\u001b[0m\u001b[1m}\u001b[0m,\n", + "\u001b[2;32m│ │ │ \u001b[0m\u001b[33mscore_rows\u001b[0m=\u001b[1m[\u001b[0m\u001b[1m{\u001b[0m\u001b[32m'score'\u001b[0m: \u001b[1;36m1.0\u001b[0m\u001b[1m}\u001b[0m\u001b[1m]\u001b[0m\n", + "\u001b[2;32m│ │ \u001b[0m\u001b[1m)\u001b[0m\n", + "\u001b[2;32m│ \u001b[0m\u001b[1m}\u001b[0m\n", + "\u001b[1m)\u001b[0m\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "import rich\n", + "from rich.pretty import pprint\n", + "\n", + "judge_model_id = \"meta-llama/Llama-3.1-405B-Instruct-FP8\"\n", + "\n", + "JUDGE_PROMPT = \"\"\"\n", + "Given a QUESTION and GENERATED_RESPONSE and EXPECTED_RESPONSE.\n", + "\n", + "Compare the factual content of the GENERATED_RESPONSE with the EXPECTED_RESPONSE. Ignore any differences in style, grammar, or punctuation.\n", + " The GENERATED_RESPONSE may either be a subset or superset of the EXPECTED_RESPONSE, or it may conflict with it. Determine which case applies. Answer the question by selecting one of the following options:\n", + " (A) The GENERATED_RESPONSE is a subset of the EXPECTED_RESPONSE and is fully consistent with it.\n", + " (B) The GENERATED_RESPONSE is a superset of the EXPECTED_RESPONSE and is fully consistent with it.\n", + " (C) The GENERATED_RESPONSE contains all the same details as the EXPECTED_RESPONSE.\n", + " (D) There is a disagreement between the GENERATED_RESPONSE and the EXPECTED_RESPONSE.\n", + " (E) The answers differ, but these differences don't matter from the perspective of factuality.\n", + "\n", + "Give your answer in the format \"Answer: One of ABCDE, Explanation: \".\n", + "\n", + "Your actual task:\n", + "\n", + "QUESTION: {input_query}\n", + "GENERATED_RESPONSE: {generated_answer}\n", + "EXPECTED_RESPONSE: {expected_answer}\n", + "\"\"\"\n", + "\n", + "input_query = \"What are the top 5 topics that were explained? Only list succinct bullet points.\"\n", + "generated_answer = \"\"\"\n", + "Here are the top 5 topics that were explained in the documentation for Torchtune:\n", + "\n", + "* What is LoRA and how does it work?\n", + "* Fine-tuning with LoRA: memory savings and parameter-efficient finetuning\n", + "* Running a LoRA finetune with Torchtune: overview and recipe\n", + "* Experimenting with different LoRA configurations: rank, alpha, and attention modules\n", + "* LoRA finetuning\n", + "\"\"\"\n", + "expected_answer = \"\"\"LoRA\"\"\"\n", + "\n", + "rows = [\n", + " {\n", + " \"input_query\": input_query,\n", + " \"generated_answer\": generated_answer,\n", + " \"expected_answer\": expected_answer,\n", + " },\n", + "]\n", + "\n", + "scoring_params = {\n", + " \"llm-as-judge::base\": {\n", + " \"judge_model\": judge_model_id,\n", + " \"prompt_template\": JUDGE_PROMPT,\n", + " \"type\": \"llm_as_judge\",\n", + " \"judge_score_regexes\": [\"Answer: (A|B|C|D|E)\"],\n", + " },\n", + " \"basic::subset_of\": None,\n", + "}\n", + "\n", + "response = client.scoring.score(input_rows=rows, scoring_functions=scoring_params)\n", + "pprint(response)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "rKtGo_v98UA2", + "metadata": { + "id": "rKtGo_v98UA2" + }, + "outputs": [], + "source": [] + } + ], + "metadata": { + "colab": { + "collapsed_sections": [ + "_JueJAKyJR5m" + ], + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.15" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "0243626d7ef44ef2b90e8fed5c13183d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "044d6d8dda1c4935b1752a9c71c6ee4a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_63f34c3d43bb4fdd9faeb6161fd77285", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_5cb841b49eaa429e8616ec4b78f501e9", + "value": 1 + } + }, + "0640b57408644741970dd958ca0e21e6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_6259ffc3ef674df985fd3fa4334f9c8e", + "IPY_MODEL_3d0376d2e574410eb4ef963d51cac0a6", + "IPY_MODEL_b66984cc5de541a5801a1e6e54d40daf" + ], + "layout": "IPY_MODEL_92135b9cb201475681ee0886887c84a8" + } + }, + "116139bfe7a44f969a2c97490c224d31": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_ab1f339cba094c918fc5507f8361de5c", + "placeholder": "", + "style": "IPY_MODEL_a6a1eb412f204578b80e5b6717c1e3a5", + "value": " 1/1 [00:01<00:00, 1.27s/it]" + } + }, + "118b359b83304ae59fad57e28f621645": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "15d3ff07f1c54e58b51d452caca01209": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "17603dd7fedf4798a74533fbfd5bb421": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "186682be50c148c0826fa7c314087562": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_1f427d4273e04e19b1bdb13388736c01", + "placeholder": "", + "style": "IPY_MODEL_38897429b7cf4077aea3a981593ca866", + "value": " 1/1 [00:00<00:00, 15.09it/s]" + } + }, + "1f427d4273e04e19b1bdb13388736c01": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2082554eed6644a996f0e31545789e08": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_a0be415018644c3cac098ab9b19c2391", + "IPY_MODEL_6ede3649e8c24015b3ca77490568bfcd", + "IPY_MODEL_116139bfe7a44f969a2c97490c224d31" + ], + "layout": "IPY_MODEL_243d13828d854880a6adb861ea867734" + } + }, + "2100363a158b4488a58620983aa5bdd4": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "243d13828d854880a6adb861ea867734": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "277101c35a784e6caf455a13cd9b8e59": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "2924814bab5748ddbeeedc70d324195e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_4738bccc6b384da5a20a8bcd61ecec59", + "IPY_MODEL_044d6d8dda1c4935b1752a9c71c6ee4a", + "IPY_MODEL_9277709ad9154d7b8f37d08db84ee425" + ], + "layout": "IPY_MODEL_f3f1f2487d6f455caeb6ec71a2d51ee2" + } + }, + "2958af7c9cdb46038e0336d6b7c6773e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "351928faa62543128e0bd29bf89bbf79": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "38897429b7cf4077aea3a981593ca866": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "3978f618c4f8467eb83c63a8f5aef98a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3d0376d2e574410eb4ef963d51cac0a6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_9054d3825edb49cb9c35d24023f50c03", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_3978f618c4f8467eb83c63a8f5aef98a", + "value": 1 + } + }, + "425c6c0eaed741669551b9af77096c6f": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_d124b09896934d289df649375f455a8e", + "IPY_MODEL_554cff1a83d44bd2bbd36fd43acac7e2", + "IPY_MODEL_d0381718fc8b49a6ac7e7fe85cabba90" + ], + "layout": "IPY_MODEL_fd3daaf9093d45d8a9d39b87835f4582" + } + }, + "457374ae3035496eb943ad21484f76a0": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_bcf4679dda2d4767a0a24cbf236ca76e", + "IPY_MODEL_6e4ce98853c84beca11471e7ea9d97df", + "IPY_MODEL_186682be50c148c0826fa7c314087562" + ], + "layout": "IPY_MODEL_e1ef246e3e6c4359b7b61c341119e121" + } + }, + "45b569d733f944d29cefae8a5d13b215": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4738bccc6b384da5a20a8bcd61ecec59": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_66c92a8a89234a61a8c688cf1c3e29a1", + "placeholder": "", + "style": "IPY_MODEL_ee1f4a0c85e44a3b849283337743a8d4", + "value": "Batches: 100%" + } + }, + "4a405d391b974e58a2c4fe00d4bb5815": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "4ad57f5d8a824afab639e8606ee43ca6": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "53865d3f918e468ab53504133b127973": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "554cff1a83d44bd2bbd36fd43acac7e2": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_6c60c8291e734f549e6c5a46b427b974", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_de88640505c24928904a3c76bda31c70", + "value": 1 + } + }, + "5afdb88e0159462e98773560e3dad439": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f7bc4df675a141e380d965138552a142", + "IPY_MODEL_d7bf8b49145843ac98a6de424e628729", + "IPY_MODEL_8fb17faf68524de2b73321d71b80b407" + ], + "layout": "IPY_MODEL_45b569d733f944d29cefae8a5d13b215" + } + }, + "5cb841b49eaa429e8616ec4b78f501e9": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "5f19dab8c6da4050bc47fd78838f7530": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "6259ffc3ef674df985fd3fa4334f9c8e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_4a405d391b974e58a2c4fe00d4bb5815", + "placeholder": "", + "style": "IPY_MODEL_2958af7c9cdb46038e0336d6b7c6773e", + "value": "Batches: 100%" + } + }, + "63f34c3d43bb4fdd9faeb6161fd77285": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "66c92a8a89234a61a8c688cf1c3e29a1": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "6c60c8291e734f549e6c5a46b427b974": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "6e4ce98853c84beca11471e7ea9d97df": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_a0ac7ee92d994c7b9b74e580ab2acdf7", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_118b359b83304ae59fad57e28f621645", + "value": 1 + } + }, + "6ede3649e8c24015b3ca77490568bfcd": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_f10237315e794539a00ca82bfff930be", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_ca09d2207b00456da4c37b5a782a190c", + "value": 1 + } + }, + "753dbe7891a143118b55eccf8c252e03": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "8fb17faf68524de2b73321d71b80b407": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_277101c35a784e6caf455a13cd9b8e59", + "placeholder": "", + "style": "IPY_MODEL_d06666f765764f949e1876f2d5d67242", + "value": " 1/1 [00:01<00:00, 1.68s/it]" + } + }, + "9054d3825edb49cb9c35d24023f50c03": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "92135b9cb201475681ee0886887c84a8": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "9277709ad9154d7b8f37d08db84ee425": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_a447ea9af3e14e5e94eb14ed8dd3c0de", + "placeholder": "", + "style": "IPY_MODEL_0243626d7ef44ef2b90e8fed5c13183d", + "value": " 1/1 [00:02<00:00, 2.65s/it]" + } + }, + "a0ac7ee92d994c7b9b74e580ab2acdf7": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a0be415018644c3cac098ab9b19c2391": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e4b1dfe159304c5f88766b33e85a5c19", + "placeholder": "", + "style": "IPY_MODEL_2100363a158b4488a58620983aa5bdd4", + "value": "Batches: 100%" + } + }, + "a447ea9af3e14e5e94eb14ed8dd3c0de": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "a6a1eb412f204578b80e5b6717c1e3a5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "ab1f339cba094c918fc5507f8361de5c": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "b66984cc5de541a5801a1e6e54d40daf": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_efd68f6dc0b3428e8f5fc830c1bf2341", + "placeholder": "", + "style": "IPY_MODEL_4ad57f5d8a824afab639e8606ee43ca6", + "value": " 1/1 [00:00<00:00, 5.36it/s]" + } + }, + "bbb93c771a9c453bb90e729b1f73b931": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "bcf4679dda2d4767a0a24cbf236ca76e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_bbb93c771a9c453bb90e729b1f73b931", + "placeholder": "", + "style": "IPY_MODEL_351928faa62543128e0bd29bf89bbf79", + "value": "Batches: 100%" + } + }, + "ca09d2207b00456da4c37b5a782a190c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "ce7de1af99434ad38a9382e7253dbfc0": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "d0381718fc8b49a6ac7e7fe85cabba90": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_fc086d0dd1a745308c59ae219ae135c5", + "placeholder": "", + "style": "IPY_MODEL_15d3ff07f1c54e58b51d452caca01209", + "value": " 1/1 [00:00<00:00, 14.36it/s]" + } + }, + "d06666f765764f949e1876f2d5d67242": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "d124b09896934d289df649375f455a8e": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_753dbe7891a143118b55eccf8c252e03", + "placeholder": "", + "style": "IPY_MODEL_ce7de1af99434ad38a9382e7253dbfc0", + "value": "Batches: 100%" + } + }, + "d7bf8b49145843ac98a6de424e628729": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_17603dd7fedf4798a74533fbfd5bb421", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_5f19dab8c6da4050bc47fd78838f7530", + "value": 1 + } + }, + "de88640505c24928904a3c76bda31c70": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "e1ef246e3e6c4359b7b61c341119e121": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e4b1dfe159304c5f88766b33e85a5c19": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ee1f4a0c85e44a3b849283337743a8d4": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "efd68f6dc0b3428e8f5fc830c1bf2341": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f10237315e794539a00ca82bfff930be": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f3f1f2487d6f455caeb6ec71a2d51ee2": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f7bc4df675a141e380d965138552a142": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_fdd057a4506f4f119d945bab5b930799", + "placeholder": "", + "style": "IPY_MODEL_53865d3f918e468ab53504133b127973", + "value": "Batches: 100%" + } + }, + "fc086d0dd1a745308c59ae219ae135c5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "fd3daaf9093d45d8a9d39b87835f4582": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "fdd057a4506f4f119d945bab5b930799": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/source/getting_started/index.md b/docs/source/getting_started/index.md index c6227db99..d7c3fe9e5 100644 --- a/docs/source/getting_started/index.md +++ b/docs/source/getting_started/index.md @@ -43,7 +43,7 @@ Configuration for this is available at `distributions/ollama/run.yaml`. ### 3. Use the Llama Stack client SDK -You can interact with the Llama Stack server using various client SDKs. We will use the Python SDK which you can install using: +You can interact with the Llama Stack server using various client SDKs. We will use the Python SDK which you can install using the following command. Note that you must be using Python 3.10 or newer: ```bash pip install llama-stack-client ``` @@ -51,7 +51,8 @@ pip install llama-stack-client Let's use the `llama-stack-client` CLI to check the connectivity to the server. ```bash -llama-stack-client --endpoint http://localhost:$LLAMA_STACK_PORT models list +llama-stack-client configure --endpoint http://localhost:$LLAMA_STACK_PORT +llama-stack-client models list ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┓ ┃ identifier ┃ provider_id ┃ provider_resource_id ┃ metadata ┃ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━┩ @@ -61,7 +62,7 @@ llama-stack-client --endpoint http://localhost:$LLAMA_STACK_PORT models list You can test basic Llama inference completion using the CLI too. ```bash -llama-stack-client --endpoint http://localhost:$LLAMA_STACK_PORT \ +llama-stack-client \ inference chat-completion \ --message "hello, what model are you?" ``` @@ -153,10 +154,3 @@ if __name__ == "__main__": - Learn how to [Build Llama Stacks](../distributions/index.md) - See [References](../references/index.md) for more details about the llama CLI and Python SDK - For example applications and more detailed tutorials, visit our [llama-stack-apps](https://github.com/meta-llama/llama-stack-apps/tree/main/examples) repository. - - -## Thinking out aloud here in terms of what to write in the docs - -- how to get a llama stack server running -- what are all the different client sdks -- what are the components of building agents diff --git a/docs/source/playground/index.md b/docs/source/playground/index.md index e15b4a48e..d74bf1a03 100644 --- a/docs/source/playground/index.md +++ b/docs/source/playground/index.md @@ -16,7 +16,7 @@ Interactive pages for users to play with and explore Llama Stack API capabilitie ##### Chatbot ```{eval-rst} -.. video:: https://github.com/user-attachments/assets/6ca617e8-32ca-49b2-9774-185020ff5204 +.. video:: https://github.com/user-attachments/assets/8d2ef802-5812-4a28-96e1-316038c84cbf :autoplay: :playsinline: :muted: diff --git a/docs/source/references/evals_reference/index.md b/docs/source/references/evals_reference/index.md index 9ba4f2848..f93b56e64 100644 --- a/docs/source/references/evals_reference/index.md +++ b/docs/source/references/evals_reference/index.md @@ -47,7 +47,7 @@ This first example walks you through how to evaluate a model candidate served by - [SimpleQA](https://openai.com/index/introducing-simpleqa/): Benchmark designed to access models to answer short, fact-seeking questions. #### 1.1 Running MMMU -- We will use a pre-processed MMMU dataset from [llamastack/mmmu](https://huggingface.co/datasets/llamastack/mmmu). The preprocessing code is shown in in this [Github Gist](https://gist.github.com/yanxi0830/118e9c560227d27132a7fd10e2c92840). The dataset is obtained by transforming the original [MMMU/MMMU](https://huggingface.co/datasets/MMMU/MMMU) dataset into correct format by `inference/chat-completion` API. +- We will use a pre-processed MMMU dataset from [llamastack/mmmu](https://huggingface.co/datasets/llamastack/mmmu). The preprocessing code is shown in this [GitHub Gist](https://gist.github.com/yanxi0830/118e9c560227d27132a7fd10e2c92840). The dataset is obtained by transforming the original [MMMU/MMMU](https://huggingface.co/datasets/MMMU/MMMU) dataset into correct format by `inference/chat-completion` API. ```python import datasets diff --git a/docs/zero_to_hero_guide/06_Safety101.ipynb b/docs/zero_to_hero_guide/06_Safety101.ipynb index 6b5bd53bf..e2ba5e22e 100644 --- a/docs/zero_to_hero_guide/06_Safety101.ipynb +++ b/docs/zero_to_hero_guide/06_Safety101.ipynb @@ -67,7 +67,7 @@ "from termcolor import cprint\n", "\n", "from llama_stack.distribution.datatypes import RemoteProviderConfig\n", - "from llama_stack.apis.safety import * # noqa: F403\n", + "from llama_stack.apis.safety import Safety\n", "from llama_stack_client import LlamaStackClient\n", "\n", "\n", @@ -127,7 +127,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.15" + "version": "3.11.10" } }, "nbformat": 4, diff --git a/llama_stack/apis/agents/agents.py b/llama_stack/apis/agents/agents.py index 5fd90ae7a..5748b4e41 100644 --- a/llama_stack/apis/agents/agents.py +++ b/llama_stack/apis/agents/agents.py @@ -18,18 +18,30 @@ from typing import ( Union, ) +from llama_models.llama3.api.datatypes import ToolParamDefinition + from llama_models.schema_utils import json_schema_type, webmethod from pydantic import BaseModel, ConfigDict, Field from typing_extensions import Annotated -from llama_stack.providers.utils.telemetry.trace_protocol import trace_protocol -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.common.deployment_types import * # noqa: F403 -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 -from llama_stack.apis.memory import * # noqa: F403 from llama_stack.apis.common.content_types import InterleavedContent, URL +from llama_stack.apis.common.deployment_types import RestAPIExecutionConfig +from llama_stack.apis.inference import ( + CompletionMessage, + SamplingParams, + ToolCall, + ToolCallDelta, + ToolChoice, + ToolPromptFormat, + ToolResponse, + ToolResponseMessage, + UserMessage, +) +from llama_stack.apis.memory import MemoryBank +from llama_stack.apis.safety import SafetyViolation + +from llama_stack.providers.utils.telemetry.trace_protocol import trace_protocol @json_schema_type diff --git a/llama_stack/apis/agents/event_logger.py b/llama_stack/apis/agents/event_logger.py index 4c379999e..40a69d19c 100644 --- a/llama_stack/apis/agents/event_logger.py +++ b/llama_stack/apis/agents/event_logger.py @@ -6,13 +6,14 @@ from typing import Optional -from llama_models.llama3.api.datatypes import * # noqa: F403 +from llama_models.llama3.api.datatypes import ToolPromptFormat from llama_models.llama3.api.tool_utils import ToolUtils - from termcolor import cprint from llama_stack.apis.agents import AgentTurnResponseEventType, StepType +from llama_stack.apis.inference import ToolResponseMessage + class LogEvent: def __init__( diff --git a/llama_stack/apis/batch_inference/batch_inference.py b/llama_stack/apis/batch_inference/batch_inference.py index 358cf3c35..f7b8b4387 100644 --- a/llama_stack/apis/batch_inference/batch_inference.py +++ b/llama_stack/apis/batch_inference/batch_inference.py @@ -10,8 +10,16 @@ from llama_models.schema_utils import json_schema_type, webmethod from pydantic import BaseModel, Field -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.inference import * # noqa: F403 +from llama_stack.apis.inference import ( + CompletionMessage, + InterleavedContent, + LogProbConfig, + Message, + SamplingParams, + ToolChoice, + ToolDefinition, + ToolPromptFormat, +) @json_schema_type diff --git a/llama_stack/apis/common/content_types.py b/llama_stack/apis/common/content_types.py index 121218a29..629e0e94d 100644 --- a/llama_stack/apis/common/content_types.py +++ b/llama_stack/apis/common/content_types.py @@ -4,11 +4,12 @@ # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. +import base64 from typing import Annotated, List, Literal, Optional, Union from llama_models.schema_utils import json_schema_type, register_schema -from pydantic import BaseModel, Field, model_validator +from pydantic import BaseModel, Field, field_serializer, model_validator @json_schema_type @@ -27,6 +28,12 @@ class _URLOrData(BaseModel): return values return {"url": values} + @field_serializer("data") + def serialize_data(self, data: Optional[bytes], _info): + if data is None: + return None + return base64.b64encode(data).decode("utf-8") + @json_schema_type class ImageContentItem(_URLOrData): diff --git a/llama_stack/apis/datasetio/datasetio.py b/llama_stack/apis/datasetio/datasetio.py index 22acc3211..983e0e4ea 100644 --- a/llama_stack/apis/datasetio/datasetio.py +++ b/llama_stack/apis/datasetio/datasetio.py @@ -9,7 +9,7 @@ from typing import Any, Dict, List, Optional, Protocol, runtime_checkable from llama_models.schema_utils import json_schema_type, webmethod from pydantic import BaseModel -from llama_stack.apis.datasets import * # noqa: F403 +from llama_stack.apis.datasets import Dataset @json_schema_type diff --git a/llama_stack/apis/eval/eval.py b/llama_stack/apis/eval/eval.py index 2e0ce1fbc..2592bca37 100644 --- a/llama_stack/apis/eval/eval.py +++ b/llama_stack/apis/eval/eval.py @@ -4,18 +4,18 @@ # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import Literal, Optional, Protocol, Union +from typing import Any, Dict, List, Literal, Optional, Protocol, Union + +from llama_models.llama3.api.datatypes import BaseModel, Field +from llama_models.schema_utils import json_schema_type, webmethod from typing_extensions import Annotated -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_models.schema_utils import json_schema_type, webmethod -from llama_stack.apis.scoring_functions import * # noqa: F403 from llama_stack.apis.agents import AgentConfig from llama_stack.apis.common.job_types import Job, JobStatus -from llama_stack.apis.scoring import * # noqa: F403 -from llama_stack.apis.eval_tasks import * # noqa: F403 from llama_stack.apis.inference import SamplingParams, SystemMessage +from llama_stack.apis.scoring import ScoringResult +from llama_stack.apis.scoring_functions import ScoringFnParams @json_schema_type diff --git a/llama_stack/apis/inference/inference.py b/llama_stack/apis/inference/inference.py index 28b9d9106..e48042091 100644 --- a/llama_stack/apis/inference/inference.py +++ b/llama_stack/apis/inference/inference.py @@ -7,7 +7,9 @@ from enum import Enum from typing import ( + Any, AsyncIterator, + Dict, List, Literal, Optional, @@ -32,8 +34,9 @@ from typing_extensions import Annotated from llama_stack.apis.common.content_types import InterleavedContent +from llama_stack.apis.models import Model + from llama_stack.providers.utils.telemetry.trace_protocol import trace_protocol -from llama_stack.apis.models import * # noqa: F403 class LogProbConfig(BaseModel): diff --git a/llama_stack/apis/post_training/post_training.py b/llama_stack/apis/post_training/post_training.py index fdbaa364d..1c2d2d6e2 100644 --- a/llama_stack/apis/post_training/post_training.py +++ b/llama_stack/apis/post_training/post_training.py @@ -7,17 +7,17 @@ from datetime import datetime from enum import Enum -from typing import Any, Dict, List, Optional, Protocol, Union +from typing import Any, Dict, List, Literal, Optional, Protocol, Union from llama_models.schema_utils import json_schema_type, webmethod from pydantic import BaseModel, Field from typing_extensions import Annotated -from llama_models.llama3.api.datatypes import * # noqa: F403 +from llama_stack.apis.common.content_types import URL + from llama_stack.apis.common.job_types import JobStatus -from llama_stack.apis.datasets import * # noqa: F403 -from llama_stack.apis.common.training_types import * # noqa: F403 +from llama_stack.apis.common.training_types import Checkpoint @json_schema_type diff --git a/llama_stack/apis/resource.py b/llama_stack/apis/resource.py index 93a3718a0..a85f5a31c 100644 --- a/llama_stack/apis/resource.py +++ b/llama_stack/apis/resource.py @@ -18,6 +18,8 @@ class ResourceType(Enum): dataset = "dataset" scoring_function = "scoring_function" eval_task = "eval_task" + tool = "tool" + tool_group = "tool_group" class Resource(BaseModel): diff --git a/llama_stack/apis/scoring/scoring.py b/llama_stack/apis/scoring/scoring.py index e062d0f57..996291dcc 100644 --- a/llama_stack/apis/scoring/scoring.py +++ b/llama_stack/apis/scoring/scoring.py @@ -4,13 +4,12 @@ # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import Any, Dict, List, Protocol, runtime_checkable +from typing import Any, Dict, List, Optional, Protocol, runtime_checkable from llama_models.schema_utils import json_schema_type, webmethod from pydantic import BaseModel -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.scoring_functions import * # noqa: F403 +from llama_stack.apis.scoring_functions import ScoringFn, ScoringFnParams # mapping of metric to value diff --git a/llama_stack/apis/synthetic_data_generation/synthetic_data_generation.py b/llama_stack/apis/synthetic_data_generation/synthetic_data_generation.py index 4ffaa4d1e..13b209912 100644 --- a/llama_stack/apis/synthetic_data_generation/synthetic_data_generation.py +++ b/llama_stack/apis/synthetic_data_generation/synthetic_data_generation.py @@ -6,13 +6,12 @@ from enum import Enum -from typing import Any, Dict, List, Optional, Protocol +from typing import Any, Dict, List, Optional, Protocol, Union from llama_models.schema_utils import json_schema_type, webmethod from pydantic import BaseModel -from llama_models.llama3.api.datatypes import * # noqa: F403 from llama_stack.apis.inference import Message diff --git a/llama_stack/apis/tools/__init__.py b/llama_stack/apis/tools/__init__.py new file mode 100644 index 000000000..f747fcdc2 --- /dev/null +++ b/llama_stack/apis/tools/__init__.py @@ -0,0 +1,7 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. +# All rights reserved. +# +# This source code is licensed under the terms described in the LICENSE file in +# the root directory of this source tree. + +from .tools import * # noqa: F401 F403 diff --git a/llama_stack/apis/tools/tools.py b/llama_stack/apis/tools/tools.py new file mode 100644 index 000000000..23110543b --- /dev/null +++ b/llama_stack/apis/tools/tools.py @@ -0,0 +1,141 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. +# All rights reserved. +# +# This source code is licensed under the terms described in the LICENSE file in +# the root directory of this source tree. + +from typing import Annotated, Any, Dict, List, Literal, Optional, Union + +from llama_models.llama3.api.datatypes import ToolPromptFormat +from llama_models.schema_utils import json_schema_type, register_schema, webmethod +from pydantic import BaseModel, Field +from typing_extensions import Protocol, runtime_checkable + +from llama_stack.apis.common.content_types import InterleavedContent, URL +from llama_stack.apis.resource import Resource, ResourceType +from llama_stack.providers.utils.telemetry.trace_protocol import trace_protocol + + +@json_schema_type +class ToolParameter(BaseModel): + name: str + parameter_type: str + description: str + + +@json_schema_type +class Tool(Resource): + type: Literal[ResourceType.tool.value] = ResourceType.tool.value + tool_group: str + description: str + parameters: List[ToolParameter] + provider_id: Optional[str] = None + metadata: Optional[Dict[str, Any]] = None + tool_prompt_format: Optional[ToolPromptFormat] = Field( + default=ToolPromptFormat.json + ) + + +@json_schema_type +class ToolDef(BaseModel): + name: str + description: str + parameters: List[ToolParameter] + metadata: Dict[str, Any] + tool_prompt_format: Optional[ToolPromptFormat] = Field( + default=ToolPromptFormat.json + ) + + +@json_schema_type +class MCPToolGroupDef(BaseModel): + """ + A tool group that is defined by in a model context protocol server. + Refer to https://modelcontextprotocol.io/docs/concepts/tools for more information. + """ + + type: Literal["model_context_protocol"] = "model_context_protocol" + endpoint: URL + + +@json_schema_type +class UserDefinedToolGroupDef(BaseModel): + type: Literal["user_defined"] = "user_defined" + tools: List[ToolDef] + + +ToolGroupDef = register_schema( + Annotated[ + Union[MCPToolGroupDef, UserDefinedToolGroupDef], Field(discriminator="type") + ], + name="ToolGroup", +) + + +class ToolGroup(Resource): + type: Literal[ResourceType.tool_group.value] = ResourceType.tool_group.value + + +@json_schema_type +class ToolInvocationResult(BaseModel): + content: InterleavedContent + error_message: Optional[str] = None + error_code: Optional[int] = None + + +class ToolStore(Protocol): + def get_tool(self, tool_name: str) -> Tool: ... + + +@runtime_checkable +@trace_protocol +class ToolGroups(Protocol): + @webmethod(route="/toolgroups/register", method="POST") + async def register_tool_group( + self, + tool_group_id: str, + tool_group: ToolGroupDef, + provider_id: Optional[str] = None, + ) -> None: + """Register a tool group""" + ... + + @webmethod(route="/toolgroups/get", method="GET") + async def get_tool_group( + self, + tool_group_id: str, + ) -> ToolGroup: ... + + @webmethod(route="/toolgroups/list", method="GET") + async def list_tool_groups(self) -> List[ToolGroup]: + """List tool groups with optional provider""" + ... + + @webmethod(route="/tools/list", method="GET") + async def list_tools(self, tool_group_id: Optional[str] = None) -> List[Tool]: + """List tools with optional tool group""" + ... + + @webmethod(route="/tools/get", method="GET") + async def get_tool(self, tool_name: str) -> Tool: ... + + @webmethod(route="/toolgroups/unregister", method="POST") + async def unregister_tool_group(self, tool_group_id: str) -> None: + """Unregister a tool group""" + ... + + +@runtime_checkable +@trace_protocol +class ToolRuntime(Protocol): + tool_store: ToolStore + + @webmethod(route="/tool-runtime/discover", method="POST") + async def discover_tools(self, tool_group: ToolGroupDef) -> List[ToolDef]: ... + + @webmethod(route="/tool-runtime/invoke", method="POST") + async def invoke_tool( + self, tool_name: str, args: Dict[str, Any] + ) -> ToolInvocationResult: + """Run a tool with the given arguments""" + ... diff --git a/llama_stack/cli/model/safety_models.py b/llama_stack/cli/model/safety_models.py index 39c133f73..9464e0a2d 100644 --- a/llama_stack/cli/model/safety_models.py +++ b/llama_stack/cli/model/safety_models.py @@ -6,11 +6,12 @@ from typing import Any, Dict, Optional -from pydantic import BaseModel, ConfigDict, Field - -from llama_models.datatypes import * # noqa: F403 +from llama_models.datatypes import CheckpointQuantizationFormat +from llama_models.llama3.api.datatypes import SamplingParams from llama_models.sku_list import LlamaDownloadInfo +from pydantic import BaseModel, ConfigDict, Field + class PromptGuardModel(BaseModel): """Make a 'fake' Model-like object for Prompt Guard. Eventually this will be removed.""" diff --git a/llama_stack/cli/stack/build.py b/llama_stack/cli/stack/build.py index 0cb873b57..54d78ad93 100644 --- a/llama_stack/cli/stack/build.py +++ b/llama_stack/cli/stack/build.py @@ -3,21 +3,28 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. - import argparse - -from llama_stack.cli.subcommand import Subcommand -from llama_stack.distribution.datatypes import * # noqa: F403 import os import shutil from functools import lru_cache from pathlib import Path +from typing import List, Optional import pkg_resources +from llama_stack.cli.subcommand import Subcommand + +from llama_stack.distribution.datatypes import ( + BuildConfig, + DistributionSpec, + Provider, + StackRunConfig, +) + from llama_stack.distribution.distribution import get_provider_registry from llama_stack.distribution.resolver import InvalidProviderError from llama_stack.distribution.utils.dynamic import instantiate_class_type +from llama_stack.providers.datatypes import Api TEMPLATES_PATH = Path(__file__).parent.parent.parent / "templates" @@ -100,7 +107,7 @@ class StackBuild(Subcommand): build_config.image_type = args.image_type else: self.parser.error( - f"Please specify a image-type (docker | conda) for {args.template}" + f"Please specify a image-type (docker | conda | venv) for {args.template}" ) self._run_stack_build_command_from_build_config( build_config, template_name=args.template @@ -122,7 +129,7 @@ class StackBuild(Subcommand): ) image_type = prompt( - "> Enter the image type you want your Llama Stack to be built as (docker or conda): ", + "> Enter the image type you want your Llama Stack to be built as (docker or conda or venv): ", validator=Validator.from_callable( lambda x: x in ["docker", "conda", "venv"], error_message="Invalid image type, please enter conda or docker or venv", diff --git a/llama_stack/distribution/build.py b/llama_stack/distribution/build.py index bdda0349f..f376301f9 100644 --- a/llama_stack/distribution/build.py +++ b/llama_stack/distribution/build.py @@ -6,21 +6,22 @@ import logging from enum import Enum -from typing import List + +from pathlib import Path +from typing import Dict, List import pkg_resources from pydantic import BaseModel from termcolor import cprint -from llama_stack.distribution.utils.exec import run_with_pty - -from llama_stack.distribution.datatypes import * # noqa: F403 -from pathlib import Path +from llama_stack.distribution.datatypes import BuildConfig, Provider from llama_stack.distribution.distribution import get_provider_registry from llama_stack.distribution.utils.config_dirs import BUILDS_BASE_DIR +from llama_stack.distribution.utils.exec import run_with_pty +from llama_stack.providers.datatypes import Api log = logging.getLogger(__name__) diff --git a/llama_stack/distribution/configure.py b/llama_stack/distribution/configure.py index a4d0f970b..71c2676de 100644 --- a/llama_stack/distribution/configure.py +++ b/llama_stack/distribution/configure.py @@ -6,10 +6,14 @@ import logging import textwrap -from typing import Any - -from llama_stack.distribution.datatypes import * # noqa: F403 +from typing import Any, Dict +from llama_stack.distribution.datatypes import ( + DistributionSpec, + LLAMA_STACK_RUN_CONFIG_VERSION, + Provider, + StackRunConfig, +) from llama_stack.distribution.distribution import ( builtin_automatically_routed_apis, get_provider_registry, @@ -17,10 +21,7 @@ from llama_stack.distribution.distribution import ( from llama_stack.distribution.utils.dynamic import instantiate_class_type from llama_stack.distribution.utils.prompt_for_config import prompt_for_config - -from llama_stack.apis.models import * # noqa: F403 -from llama_stack.apis.shields import * # noqa: F403 -from llama_stack.apis.memory_banks import * # noqa: F403 +from llama_stack.providers.datatypes import Api, ProviderSpec logger = logging.getLogger(__name__) diff --git a/llama_stack/distribution/datatypes.py b/llama_stack/distribution/datatypes.py index 1159372d4..dec62bfae 100644 --- a/llama_stack/distribution/datatypes.py +++ b/llama_stack/distribution/datatypes.py @@ -4,23 +4,24 @@ # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import Dict, List, Optional, Union +from typing import Annotated, Any, Dict, List, Optional, Union from pydantic import BaseModel, Field -from llama_stack.providers.datatypes import * # noqa: F403 -from llama_stack.apis.models import * # noqa: F403 -from llama_stack.apis.shields import * # noqa: F403 -from llama_stack.apis.memory_banks import * # noqa: F403 -from llama_stack.apis.datasets import * # noqa: F403 -from llama_stack.apis.scoring_functions import * # noqa: F403 from llama_stack.apis.datasetio import DatasetIO +from llama_stack.apis.datasets import Dataset, DatasetInput from llama_stack.apis.eval import Eval -from llama_stack.apis.eval_tasks import EvalTaskInput +from llama_stack.apis.eval_tasks import EvalTask, EvalTaskInput from llama_stack.apis.inference import Inference from llama_stack.apis.memory import Memory +from llama_stack.apis.memory_banks import MemoryBank, MemoryBankInput +from llama_stack.apis.models import Model, ModelInput from llama_stack.apis.safety import Safety from llama_stack.apis.scoring import Scoring +from llama_stack.apis.scoring_functions import ScoringFn, ScoringFnInput +from llama_stack.apis.shields import Shield, ShieldInput +from llama_stack.apis.tools import Tool, ToolGroup, ToolRuntime +from llama_stack.providers.datatypes import Api, ProviderSpec from llama_stack.providers.utils.kvstore.config import KVStoreConfig LLAMA_STACK_BUILD_CONFIG_VERSION = "2" @@ -37,6 +38,8 @@ RoutableObject = Union[ Dataset, ScoringFn, EvalTask, + Tool, + ToolGroup, ] @@ -48,6 +51,8 @@ RoutableObjectWithProvider = Annotated[ Dataset, ScoringFn, EvalTask, + Tool, + ToolGroup, ], Field(discriminator="type"), ] @@ -59,6 +64,7 @@ RoutedProtocol = Union[ DatasetIO, Scoring, Eval, + ToolRuntime, ] diff --git a/llama_stack/distribution/distribution.py b/llama_stack/distribution/distribution.py index 6fc4545c7..4183d92cd 100644 --- a/llama_stack/distribution/distribution.py +++ b/llama_stack/distribution/distribution.py @@ -47,6 +47,10 @@ def builtin_automatically_routed_apis() -> List[AutoRoutedApiInfo]: routing_table_api=Api.eval_tasks, router_api=Api.eval, ), + AutoRoutedApiInfo( + routing_table_api=Api.tool_groups, + router_api=Api.tool_runtime, + ), ] diff --git a/llama_stack/distribution/inspect.py b/llama_stack/distribution/inspect.py index f5716ef5e..dbb16d8ce 100644 --- a/llama_stack/distribution/inspect.py +++ b/llama_stack/distribution/inspect.py @@ -5,12 +5,12 @@ # the root directory of this source tree. from typing import Dict, List -from llama_stack.apis.inspect import * # noqa: F403 + from pydantic import BaseModel +from llama_stack.apis.inspect import HealthInfo, Inspect, ProviderInfo, RouteInfo +from llama_stack.distribution.datatypes import StackRunConfig from llama_stack.distribution.server.endpoints import get_all_api_endpoints -from llama_stack.providers.datatypes import * # noqa: F403 -from llama_stack.distribution.datatypes import * # noqa: F403 class DistributionInspectConfig(BaseModel): diff --git a/llama_stack/distribution/library_client.py b/llama_stack/distribution/library_client.py index 14f62e3a6..48fcc437b 100644 --- a/llama_stack/distribution/library_client.py +++ b/llama_stack/distribution/library_client.py @@ -67,6 +67,7 @@ def in_notebook(): def stream_across_asyncio_run_boundary( async_gen_maker, pool_executor: ThreadPoolExecutor, + path: Optional[str] = None, ) -> Generator[T, None, None]: result_queue = queue.Queue() stop_event = threading.Event() @@ -74,6 +75,7 @@ def stream_across_asyncio_run_boundary( async def consumer(): # make sure we make the generator in the event loop context gen = await async_gen_maker() + await start_trace(path, {"__location__": "library_client"}) try: async for item in await gen: result_queue.put(item) @@ -85,6 +87,7 @@ def stream_across_asyncio_run_boundary( finally: result_queue.put(StopIteration) stop_event.set() + await end_trace() def run_async(): # Run our own loop to avoid double async generator cleanup which is done @@ -186,14 +189,34 @@ class LlamaStackAsLibraryClient(LlamaStackClient): return asyncio.run(self.async_client.initialize()) + def _get_path( + self, + cast_to: Any, + options: Any, + *, + stream=False, + stream_cls=None, + ): + return options.url + def request(self, *args, **kwargs): + path = self._get_path(*args, **kwargs) if kwargs.get("stream"): return stream_across_asyncio_run_boundary( lambda: self.async_client.request(*args, **kwargs), self.pool_executor, + path=path, ) else: - return asyncio.run(self.async_client.request(*args, **kwargs)) + + async def _traced_request(): + await start_trace(path, {"__location__": "library_client"}) + try: + return await self.async_client.request(*args, **kwargs) + finally: + await end_trace() + + return asyncio.run(_traced_request()) class AsyncLlamaStackAsLibraryClient(AsyncLlamaStackClient): @@ -206,7 +229,10 @@ class AsyncLlamaStackAsLibraryClient(AsyncLlamaStackClient): # when using the library client, we should not log to console since many # of our logs are intended for server-side usage - os.environ["TELEMETRY_SINKS"] = "sqlite" + current_sinks = os.environ.get("TELEMETRY_SINKS", "sqlite").split(",") + os.environ["TELEMETRY_SINKS"] = ",".join( + sink for sink in current_sinks if sink != "console" + ) if config_path_or_template_name.endswith(".yaml"): config_path = Path(config_path_or_template_name) @@ -295,41 +321,37 @@ class AsyncLlamaStackAsLibraryClient(AsyncLlamaStackClient): body = options.params or {} body |= options.json_data or {} - await start_trace(path, {"__location__": "library_client"}) - try: - func = self.endpoint_impls.get(path) - if not func: - raise ValueError(f"No endpoint found for {path}") + func = self.endpoint_impls.get(path) + if not func: + raise ValueError(f"No endpoint found for {path}") - body = self._convert_body(path, body) - result = await func(**body) + body = self._convert_body(path, body) + result = await func(**body) - json_content = json.dumps(convert_pydantic_to_json_value(result)) - mock_response = httpx.Response( - status_code=httpx.codes.OK, - content=json_content.encode("utf-8"), - headers={ - "Content-Type": "application/json", - }, - request=httpx.Request( - method=options.method, - url=options.url, - params=options.params, - headers=options.headers, - json=options.json_data, - ), - ) - response = APIResponse( - raw=mock_response, - client=self, - cast_to=cast_to, - options=options, - stream=False, - stream_cls=None, - ) - return response.parse() - finally: - await end_trace() + json_content = json.dumps(convert_pydantic_to_json_value(result)) + mock_response = httpx.Response( + status_code=httpx.codes.OK, + content=json_content.encode("utf-8"), + headers={ + "Content-Type": "application/json", + }, + request=httpx.Request( + method=options.method, + url=options.url, + params=options.params, + headers=options.headers, + json=options.json_data, + ), + ) + response = APIResponse( + raw=mock_response, + client=self, + cast_to=cast_to, + options=options, + stream=False, + stream_cls=None, + ) + return response.parse() async def _call_streaming( self, @@ -341,51 +363,47 @@ class AsyncLlamaStackAsLibraryClient(AsyncLlamaStackClient): path = options.url body = options.params or {} body |= options.json_data or {} - await start_trace(path, {"__location__": "library_client"}) - try: - func = self.endpoint_impls.get(path) - if not func: - raise ValueError(f"No endpoint found for {path}") + func = self.endpoint_impls.get(path) + if not func: + raise ValueError(f"No endpoint found for {path}") - body = self._convert_body(path, body) + body = self._convert_body(path, body) - async def gen(): - async for chunk in await func(**body): - data = json.dumps(convert_pydantic_to_json_value(chunk)) - sse_event = f"data: {data}\n\n" - yield sse_event.encode("utf-8") + async def gen(): + async for chunk in await func(**body): + data = json.dumps(convert_pydantic_to_json_value(chunk)) + sse_event = f"data: {data}\n\n" + yield sse_event.encode("utf-8") - mock_response = httpx.Response( - status_code=httpx.codes.OK, - content=gen(), - headers={ - "Content-Type": "application/json", - }, - request=httpx.Request( - method=options.method, - url=options.url, - params=options.params, - headers=options.headers, - json=options.json_data, - ), - ) + mock_response = httpx.Response( + status_code=httpx.codes.OK, + content=gen(), + headers={ + "Content-Type": "application/json", + }, + request=httpx.Request( + method=options.method, + url=options.url, + params=options.params, + headers=options.headers, + json=options.json_data, + ), + ) - # we use asynchronous impl always internally and channel all requests to AsyncLlamaStackClient - # however, the top-level caller may be a SyncAPIClient -- so its stream_cls might be a Stream (SyncStream) - # so we need to convert it to AsyncStream - args = get_args(stream_cls) - stream_cls = AsyncStream[args[0]] - response = AsyncAPIResponse( - raw=mock_response, - client=self, - cast_to=cast_to, - options=options, - stream=True, - stream_cls=stream_cls, - ) - return await response.parse() - finally: - await end_trace() + # we use asynchronous impl always internally and channel all requests to AsyncLlamaStackClient + # however, the top-level caller may be a SyncAPIClient -- so its stream_cls might be a Stream (SyncStream) + # so we need to convert it to AsyncStream + args = get_args(stream_cls) + stream_cls = AsyncStream[args[0]] + response = AsyncAPIResponse( + raw=mock_response, + client=self, + cast_to=cast_to, + options=options, + stream=True, + stream_cls=stream_cls, + ) + return await response.parse() def _convert_body(self, path: str, body: Optional[dict] = None) -> dict: if not body: diff --git a/llama_stack/distribution/resolver.py b/llama_stack/distribution/resolver.py index 4541b01eb..0a6eed345 100644 --- a/llama_stack/distribution/resolver.py +++ b/llama_stack/distribution/resolver.py @@ -6,14 +6,10 @@ import importlib import inspect -from typing import Any, Dict, List, Set - - -from llama_stack.providers.datatypes import * # noqa: F403 -from llama_stack.distribution.datatypes import * # noqa: F403 - import logging +from typing import Any, Dict, List, Set + from llama_stack.apis.agents import Agents from llama_stack.apis.datasetio import DatasetIO from llama_stack.apis.datasets import Datasets @@ -30,11 +26,34 @@ from llama_stack.apis.scoring import Scoring from llama_stack.apis.scoring_functions import ScoringFunctions from llama_stack.apis.shields import Shields from llama_stack.apis.telemetry import Telemetry +from llama_stack.apis.tools import ToolGroups, ToolRuntime from llama_stack.distribution.client import get_client_impl + +from llama_stack.distribution.datatypes import ( + AutoRoutedProviderSpec, + Provider, + RoutingTableProviderSpec, + StackRunConfig, +) from llama_stack.distribution.distribution import builtin_automatically_routed_apis from llama_stack.distribution.store import DistributionRegistry from llama_stack.distribution.utils.dynamic import instantiate_class_type +from llama_stack.providers.datatypes import ( + Api, + DatasetsProtocolPrivate, + EvalTasksProtocolPrivate, + InlineProviderSpec, + MemoryBanksProtocolPrivate, + ModelsProtocolPrivate, + ProviderSpec, + RemoteProviderConfig, + RemoteProviderSpec, + ScoringFunctionsProtocolPrivate, + ShieldsProtocolPrivate, + ToolsProtocolPrivate, +) + log = logging.getLogger(__name__) @@ -60,12 +79,15 @@ def api_protocol_map() -> Dict[Api, Any]: Api.eval: Eval, Api.eval_tasks: EvalTasks, Api.post_training: PostTraining, + Api.tool_groups: ToolGroups, + Api.tool_runtime: ToolRuntime, } def additional_protocols_map() -> Dict[Api, Any]: return { Api.inference: (ModelsProtocolPrivate, Models, Api.models), + Api.tool_groups: (ToolsProtocolPrivate, ToolGroups, Api.tool_groups), Api.memory: (MemoryBanksProtocolPrivate, MemoryBanks, Api.memory_banks), Api.safety: (ShieldsProtocolPrivate, Shields, Api.shields), Api.datasetio: (DatasetsProtocolPrivate, Datasets, Api.datasets), diff --git a/llama_stack/distribution/routers/__init__.py b/llama_stack/distribution/routers/__init__.py index 57e81ac30..f19a2bffc 100644 --- a/llama_stack/distribution/routers/__init__.py +++ b/llama_stack/distribution/routers/__init__.py @@ -4,11 +4,12 @@ # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import Any +from typing import Any, Dict -from llama_stack.distribution.datatypes import * # noqa: F403 +from llama_stack.distribution.datatypes import RoutedProtocol from llama_stack.distribution.store import DistributionRegistry +from llama_stack.providers.datatypes import Api, RoutingTable from .routing_tables import ( DatasetsRoutingTable, @@ -17,6 +18,7 @@ from .routing_tables import ( ModelsRoutingTable, ScoringFunctionsRoutingTable, ShieldsRoutingTable, + ToolGroupsRoutingTable, ) @@ -33,6 +35,7 @@ async def get_routing_table_impl( "datasets": DatasetsRoutingTable, "scoring_functions": ScoringFunctionsRoutingTable, "eval_tasks": EvalTasksRoutingTable, + "tool_groups": ToolGroupsRoutingTable, } if api.value not in api_to_tables: @@ -51,6 +54,7 @@ async def get_auto_router_impl(api: Api, routing_table: RoutingTable, _deps) -> MemoryRouter, SafetyRouter, ScoringRouter, + ToolRuntimeRouter, ) api_to_routers = { @@ -60,6 +64,7 @@ async def get_auto_router_impl(api: Api, routing_table: RoutingTable, _deps) -> "datasetio": DatasetIORouter, "scoring": ScoringRouter, "eval": EvalRouter, + "tool_runtime": ToolRuntimeRouter, } if api.value not in api_to_routers: raise ValueError(f"API {api.value} not found in router map") diff --git a/llama_stack/distribution/routers/routers.py b/llama_stack/distribution/routers/routers.py index 586ebfae4..84ef467eb 100644 --- a/llama_stack/distribution/routers/routers.py +++ b/llama_stack/distribution/routers/routers.py @@ -6,15 +6,40 @@ from typing import Any, AsyncGenerator, Dict, List, Optional -from llama_stack.apis.datasetio.datasetio import DatasetIO +from llama_stack.apis.common.content_types import InterleavedContent +from llama_stack.apis.datasetio import DatasetIO, PaginatedRowsResult +from llama_stack.apis.eval import ( + AppEvalTaskConfig, + Eval, + EvalTaskConfig, + EvaluateResponse, + Job, + JobStatus, +) +from llama_stack.apis.inference import ( + EmbeddingsResponse, + Inference, + LogProbConfig, + Message, + ResponseFormat, + SamplingParams, + ToolChoice, + ToolDefinition, + ToolPromptFormat, +) +from llama_stack.apis.memory import Memory, MemoryBankDocument, QueryDocumentsResponse from llama_stack.apis.memory_banks.memory_banks import BankParams -from llama_stack.distribution.datatypes import RoutingTable -from llama_stack.apis.memory import * # noqa: F403 -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 -from llama_stack.apis.datasetio import * # noqa: F403 -from llama_stack.apis.scoring import * # noqa: F403 -from llama_stack.apis.eval import * # noqa: F403 +from llama_stack.apis.models import ModelType +from llama_stack.apis.safety import RunShieldResponse, Safety +from llama_stack.apis.scoring import ( + ScoreBatchResponse, + ScoreResponse, + Scoring, + ScoringFnParams, +) +from llama_stack.apis.shields import Shield +from llama_stack.apis.tools import Tool, ToolGroupDef, ToolRuntime +from llama_stack.providers.datatypes import RoutingTable class MemoryRouter(Memory): @@ -329,7 +354,6 @@ class EvalRouter(Eval): task_config=task_config, ) - @webmethod(route="/eval/evaluate_rows", method="POST") async def evaluate_rows( self, task_id: str, @@ -372,3 +396,28 @@ class EvalRouter(Eval): task_id, job_id, ) + + +class ToolRuntimeRouter(ToolRuntime): + def __init__( + self, + routing_table: RoutingTable, + ) -> None: + self.routing_table = routing_table + + async def initialize(self) -> None: + pass + + async def shutdown(self) -> None: + pass + + async def invoke_tool(self, tool_name: str, args: Dict[str, Any]) -> Any: + return await self.routing_table.get_provider_impl(tool_name).invoke_tool( + tool_name=tool_name, + args=args, + ) + + async def discover_tools(self, tool_group: ToolGroupDef) -> List[Tool]: + return await self.routing_table.get_provider_impl( + tool_group.name + ).discover_tools(tool_group) diff --git a/llama_stack/distribution/routers/routing_tables.py b/llama_stack/distribution/routers/routing_tables.py index ecf47a054..ab1becfdd 100644 --- a/llama_stack/distribution/routers/routing_tables.py +++ b/llama_stack/distribution/routers/routing_tables.py @@ -8,19 +8,40 @@ from typing import Any, Dict, List, Optional from pydantic import parse_obj_as -from llama_models.llama3.api.datatypes import * # noqa: F403 - -from llama_stack.apis.models import * # noqa: F403 -from llama_stack.apis.shields import * # noqa: F403 -from llama_stack.apis.memory_banks import * # noqa: F403 -from llama_stack.apis.datasets import * # noqa: F403 -from llama_stack.apis.eval_tasks import * # noqa: F403 - from llama_stack.apis.common.content_types import URL - from llama_stack.apis.common.type_system import ParamType +from llama_stack.apis.datasets import Dataset, Datasets +from llama_stack.apis.eval_tasks import EvalTask, EvalTasks +from llama_stack.apis.memory_banks import ( + BankParams, + MemoryBank, + MemoryBanks, + MemoryBankType, +) +from llama_stack.apis.models import Model, Models, ModelType +from llama_stack.apis.resource import ResourceType +from llama_stack.apis.scoring_functions import ( + ScoringFn, + ScoringFnParams, + ScoringFunctions, +) +from llama_stack.apis.shields import Shield, Shields +from llama_stack.apis.tools import ( + MCPToolGroupDef, + Tool, + ToolGroup, + ToolGroupDef, + ToolGroups, + UserDefinedToolGroupDef, +) +from llama_stack.distribution.datatypes import ( + RoutableObject, + RoutableObjectWithProvider, + RoutedProtocol, +) + from llama_stack.distribution.store import DistributionRegistry -from llama_stack.distribution.datatypes import * # noqa: F403 +from llama_stack.providers.datatypes import Api, RoutingTable def get_impl_api(p: Any) -> Api: @@ -45,6 +66,8 @@ async def register_object_with_provider(obj: RoutableObject, p: Any) -> Routable return await p.register_scoring_function(obj) elif api == Api.eval: return await p.register_eval_task(obj) + elif api == Api.tool_runtime: + return await p.register_tool(obj) else: raise ValueError(f"Unknown API {api} for registering object with provider") @@ -57,6 +80,8 @@ async def unregister_object_from_provider(obj: RoutableObject, p: Any) -> None: return await p.unregister_model(obj.identifier) elif api == Api.datasetio: return await p.unregister_dataset(obj.identifier) + elif api == Api.tool_runtime: + return await p.unregister_tool(obj.identifier) else: raise ValueError(f"Unregister not supported for {api}") @@ -104,6 +129,8 @@ class CommonRoutingTableImpl(RoutingTable): await add_objects(scoring_functions, pid, ScoringFn) elif api == Api.eval: p.eval_task_store = self + elif api == Api.tool_runtime: + p.tool_store = self async def shutdown(self) -> None: for p in self.impls_by_provider_id.values(): @@ -125,6 +152,8 @@ class CommonRoutingTableImpl(RoutingTable): return ("Scoring", "scoring_function") elif isinstance(self, EvalTasksRoutingTable): return ("Eval", "eval_task") + elif isinstance(self, ToolGroupsRoutingTable): + return ("Tools", "tool") else: raise ValueError("Unknown routing table type") @@ -461,3 +490,88 @@ class EvalTasksRoutingTable(CommonRoutingTableImpl, EvalTasks): provider_resource_id=provider_eval_task_id, ) await self.register_object(eval_task) + + +class ToolGroupsRoutingTable(CommonRoutingTableImpl, ToolGroups): + async def list_tools(self, tool_group_id: Optional[str] = None) -> List[Tool]: + tools = await self.get_all_with_type("tool") + if tool_group_id: + tools = [tool for tool in tools if tool.tool_group == tool_group_id] + return tools + + async def list_tool_groups(self) -> List[ToolGroup]: + return await self.get_all_with_type("tool_group") + + async def get_tool_group(self, tool_group_id: str) -> ToolGroup: + return await self.get_object_by_identifier("tool_group", tool_group_id) + + async def get_tool(self, tool_name: str) -> Tool: + return await self.get_object_by_identifier("tool", tool_name) + + async def register_tool_group( + self, + tool_group_id: str, + tool_group: ToolGroupDef, + provider_id: Optional[str] = None, + ) -> None: + tools = [] + tool_defs = [] + if provider_id is None: + if len(self.impls_by_provider_id.keys()) > 1: + raise ValueError( + f"No provider_id specified and multiple providers available. Please specify a provider_id. Available providers: {', '.join(self.impls_by_provider_id.keys())}" + ) + provider_id = list(self.impls_by_provider_id.keys())[0] + + if isinstance(tool_group, MCPToolGroupDef): + tool_defs = await self.impls_by_provider_id[provider_id].discover_tools( + tool_group + ) + + elif isinstance(tool_group, UserDefinedToolGroupDef): + tool_defs = tool_group.tools + else: + raise ValueError(f"Unknown tool group: {tool_group}") + + for tool_def in tool_defs: + tools.append( + Tool( + identifier=tool_def.name, + tool_group=tool_group_id, + description=tool_def.description, + parameters=tool_def.parameters, + provider_id=provider_id, + tool_prompt_format=tool_def.tool_prompt_format, + provider_resource_id=tool_def.name, + metadata=tool_def.metadata, + ) + ) + for tool in tools: + existing_tool = await self.get_tool(tool.identifier) + # Compare existing and new object if one exists + if existing_tool: + existing_dict = existing_tool.model_dump() + new_dict = tool.model_dump() + + if existing_dict != new_dict: + raise ValueError( + f"Object {tool.identifier} already exists in registry. Please use a different identifier." + ) + await self.register_object(tool) + + await self.dist_registry.register( + ToolGroup( + identifier=tool_group_id, + provider_id=provider_id, + provider_resource_id=tool_group_id, + ) + ) + + async def unregister_tool_group(self, tool_group_id: str) -> None: + tool_group = await self.get_tool_group(tool_group_id) + if tool_group is None: + raise ValueError(f"Tool group {tool_group_id} not found") + tools = await self.list_tools(tool_group_id) + for tool in tools: + await self.unregister_object(tool) + await self.unregister_object(tool_group) diff --git a/llama_stack/distribution/server/server.py b/llama_stack/distribution/server/server.py index 8f24f3eaf..daaf8475b 100644 --- a/llama_stack/distribution/server/server.py +++ b/llama_stack/distribution/server/server.py @@ -28,14 +28,9 @@ from pydantic import BaseModel, ValidationError from termcolor import cprint from typing_extensions import Annotated -from llama_stack.distribution.distribution import builtin_automatically_routed_apis +from llama_stack.distribution.datatypes import StackRunConfig -from llama_stack.providers.utils.telemetry.tracing import ( - end_trace, - setup_logger, - start_trace, -) -from llama_stack.distribution.datatypes import * # noqa: F403 +from llama_stack.distribution.distribution import builtin_automatically_routed_apis from llama_stack.distribution.request_headers import set_request_provider_data from llama_stack.distribution.resolver import InvalidProviderError from llama_stack.distribution.stack import ( @@ -43,11 +38,19 @@ from llama_stack.distribution.stack import ( replace_env_vars, validate_env_pair, ) + +from llama_stack.providers.datatypes import Api from llama_stack.providers.inline.telemetry.meta_reference.config import TelemetryConfig from llama_stack.providers.inline.telemetry.meta_reference.telemetry import ( TelemetryAdapter, ) +from llama_stack.providers.utils.telemetry.tracing import ( + end_trace, + setup_logger, + start_trace, +) + from .endpoints import get_all_api_endpoints diff --git a/llama_stack/distribution/stack.py b/llama_stack/distribution/stack.py index f5180b0db..965df5f03 100644 --- a/llama_stack/distribution/stack.py +++ b/llama_stack/distribution/stack.py @@ -8,32 +8,31 @@ import logging import os import re from pathlib import Path -from typing import Any, Dict +from typing import Any, Dict, Optional import pkg_resources import yaml from termcolor import colored -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.agents import * # noqa: F403 -from llama_stack.apis.datasets import * # noqa: F403 -from llama_stack.apis.datasetio import * # noqa: F403 -from llama_stack.apis.scoring import * # noqa: F403 -from llama_stack.apis.scoring_functions import * # noqa: F403 -from llama_stack.apis.eval import * # noqa: F403 -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.batch_inference import * # noqa: F403 -from llama_stack.apis.memory import * # noqa: F403 -from llama_stack.apis.telemetry import * # noqa: F403 -from llama_stack.apis.post_training import * # noqa: F403 -from llama_stack.apis.synthetic_data_generation import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 -from llama_stack.apis.models import * # noqa: F403 -from llama_stack.apis.memory_banks import * # noqa: F403 -from llama_stack.apis.shields import * # noqa: F403 -from llama_stack.apis.inspect import * # noqa: F403 -from llama_stack.apis.eval_tasks import * # noqa: F403 +from llama_stack.apis.agents import Agents +from llama_stack.apis.batch_inference import BatchInference +from llama_stack.apis.datasetio import DatasetIO +from llama_stack.apis.datasets import Datasets +from llama_stack.apis.eval import Eval +from llama_stack.apis.eval_tasks import EvalTasks +from llama_stack.apis.inference import Inference +from llama_stack.apis.inspect import Inspect +from llama_stack.apis.memory import Memory +from llama_stack.apis.memory_banks import MemoryBanks +from llama_stack.apis.models import Models +from llama_stack.apis.post_training import PostTraining +from llama_stack.apis.safety import Safety +from llama_stack.apis.scoring import Scoring +from llama_stack.apis.scoring_functions import ScoringFunctions +from llama_stack.apis.shields import Shields +from llama_stack.apis.synthetic_data_generation import SyntheticDataGeneration +from llama_stack.apis.telemetry import Telemetry from llama_stack.distribution.datatypes import StackRunConfig from llama_stack.distribution.distribution import get_provider_registry diff --git a/llama_stack/distribution/store/registry.py b/llama_stack/distribution/store/registry.py index f98c14443..686054dd2 100644 --- a/llama_stack/distribution/store/registry.py +++ b/llama_stack/distribution/store/registry.py @@ -13,11 +13,8 @@ import pydantic from llama_stack.distribution.datatypes import KVStoreConfig, RoutableObjectWithProvider from llama_stack.distribution.utils.config_dirs import DISTRIBS_BASE_DIR -from llama_stack.providers.utils.kvstore import ( - KVStore, - kvstore_impl, - SqliteKVStoreConfig, -) +from llama_stack.providers.utils.kvstore import KVStore, kvstore_impl +from llama_stack.providers.utils.kvstore.config import SqliteKVStoreConfig class DistributionRegistry(Protocol): diff --git a/llama_stack/distribution/store/tests/test_registry.py b/llama_stack/distribution/store/tests/test_registry.py index 7e389cccd..54bc04f9c 100644 --- a/llama_stack/distribution/store/tests/test_registry.py +++ b/llama_stack/distribution/store/tests/test_registry.py @@ -8,11 +8,14 @@ import os import pytest import pytest_asyncio -from llama_stack.distribution.store import * # noqa F403 from llama_stack.apis.inference import Model from llama_stack.apis.memory_banks import VectorMemoryBank + +from llama_stack.distribution.store.registry import ( + CachedDiskDistributionRegistry, + DiskDistributionRegistry, +) from llama_stack.providers.utils.kvstore import kvstore_impl, SqliteKVStoreConfig -from llama_stack.distribution.datatypes import * # noqa F403 @pytest.fixture diff --git a/llama_stack/providers/datatypes.py b/llama_stack/providers/datatypes.py index c506a754c..ce0c9f52e 100644 --- a/llama_stack/providers/datatypes.py +++ b/llama_stack/providers/datatypes.py @@ -17,6 +17,7 @@ from llama_stack.apis.memory_banks.memory_banks import MemoryBank from llama_stack.apis.models import Model from llama_stack.apis.scoring_functions import ScoringFn from llama_stack.apis.shields import Shield +from llama_stack.apis.tools import Tool @json_schema_type @@ -29,6 +30,7 @@ class Api(Enum): scoring = "scoring" eval = "eval" post_training = "post_training" + tool_runtime = "tool_runtime" telemetry = "telemetry" @@ -38,6 +40,7 @@ class Api(Enum): datasets = "datasets" scoring_functions = "scoring_functions" eval_tasks = "eval_tasks" + tool_groups = "tool_groups" # built-in API inspect = "inspect" @@ -75,6 +78,12 @@ class EvalTasksProtocolPrivate(Protocol): async def register_eval_task(self, eval_task: EvalTask) -> None: ... +class ToolsProtocolPrivate(Protocol): + async def register_tool(self, tool: Tool) -> None: ... + + async def unregister_tool(self, tool_id: str) -> None: ... + + @json_schema_type class ProviderSpec(BaseModel): api: Api diff --git a/llama_stack/providers/inline/agents/meta_reference/agent_instance.py b/llama_stack/providers/inline/agents/meta_reference/agent_instance.py index d7930550d..f225f5393 100644 --- a/llama_stack/providers/inline/agents/meta_reference/agent_instance.py +++ b/llama_stack/providers/inline/agents/meta_reference/agent_instance.py @@ -13,19 +13,64 @@ import secrets import string import uuid from datetime import datetime -from typing import AsyncGenerator, List, Tuple +from typing import AsyncGenerator, Dict, List, Optional, Tuple from urllib.parse import urlparse import httpx +from llama_models.llama3.api.datatypes import BuiltinTool -from llama_stack.apis.agents import * # noqa: F403 -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.memory import * # noqa: F403 -from llama_stack.apis.memory_banks import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 +from llama_stack.apis.agents import ( + AgentConfig, + AgentTool, + AgentTurnCreateRequest, + AgentTurnResponseEvent, + AgentTurnResponseEventType, + AgentTurnResponseStepCompletePayload, + AgentTurnResponseStepProgressPayload, + AgentTurnResponseStepStartPayload, + AgentTurnResponseStreamChunk, + AgentTurnResponseTurnCompletePayload, + AgentTurnResponseTurnStartPayload, + Attachment, + CodeInterpreterToolDefinition, + FunctionCallToolDefinition, + InferenceStep, + MemoryRetrievalStep, + MemoryToolDefinition, + PhotogenToolDefinition, + SearchToolDefinition, + ShieldCallStep, + StepType, + ToolExecutionStep, + Turn, + WolframAlphaToolDefinition, +) -from llama_stack.apis.common.content_types import InterleavedContent, TextContentItem +from llama_stack.apis.common.content_types import ( + InterleavedContent, + TextContentItem, + URL, +) +from llama_stack.apis.inference import ( + ChatCompletionResponseEventType, + CompletionMessage, + Inference, + Message, + SamplingParams, + StopReason, + SystemMessage, + ToolCallDelta, + ToolCallParseStatus, + ToolChoice, + ToolDefinition, + ToolResponse, + ToolResponseMessage, + UserMessage, +) +from llama_stack.apis.memory import Memory, MemoryBankDocument, QueryDocumentsResponse +from llama_stack.apis.memory_banks import MemoryBanks, VectorMemoryBankParams +from llama_stack.apis.safety import Safety from llama_stack.providers.utils.kvstore import KVStore from llama_stack.providers.utils.memory.vector_store import concat_interleaved_content diff --git a/llama_stack/providers/inline/agents/meta_reference/agents.py b/llama_stack/providers/inline/agents/meta_reference/agents.py index dec5ec960..93bfab5f4 100644 --- a/llama_stack/providers/inline/agents/meta_reference/agents.py +++ b/llama_stack/providers/inline/agents/meta_reference/agents.py @@ -9,15 +9,26 @@ import logging import shutil import tempfile import uuid -from typing import AsyncGenerator +from typing import AsyncGenerator, List, Optional, Union from termcolor import colored -from llama_stack.apis.inference import Inference +from llama_stack.apis.agents import ( + AgentConfig, + AgentCreateResponse, + Agents, + AgentSessionCreateResponse, + AgentStepResponse, + AgentTurnCreateRequest, + Attachment, + Session, + Turn, +) + +from llama_stack.apis.inference import Inference, ToolResponseMessage, UserMessage from llama_stack.apis.memory import Memory from llama_stack.apis.memory_banks import MemoryBanks from llama_stack.apis.safety import Safety -from llama_stack.apis.agents import * # noqa: F403 from llama_stack.providers.utils.kvstore import InmemoryKVStoreImpl, kvstore_impl diff --git a/llama_stack/providers/inline/agents/meta_reference/persistence.py b/llama_stack/providers/inline/agents/meta_reference/persistence.py index 1c99e3d75..a4b1af616 100644 --- a/llama_stack/providers/inline/agents/meta_reference/persistence.py +++ b/llama_stack/providers/inline/agents/meta_reference/persistence.py @@ -10,9 +10,11 @@ import uuid from datetime import datetime from typing import List, Optional -from llama_stack.apis.agents import * # noqa: F403 + from pydantic import BaseModel +from llama_stack.apis.agents import Turn + from llama_stack.providers.utils.kvstore import KVStore log = logging.getLogger(__name__) diff --git a/llama_stack/providers/inline/agents/meta_reference/rag/context_retriever.py b/llama_stack/providers/inline/agents/meta_reference/rag/context_retriever.py index 7b5c8b4b0..74eb91c53 100644 --- a/llama_stack/providers/inline/agents/meta_reference/rag/context_retriever.py +++ b/llama_stack/providers/inline/agents/meta_reference/rag/context_retriever.py @@ -7,8 +7,6 @@ from typing import List from jinja2 import Template -from llama_models.llama3.api import * # noqa: F403 - from llama_stack.apis.agents import ( DefaultMemoryQueryGeneratorConfig, @@ -16,7 +14,7 @@ from llama_stack.apis.agents import ( MemoryQueryGenerator, MemoryQueryGeneratorConfig, ) -from llama_stack.apis.inference import * # noqa: F403 +from llama_stack.apis.inference import Message, UserMessage from llama_stack.providers.utils.inference.prompt_adapter import ( interleaved_content_as_str, ) diff --git a/llama_stack/providers/inline/agents/meta_reference/safety.py b/llama_stack/providers/inline/agents/meta_reference/safety.py index 8fca4d310..90d193f90 100644 --- a/llama_stack/providers/inline/agents/meta_reference/safety.py +++ b/llama_stack/providers/inline/agents/meta_reference/safety.py @@ -9,7 +9,9 @@ import logging from typing import List -from llama_stack.apis.safety import * # noqa: F403 +from llama_stack.apis.inference import Message + +from llama_stack.apis.safety import Safety, SafetyViolation, ViolationLevel log = logging.getLogger(__name__) diff --git a/llama_stack/providers/inline/agents/meta_reference/tests/test_chat_agent.py b/llama_stack/providers/inline/agents/meta_reference/tests/test_chat_agent.py index 6edef0672..035054320 100644 --- a/llama_stack/providers/inline/agents/meta_reference/tests/test_chat_agent.py +++ b/llama_stack/providers/inline/agents/meta_reference/tests/test_chat_agent.py @@ -8,10 +8,26 @@ from typing import AsyncIterator, List, Optional, Union import pytest -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.memory import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 -from llama_stack.apis.agents import * # noqa: F403 +from llama_stack.apis.agents import ( + AgentConfig, + AgentTurnCreateRequest, + AgentTurnResponseTurnCompletePayload, +) + +from llama_stack.apis.inference import ( + ChatCompletionResponse, + ChatCompletionResponseEvent, + ChatCompletionResponseStreamChunk, + CompletionMessage, + Message, + ResponseFormat, + SamplingParams, + ToolChoice, + ToolDefinition, + UserMessage, +) +from llama_stack.apis.memory import MemoryBank +from llama_stack.apis.safety import RunShieldResponse from ..agents import ( AGENT_INSTANCES_BY_ID, diff --git a/llama_stack/providers/inline/agents/meta_reference/tools/safety.py b/llama_stack/providers/inline/agents/meta_reference/tools/safety.py index 1ffc99edd..a34649756 100644 --- a/llama_stack/providers/inline/agents/meta_reference/tools/safety.py +++ b/llama_stack/providers/inline/agents/meta_reference/tools/safety.py @@ -7,7 +7,7 @@ from typing import List from llama_stack.apis.inference import Message -from llama_stack.apis.safety import * # noqa: F403 +from llama_stack.apis.safety import Safety from ..safety import ShieldRunnerMixin from .builtin import BaseTool diff --git a/llama_stack/providers/inline/datasetio/localfs/config.py b/llama_stack/providers/inline/datasetio/localfs/config.py index 58d563c99..1b89df63b 100644 --- a/llama_stack/providers/inline/datasetio/localfs/config.py +++ b/llama_stack/providers/inline/datasetio/localfs/config.py @@ -3,7 +3,7 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from llama_stack.apis.datasetio import * # noqa: F401, F403 +from pydantic import BaseModel class LocalFSDatasetIOConfig(BaseModel): ... diff --git a/llama_stack/providers/inline/datasetio/localfs/datasetio.py b/llama_stack/providers/inline/datasetio/localfs/datasetio.py index 736e5d8b9..442053fb3 100644 --- a/llama_stack/providers/inline/datasetio/localfs/datasetio.py +++ b/llama_stack/providers/inline/datasetio/localfs/datasetio.py @@ -3,18 +3,19 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import Any, Dict, List, Optional - -import pandas -from llama_models.llama3.api.datatypes import * # noqa: F403 - -from llama_stack.apis.datasetio import * # noqa: F403 import base64 import os from abc import ABC, abstractmethod from dataclasses import dataclass +from typing import Any, Dict, List, Optional from urllib.parse import urlparse +import pandas + +from llama_stack.apis.common.content_types import URL +from llama_stack.apis.datasetio import DatasetIO, PaginatedRowsResult +from llama_stack.apis.datasets import Dataset + from llama_stack.providers.datatypes import DatasetsProtocolPrivate from llama_stack.providers.utils.datasetio.url_utils import get_dataframe_from_url diff --git a/llama_stack/providers/inline/eval/meta_reference/eval.py b/llama_stack/providers/inline/eval/meta_reference/eval.py index e1af29e12..65ecd8280 100644 --- a/llama_stack/providers/inline/eval/meta_reference/eval.py +++ b/llama_stack/providers/inline/eval/meta_reference/eval.py @@ -8,6 +8,11 @@ from typing import Any, Dict, List, Optional from tqdm import tqdm from llama_stack.apis.agents import Agents +from llama_stack.apis.common.type_system import ( + ChatCompletionInputType, + CompletionInputType, + StringType, +) from llama_stack.apis.datasetio import DatasetIO from llama_stack.apis.datasets import Datasets from llama_stack.apis.eval_tasks import EvalTask diff --git a/llama_stack/providers/inline/inference/meta_reference/config.py b/llama_stack/providers/inline/inference/meta_reference/config.py index 33af33fcd..2c46ef596 100644 --- a/llama_stack/providers/inline/inference/meta_reference/config.py +++ b/llama_stack/providers/inline/inference/meta_reference/config.py @@ -6,11 +6,10 @@ from typing import Any, Dict, Optional -from llama_models.datatypes import * # noqa: F403 - -from llama_stack.apis.inference import * # noqa: F401, F403 from pydantic import BaseModel, field_validator +from llama_stack.apis.inference import QuantizationConfig + from llama_stack.providers.utils.inference import supported_inference_models diff --git a/llama_stack/providers/inline/inference/meta_reference/generation.py b/llama_stack/providers/inline/inference/meta_reference/generation.py index c89183cb7..1807e4ad5 100644 --- a/llama_stack/providers/inline/inference/meta_reference/generation.py +++ b/llama_stack/providers/inline/inference/meta_reference/generation.py @@ -32,11 +32,16 @@ from llama_models.llama3.reference_impl.multimodal.model import ( CrossAttentionTransformer, ) from llama_models.sku_list import resolve_model -from pydantic import BaseModel - -from llama_stack.apis.inference import * # noqa: F403 from lmformatenforcer import JsonSchemaParser, TokenEnforcer, TokenEnforcerTokenizerData +from pydantic import BaseModel + +from llama_stack.apis.inference import ( + Fp8QuantizationConfig, + Int4QuantizationConfig, + ResponseFormat, + ResponseFormatType, +) from llama_stack.distribution.utils.model_utils import model_local_dir from llama_stack.providers.utils.inference.prompt_adapter import ( @@ -44,12 +49,7 @@ from llama_stack.providers.utils.inference.prompt_adapter import ( CompletionRequestWithRawContent, ) -from .config import ( - Fp8QuantizationConfig, - Int4QuantizationConfig, - MetaReferenceInferenceConfig, - MetaReferenceQuantizedInferenceConfig, -) +from .config import MetaReferenceInferenceConfig, MetaReferenceQuantizedInferenceConfig log = logging.getLogger(__name__) diff --git a/llama_stack/providers/inline/inference/meta_reference/model_parallel.py b/llama_stack/providers/inline/inference/meta_reference/model_parallel.py index cb422b9b6..97384f4bb 100644 --- a/llama_stack/providers/inline/inference/meta_reference/model_parallel.py +++ b/llama_stack/providers/inline/inference/meta_reference/model_parallel.py @@ -14,7 +14,10 @@ from llama_models.llama3.api.datatypes import Model from llama_models.llama3.api.tokenizer import Tokenizer from llama_models.sku_list import resolve_model -from llama_stack.apis.inference import ChatCompletionRequest, CompletionRequest +from llama_stack.providers.utils.inference.prompt_adapter import ( + ChatCompletionRequestWithRawContent, + CompletionRequestWithRawContent, +) from .config import MetaReferenceInferenceConfig from .generation import Llama, model_checkpoint_dir @@ -27,9 +30,9 @@ class ModelRunner: # the `task` object is the same that is sent to `ModelParallelProcessGroup.run_inference()` def __call__(self, req: Any): - if isinstance(req, ChatCompletionRequest): + if isinstance(req, ChatCompletionRequestWithRawContent): return self.llama.chat_completion(req) - elif isinstance(req, CompletionRequest): + elif isinstance(req, CompletionRequestWithRawContent): return self.llama.completion(req) else: raise ValueError(f"Unexpected task type {type(req)}") @@ -100,7 +103,7 @@ class LlamaModelParallelGenerator: def completion( self, - request: CompletionRequest, + request: CompletionRequestWithRawContent, ) -> Generator: req_obj = deepcopy(request) gen = self.group.run_inference(req_obj) @@ -108,7 +111,7 @@ class LlamaModelParallelGenerator: def chat_completion( self, - request: ChatCompletionRequest, + request: ChatCompletionRequestWithRawContent, ) -> Generator: req_obj = deepcopy(request) gen = self.group.run_inference(req_obj) diff --git a/llama_stack/providers/inline/inference/meta_reference/parallel_utils.py b/llama_stack/providers/inline/inference/meta_reference/parallel_utils.py index 830160578..36720612c 100644 --- a/llama_stack/providers/inline/inference/meta_reference/parallel_utils.py +++ b/llama_stack/providers/inline/inference/meta_reference/parallel_utils.py @@ -34,7 +34,10 @@ from pydantic import BaseModel, Field from torch.distributed.launcher.api import elastic_launch, LaunchConfig from typing_extensions import Annotated -from llama_stack.apis.inference import ChatCompletionRequest, CompletionRequest +from llama_stack.providers.utils.inference.prompt_adapter import ( + ChatCompletionRequestWithRawContent, + CompletionRequestWithRawContent, +) from .generation import TokenResult @@ -79,7 +82,7 @@ class TaskRequest(BaseModel): type: Literal[ProcessingMessageName.task_request] = ( ProcessingMessageName.task_request ) - task: Union[CompletionRequest, ChatCompletionRequest] + task: Union[CompletionRequestWithRawContent, ChatCompletionRequestWithRawContent] class TaskResponse(BaseModel): @@ -264,9 +267,6 @@ def launch_dist_group( init_model_cb: Callable, **kwargs, ) -> None: - id = uuid.uuid4().hex - dist_url = f"file:///tmp/llama3_{id}_{time.time()}" - with tempfile.TemporaryDirectory() as tmpdir: # TODO: track workers and if they terminate, tell parent process about it so cleanup can happen launch_config = LaunchConfig( @@ -315,7 +315,7 @@ def start_model_parallel_process( # wait until the model is loaded; rank 0 will send a message to indicate it's ready request_socket.send(encode_msg(ReadyRequest())) - response = request_socket.recv() + _response = request_socket.recv() log.info("Loaded model...") return request_socket, process @@ -349,7 +349,10 @@ class ModelParallelProcessGroup: self.started = False def run_inference( - self, req: Union[CompletionRequest, ChatCompletionRequest] + self, + req: Union[ + CompletionRequestWithRawContent, ChatCompletionRequestWithRawContent + ], ) -> Generator: assert not self.running, "inference already running" diff --git a/llama_stack/providers/inline/inference/vllm/vllm.py b/llama_stack/providers/inline/inference/vllm/vllm.py index c5925774b..73f7adecd 100644 --- a/llama_stack/providers/inline/inference/vllm/vllm.py +++ b/llama_stack/providers/inline/inference/vllm/vllm.py @@ -7,10 +7,10 @@ import logging import os import uuid -from typing import AsyncGenerator, Optional +from typing import AsyncGenerator, List, Optional from llama_models.llama3.api.chat_format import ChatFormat -from llama_models.llama3.api.datatypes import * # noqa: F403 + from llama_models.llama3.api.tokenizer import Tokenizer from llama_models.sku_list import resolve_model @@ -18,9 +18,26 @@ from vllm.engine.arg_utils import AsyncEngineArgs from vllm.engine.async_llm_engine import AsyncLLMEngine from vllm.sampling_params import SamplingParams as VLLMSamplingParams -from llama_stack.apis.inference import * # noqa: F403 +from llama_stack.apis.common.content_types import InterleavedContent +from llama_stack.apis.inference import ( + ChatCompletionRequest, + ChatCompletionResponse, + ChatCompletionResponseStreamChunk, + CompletionResponse, + CompletionResponseStreamChunk, + EmbeddingsResponse, + Inference, + LogProbConfig, + Message, + ResponseFormat, + SamplingParams, + ToolChoice, + ToolDefinition, + ToolPromptFormat, +) +from llama_stack.apis.models import Model -from llama_stack.providers.datatypes import Model, ModelsProtocolPrivate +from llama_stack.providers.datatypes import ModelsProtocolPrivate from llama_stack.providers.utils.inference.openai_compat import ( OpenAICompatCompletionChoice, OpenAICompatCompletionResponse, diff --git a/llama_stack/providers/inline/memory/faiss/faiss.py b/llama_stack/providers/inline/memory/faiss/faiss.py index a46b151d9..af398801a 100644 --- a/llama_stack/providers/inline/memory/faiss/faiss.py +++ b/llama_stack/providers/inline/memory/faiss/faiss.py @@ -16,11 +16,14 @@ import faiss import numpy as np from numpy.typing import NDArray -from llama_models.llama3.api.datatypes import * # noqa: F403 - -from llama_stack.apis.memory import * # noqa: F403 from llama_stack.apis.inference import InterleavedContent -from llama_stack.apis.memory_banks import MemoryBankType, VectorMemoryBank +from llama_stack.apis.memory import ( + Chunk, + Memory, + MemoryBankDocument, + QueryDocumentsResponse, +) +from llama_stack.apis.memory_banks import MemoryBank, MemoryBankType, VectorMemoryBank from llama_stack.providers.datatypes import Api, MemoryBanksProtocolPrivate from llama_stack.providers.utils.kvstore import kvstore_impl from llama_stack.providers.utils.memory.vector_store import ( diff --git a/llama_stack/providers/inline/post_training/torchtune/common/utils.py b/llama_stack/providers/inline/post_training/torchtune/common/utils.py index 462cbc21e..f2a2edae5 100644 --- a/llama_stack/providers/inline/post_training/torchtune/common/utils.py +++ b/llama_stack/providers/inline/post_training/torchtune/common/utils.py @@ -14,11 +14,10 @@ from enum import Enum from typing import Any, Callable, Dict, List import torch -from llama_stack.apis.datasets import Datasets -from llama_stack.apis.common.type_system import * # noqa from llama_models.datatypes import Model from llama_models.sku_list import resolve_model -from llama_stack.apis.common.type_system import ParamType +from llama_stack.apis.common.type_system import ParamType, StringType +from llama_stack.apis.datasets import Datasets from torchtune.models.llama3 import llama3_tokenizer, lora_llama3_8b from torchtune.models.llama3._tokenizer import Llama3Tokenizer diff --git a/llama_stack/providers/inline/post_training/torchtune/post_training.py b/llama_stack/providers/inline/post_training/torchtune/post_training.py index 9b1269f16..90fbf7026 100644 --- a/llama_stack/providers/inline/post_training/torchtune/post_training.py +++ b/llama_stack/providers/inline/post_training/torchtune/post_training.py @@ -3,11 +3,26 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. +from datetime import datetime +from typing import Any, Dict, List, Optional + +from llama_models.schema_utils import webmethod + from llama_stack.apis.datasetio import DatasetIO +from llama_stack.apis.datasets import Datasets +from llama_stack.apis.post_training import ( + AlgorithmConfig, + DPOAlignmentConfig, + JobStatus, + LoraFinetuningConfig, + PostTrainingJob, + PostTrainingJobArtifactsResponse, + PostTrainingJobStatusResponse, + TrainingConfig, +) from llama_stack.providers.inline.post_training.torchtune.config import ( TorchtunePostTrainingConfig, ) -from llama_stack.apis.post_training import * # noqa from llama_stack.providers.inline.post_training.torchtune.recipes.lora_finetuning_single_device import ( LoraFinetuningSingleDevice, ) diff --git a/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py b/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py index 7f1547657..517be6d89 100644 --- a/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py +++ b/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py @@ -14,27 +14,33 @@ from typing import Any, Dict, List, Optional, Tuple import torch from llama_models.sku_list import resolve_model +from llama_stack.apis.common.training_types import PostTrainingMetric from llama_stack.apis.datasetio import DatasetIO +from llama_stack.apis.datasets import Datasets +from llama_stack.apis.post_training import ( + AlgorithmConfig, + Checkpoint, + LoraFinetuningConfig, + OptimizerConfig, + TrainingConfig, +) from llama_stack.distribution.utils.config_dirs import DEFAULT_CHECKPOINT_DIR -from llama_stack.providers.inline.post_training.torchtune.common.checkpointer import ( - TorchtuneCheckpointer, -) -from torch import nn -from torchtune import utils as torchtune_utils -from torchtune.training.metric_logging import DiskLogger -from tqdm import tqdm -from llama_stack.apis.post_training import * # noqa + from llama_stack.distribution.utils.model_utils import model_local_dir from llama_stack.providers.inline.post_training.torchtune.common import utils +from llama_stack.providers.inline.post_training.torchtune.common.checkpointer import ( + TorchtuneCheckpointer, +) from llama_stack.providers.inline.post_training.torchtune.config import ( TorchtunePostTrainingConfig, ) from llama_stack.providers.inline.post_training.torchtune.datasets.sft import SFTDataset +from torch import nn from torch.optim import Optimizer from torch.utils.data import DataLoader, DistributedSampler -from torchtune import modules, training +from torchtune import modules, training, utils as torchtune_utils from torchtune.data import AlpacaToMessages, padded_collate_sft from torchtune.modules.loss import CEWithChunkedOutputLoss @@ -43,11 +49,12 @@ from torchtune.modules.peft import ( get_adapter_state_dict, get_lora_module_names, get_merged_lora_ckpt, - load_dora_magnitudes, set_trainable_params, validate_missing_and_unexpected_for_lora, ) from torchtune.training.lr_schedulers import get_cosine_schedule_with_warmup +from torchtune.training.metric_logging import DiskLogger +from tqdm import tqdm log = logging.getLogger(__name__) @@ -110,6 +117,10 @@ class LoraFinetuningSingleDevice: self.checkpoint_dir = config.checkpoint_dir else: model = resolve_model(self.model_id) + if model is None: + raise ValueError( + f"{self.model_id} not found. Your model id should be in the llama models SKU list" + ) self.checkpoint_dir = model_checkpoint_dir(model) self._output_dir = str(DEFAULT_CHECKPOINT_DIR) @@ -277,7 +288,6 @@ class LoraFinetuningSingleDevice: for m in model.modules(): if hasattr(m, "initialize_dora_magnitude"): m.initialize_dora_magnitude() - load_dora_magnitudes(model) if lora_weights_state_dict: lora_missing, lora_unexpected = model.load_state_dict( lora_weights_state_dict, strict=False diff --git a/llama_stack/providers/inline/safety/code_scanner/code_scanner.py b/llama_stack/providers/inline/safety/code_scanner/code_scanner.py index 46b5e57da..87d68f74c 100644 --- a/llama_stack/providers/inline/safety/code_scanner/code_scanner.py +++ b/llama_stack/providers/inline/safety/code_scanner/code_scanner.py @@ -7,8 +7,14 @@ import logging from typing import Any, Dict, List -from llama_stack.apis.safety import * # noqa: F403 from llama_stack.apis.inference import Message +from llama_stack.apis.safety import ( + RunShieldResponse, + Safety, + SafetyViolation, + ViolationLevel, +) +from llama_stack.apis.shields import Shield from llama_stack.providers.utils.inference.prompt_adapter import ( interleaved_content_as_str, ) diff --git a/llama_stack/providers/inline/safety/llama_guard/llama_guard.py b/llama_stack/providers/inline/safety/llama_guard/llama_guard.py index bbdd5c3df..00213ac83 100644 --- a/llama_stack/providers/inline/safety/llama_guard/llama_guard.py +++ b/llama_stack/providers/inline/safety/llama_guard/llama_guard.py @@ -9,10 +9,24 @@ import re from string import Template from typing import Any, Dict, List, Optional -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 +from llama_models.datatypes import CoreModelId +from llama_models.llama3.api.datatypes import Role + from llama_stack.apis.common.content_types import ImageContentItem, TextContentItem +from llama_stack.apis.inference import ( + ChatCompletionResponseEventType, + Inference, + Message, + UserMessage, +) +from llama_stack.apis.safety import ( + RunShieldResponse, + Safety, + SafetyViolation, + ViolationLevel, +) + +from llama_stack.apis.shields import Shield from llama_stack.distribution.datatypes import Api from llama_stack.providers.datatypes import ShieldsProtocolPrivate diff --git a/llama_stack/providers/inline/safety/prompt_guard/prompt_guard.py b/llama_stack/providers/inline/safety/prompt_guard/prompt_guard.py index 4cb34127f..3f30645bd 100644 --- a/llama_stack/providers/inline/safety/prompt_guard/prompt_guard.py +++ b/llama_stack/providers/inline/safety/prompt_guard/prompt_guard.py @@ -11,11 +11,16 @@ import torch from transformers import AutoModelForSequenceClassification, AutoTokenizer -from llama_stack.distribution.utils.model_utils import model_local_dir -from llama_stack.apis.inference import * # noqa: F403 -from llama_stack.apis.safety import * # noqa: F403 -from llama_models.llama3.api.datatypes import * # noqa: F403 +from llama_stack.apis.inference import Message +from llama_stack.apis.safety import ( + RunShieldResponse, + Safety, + SafetyViolation, + ViolationLevel, +) +from llama_stack.apis.shields import Shield +from llama_stack.distribution.utils.model_utils import model_local_dir from llama_stack.providers.datatypes import ShieldsProtocolPrivate from llama_stack.providers.utils.inference.prompt_adapter import ( interleaved_content_as_str, diff --git a/llama_stack/providers/inline/scoring/basic/scoring.py b/llama_stack/providers/inline/scoring/basic/scoring.py index 921ad4b26..ad1f159c8 100644 --- a/llama_stack/providers/inline/scoring/basic/scoring.py +++ b/llama_stack/providers/inline/scoring/basic/scoring.py @@ -3,14 +3,17 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import List +from typing import Any, Dict, List, Optional -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.scoring import * # noqa: F403 -from llama_stack.apis.scoring_functions import * # noqa: F403 -from llama_stack.apis.common.type_system import * # noqa: F403 -from llama_stack.apis.datasetio import * # noqa: F403 -from llama_stack.apis.datasets import * # noqa: F403 +from llama_stack.apis.datasetio import DatasetIO +from llama_stack.apis.datasets import Datasets +from llama_stack.apis.scoring import ( + ScoreBatchResponse, + ScoreResponse, + Scoring, + ScoringResult, +) +from llama_stack.apis.scoring_functions import ScoringFn, ScoringFnParams from llama_stack.providers.datatypes import ScoringFunctionsProtocolPrivate from llama_stack.providers.utils.common.data_schema_validator_mixin import ( DataSchemaValidatorMixin, diff --git a/llama_stack/providers/inline/scoring/braintrust/braintrust.py b/llama_stack/providers/inline/scoring/braintrust/braintrust.py index 53e3e7ea6..56fb26ef2 100644 --- a/llama_stack/providers/inline/scoring/braintrust/braintrust.py +++ b/llama_stack/providers/inline/scoring/braintrust/braintrust.py @@ -3,21 +3,24 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from typing import List - -from llama_models.llama3.api.datatypes import * # noqa: F403 -from llama_stack.apis.scoring import * # noqa: F403 -from llama_stack.apis.scoring_functions import * # noqa: F403 -from llama_stack.apis.common.type_system import * # noqa: F403 -from llama_stack.apis.datasetio import * # noqa: F403 -from llama_stack.apis.datasets import * # noqa: F403 - import os +from typing import Any, Dict, List, Optional from autoevals.llm import Factuality from autoevals.ragas import AnswerCorrectness from pydantic import BaseModel +from llama_stack.apis.datasetio import DatasetIO +from llama_stack.apis.datasets import Datasets +from llama_stack.apis.scoring import ( + ScoreBatchResponse, + ScoreResponse, + Scoring, + ScoringResult, + ScoringResultRow, +) +from llama_stack.apis.scoring_functions import AggregationFunctionType, ScoringFn + from llama_stack.distribution.request_headers import NeedsRequestProviderData from llama_stack.providers.datatypes import ScoringFunctionsProtocolPrivate from llama_stack.providers.utils.common.data_schema_validator_mixin import ( diff --git a/llama_stack/providers/inline/scoring/braintrust/config.py b/llama_stack/providers/inline/scoring/braintrust/config.py index e12249432..d4e0d9bcd 100644 --- a/llama_stack/providers/inline/scoring/braintrust/config.py +++ b/llama_stack/providers/inline/scoring/braintrust/config.py @@ -3,7 +3,9 @@ # # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. -from llama_stack.apis.scoring import * # noqa: F401, F403 +from typing import Any, Dict, Optional + +from pydantic import BaseModel, Field class BraintrustScoringConfig(BaseModel): diff --git a/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py b/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py index d7229f508..81dd9910d 100644 --- a/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py +++ b/llama_stack/providers/inline/telemetry/meta_reference/telemetry.py @@ -17,6 +17,22 @@ from opentelemetry.sdk.trace import TracerProvider from opentelemetry.sdk.trace.export import BatchSpanProcessor from opentelemetry.semconv.resource import ResourceAttributes +from llama_stack.apis.telemetry import ( + Event, + MetricEvent, + QueryCondition, + SpanEndPayload, + SpanStartPayload, + SpanStatus, + SpanWithStatus, + StructuredLogEvent, + Telemetry, + Trace, + UnstructuredLogEvent, +) + +from llama_stack.distribution.datatypes import Api + from llama_stack.providers.inline.telemetry.meta_reference.console_span_processor import ( ConsoleSpanProcessor, ) @@ -27,10 +43,6 @@ from llama_stack.providers.inline.telemetry.meta_reference.sqlite_span_processor from llama_stack.providers.utils.telemetry.dataset_mixin import TelemetryDatasetMixin from llama_stack.providers.utils.telemetry.sqlite_trace_store import SQLiteTraceStore -from llama_stack.apis.telemetry import * # noqa: F403 - -from llama_stack.distribution.datatypes import Api - from .config import TelemetryConfig, TelemetrySink _GLOBAL_STORAGE = { diff --git a/llama_stack/providers/inline/telemetry/sample/sample.py b/llama_stack/providers/inline/telemetry/sample/sample.py index eaa6d834a..f07a185ef 100644 --- a/llama_stack/providers/inline/telemetry/sample/sample.py +++ b/llama_stack/providers/inline/telemetry/sample/sample.py @@ -4,12 +4,10 @@ # This source code is licensed under the terms described in the LICENSE file in # the root directory of this source tree. +from llama_stack.apis.telemetry import Telemetry from .config import SampleConfig -from llama_stack.apis.telemetry import * # noqa: F403 - - class SampleTelemetryImpl(Telemetry): def __init__(self, config: SampleConfig): self.config = config diff --git a/llama_stack/providers/inline/tool_runtime/brave_search/__init__.py b/llama_stack/providers/inline/tool_runtime/brave_search/__init__.py new file mode 100644 index 000000000..e9f0eeae8 --- /dev/null +++ b/llama_stack/providers/inline/tool_runtime/brave_search/__init__.py @@ -0,0 +1,20 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. +# All rights reserved. +# +# This source code is licensed under the terms described in the LICENSE file in +# the root directory of this source tree. + +from pydantic import BaseModel + +from .brave_search import BraveSearchToolRuntimeImpl +from .config import BraveSearchToolConfig + + +class BraveSearchToolProviderDataValidator(BaseModel): + api_key: str + + +async def get_provider_impl(config: BraveSearchToolConfig, _deps): + impl = BraveSearchToolRuntimeImpl(config) + await impl.initialize() + return impl diff --git a/llama_stack/providers/inline/tool_runtime/brave_search/brave_search.py b/llama_stack/providers/inline/tool_runtime/brave_search/brave_search.py new file mode 100644 index 000000000..ca0141552 --- /dev/null +++ b/llama_stack/providers/inline/tool_runtime/brave_search/brave_search.py @@ -0,0 +1,123 @@ +# Copyright (c) Meta Platforms, Inc. and affiliates. +# All rights reserved. +# +# This source code is licensed under the terms described in the LICENSE file in +# the root directory of this source tree. + +from typing import Any, Dict, List + +import requests + +from llama_stack.apis.tools import Tool, ToolGroupDef, ToolInvocationResult, ToolRuntime +from llama_stack.distribution.request_headers import NeedsRequestProviderData +from llama_stack.providers.datatypes import ToolsProtocolPrivate + +from .config import BraveSearchToolConfig + + +class BraveSearchToolRuntimeImpl( + ToolsProtocolPrivate, ToolRuntime, NeedsRequestProviderData +): + def __init__(self, config: BraveSearchToolConfig): + self.config = config + + async def initialize(self): + pass + + async def register_tool(self, tool: Tool): + if tool.identifier != "brave_search": + raise ValueError(f"Tool identifier {tool.identifier} is not supported") + + async def unregister_tool(self, tool_id: str) -> None: + return + + def _get_api_key(self) -> str: + if self.config.api_key: + return self.config.api_key + + provider_data = self.get_request_provider_data() + if provider_data is None or not provider_data.api_key: + raise ValueError( + 'Pass Search provider\'s API Key in the header X-LlamaStack-ProviderData as { "api_key":