{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "# Pairwise Embedding Distance \n", "\n", "One way to measure the similarity (or dissimilarity) between two predictions on a shared or similar input is to embed the predictions and compute a vector distance between the two embeddings.[[1]](#cite_note-1)\n", "\n", "You can load the `pairwise_embedding_distance` evaluator to do this.\n", "\n", "**Note:** This returns a **distance** score, meaning that the lower the number, the **more** similar the outputs are, according to their embedded representation.\n", "\n", "Check out the reference docs for the [PairwiseEmbeddingDistanceEvalChain](https://api.python.langchain.com/en/latest/evaluation/langchain.evaluation.embedding_distance.base.PairwiseEmbeddingDistanceEvalChain.html#langchain.evaluation.embedding_distance.base.PairwiseEmbeddingDistanceEvalChain) for more info." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain.evaluation import load_evaluator\n", "\n", "evaluator = load_evaluator(\"pairwise_embedding_distance\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'score': 0.0966466944859925}" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "evaluator.evaluate_string_pairs(\n", " prediction=\"Seattle is hot in June\", prediction_b=\"Seattle is cool in June.\"\n", ")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'score': 0.03761174337464557}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "evaluator.evaluate_string_pairs(\n", " prediction=\"Seattle is warm in June\", prediction_b=\"Seattle is cool in June.\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select the Distance Metric\n", "\n", "By default, the evalutor uses cosine distance. You can choose a different distance metric if you'd like. " ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[,\n", " ,\n", " ,\n", " ,\n", " ]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from langchain.evaluation import EmbeddingDistance\n", "\n", "list(EmbeddingDistance)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "tags": [] }, "outputs": [], "source": [ "evaluator = load_evaluator(\n", " \"pairwise_embedding_distance\", distance_metric=EmbeddingDistance.EUCLIDEAN\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select Embeddings to Use\n", "\n", "The constructor uses `OpenAI` embeddings by default, but you can configure this however you want. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "## Select Embeddings to Use\n", "\n", "The constructor uses `OpenAI` embeddings by default, but you can configure this however you want. Below, we use local HuggingFace embeddings." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [] }, "outputs": [], "source": [ "from langchain.embeddings import HuggingFaceEmbeddings\n", "\n", "embedding_model = HuggingFaceEmbeddings()\n", "hf_evaluator = load_evaluator(\"pairwise_embedding_distance\", embeddings=embedding_model)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'score': 0.5486443280477362}" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hf_evaluator.evaluate_string_pairs(\n", " prediction=\"Seattle is hot in June\", prediction_b=\"Seattle is cool in June.\"\n", ")" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "{'score': 0.21018880025138598}" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "hf_evaluator.evaluate_string_pairs(\n", " prediction=\"Seattle is warm in June\", prediction_b=\"Seattle is cool in June.\"\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "<a name=\"cite_note-1\"></a>1. Note: When it comes to semantic similarity, this often gives better results than older string distance metrics (such as those in the `PairwiseStringDistanceEvalChain`), though it tends to be less reliable than evaluators that use the LLM directly (such as the `PairwiseStringEvalChain`)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.2" } }, "nbformat": 4, "nbformat_minor": 4 }