
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Embedding Models

## Quick Start

```python
from litellm import embedding
import os
os.environ['OPENAI_API_KEY'] = ""
response = embedding(model='text-embedding-ada-002', input=["good morning from litellm"])
```

## Proxy Usage

**NOTE** For `vertex_ai`, set your credentials first:

```shell
export GOOGLE_APPLICATION_CREDENTIALS="absolute/path/to/service_account.json"
```

### Add model to config

```yaml
model_list:
- model_name: textembedding-gecko
  litellm_params:
    model: vertex_ai/textembedding-gecko

general_settings:
  master_key: sk-1234
```

### Start proxy

```shell
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000
```

### Test

<Tabs>
<TabItem value="curl" label="Curl">

```shell
curl --location 'http://0.0.0.0:4000/embeddings' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{"input": ["Academia.edu uses"], "model": "textembedding-gecko", "encoding_format": "base64"}'
```

</TabItem>
<TabItem value="openai" label="OpenAI (python)">

```python
from openai import OpenAI
client = OpenAI(
  api_key="sk-1234",
  base_url="http://0.0.0.0:4000"
)

client.embeddings.create(
  model="textembedding-gecko",
  input="The food was delicious and the waiter...",
  encoding_format="float"
)
```

</TabItem>
<TabItem value="langchain" label="Langchain Embeddings">

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="textembedding-gecko", openai_api_base="http://0.0.0.0:4000", openai_api_key="sk-1234")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("VERTEX AI EMBEDDINGS")
print(query_result[:5])
```

</TabItem>
</Tabs>

## Image Embeddings

For models that support image embeddings, you can pass a base64-encoded image string in the `input` param (see the encoding sketch after the examples below).

<Tabs>
<TabItem value="sdk" label="LiteLLM SDK">

```python
from litellm import embedding
import os

# set your api key
os.environ["COHERE_API_KEY"] = ""

response = embedding(model="cohere/embed-english-v3.0", input=["<base64 encoded image>"])
```

</TabItem>
<TabItem value="proxy" label="LiteLLM Proxy">

1. Setup config.yaml

```yaml
model_list:
  - model_name: cohere-embed
    litellm_params:
      model: cohere/embed-english-v3.0
      api_key: os.environ/COHERE_API_KEY
```

2. Start proxy

```shell
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000
```

3. Test it!

```shell
curl -X POST 'http://0.0.0.0:4000/v1/embeddings' \
-H 'Authorization: Bearer sk-54d77cd67b9febbb' \
-H 'Content-Type: application/json' \
-d '{
  "model": "cohere/embed-english-v3.0",
  "input": ["<base64 encoded image>"]
}'
```

</TabItem>
</Tabs>
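If you need to produce the base64 string yourself, a minimal sketch using Python's standard library (`image.jpg` is a placeholder path; check your provider's docs for whether a raw base64 string or a `data:` URI is expected):

```python
import base64

# "image.jpg" is a placeholder - point this at your own image file
with open("image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

# pass the encoded string as an element of `input`, e.g.
# response = embedding(model="cohere/embed-english-v3.0", input=[base64_image])
```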

## Input Params for `litellm.embedding()`

:::info

Any non-OpenAI params will be treated as provider-specific params and sent in the request body as kwargs to the provider.

See Reserved Params

See Example

:::

### Required Fields

- `model`: *string* - ID of the model to use. `model='text-embedding-ada-002'`

- `input`: *string or array* - Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or an array of token arrays. The input must not exceed the max input tokens for the model (8192 tokens for `text-embedding-ada-002`), cannot be an empty string, and any array must be 2048 dimensions or less.

```python
input=["good morning from litellm"]
```

### Optional LiteLLM Fields

- `user`: *string (optional)* - A unique identifier representing your end-user.

- `dimensions`: *integer (optional)* - The number of dimensions the resulting output embeddings should have. Only supported in OpenAI/Azure `text-embedding-3` and later models.

- `encoding_format`: *string (optional)* - The format to return the embeddings in. Can be either `"float"` or `"base64"`. Defaults to `encoding_format="float"`.

- `timeout`: *integer (optional)* - The maximum time, in seconds, to wait for the API to respond. Defaults to 600 seconds (10 minutes).

- `api_base`: *string (optional)* - The API endpoint you want to call the model with.

- `api_version`: *string (optional)* - (Azure-specific) The API version for the call.

- `api_key`: *string (optional)* - The API key to authenticate and authorize requests. If not provided, the default API key is used.

- `api_type`: *string (optional)* - The type of API to use.
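For example, a sketch passing several of these optional fields in one call (the values are illustrative):

```python
from litellm import embedding
import os

os.environ["OPENAI_API_KEY"] = ""

response = embedding(
    model="text-embedding-3-small",
    input=["good morning from litellm"],
    dimensions=256,           # only honored by text-embedding-3 and later models
    encoding_format="float",  # or "base64"
    timeout=60,               # give up if the API takes longer than 60 seconds
    user="user-1234",         # a stable identifier for your end-user
)
```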

## Output from `litellm.embedding()`

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.0022326677571982145,
        0.010749882087111473,
        ...
        ...
        ...
      ]
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}
```
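To pull the vector and token counts out of this response, a minimal sketch (it assumes dict-style access on litellm's response object, which mirrors the JSON shape above):

```python
vector = response["data"][0]["embedding"]  # the embedding vector itself
print(len(vector))                         # dimensionality of the returned vector
print(response["usage"]["prompt_tokens"])  # tokens counted for the input
```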

## OpenAI Embedding Models

### Usage

```python
from litellm import embedding
import os
os.environ['OPENAI_API_KEY'] = ""
response = embedding(
    model="text-embedding-3-small",
    input=["good morning from litellm", "this is another item"],
    metadata={"anything": "good day"},
    dimensions=5 # Only supported in text-embedding-3 and later models.
)
```

| Model Name | Function Call | Required OS Variables |
|---|---|---|
| text-embedding-3-small | `embedding('text-embedding-3-small', input)` | `os.environ['OPENAI_API_KEY']` |
| text-embedding-3-large | `embedding('text-embedding-3-large', input)` | `os.environ['OPENAI_API_KEY']` |
| text-embedding-ada-002 | `embedding('text-embedding-ada-002', input)` | `os.environ['OPENAI_API_KEY']` |

## Azure OpenAI Embedding Models

### API keys

These can be set as environment variables or passed as params to `litellm.embedding()`:

```python
import os
os.environ['AZURE_API_KEY'] = ""
os.environ['AZURE_API_BASE'] = ""
os.environ['AZURE_API_VERSION'] = ""
```

### Usage

```python
import os
from litellm import embedding

# read the values set above
api_key = os.environ["AZURE_API_KEY"]
api_base = os.environ["AZURE_API_BASE"]
api_version = os.environ["AZURE_API_VERSION"]

response = embedding(
    model="azure/<your deployment name>",
    input=["good morning from litellm"],
    api_key=api_key,
    api_base=api_base,
    api_version=api_version,
)
print(response)
```

| Model Name | Function Call |
|---|---|
| text-embedding-ada-002 | `embedding(model="azure/<your deployment name>", input=input)` |

h/t to Mikko for this integration

## OpenAI Compatible Embedding Models

Use this for calling `/embeddings` endpoints on OpenAI-compatible servers, e.g. https://github.com/xorbitsai/inference.

**Note: add the `openai/` prefix to the model so litellm knows to route to OpenAI**

### Usage

```python
from litellm import embedding
response = embedding(
  model="openai/<your-llm-name>",    # add `openai/` prefix to model so litellm knows to route to OpenAI
  api_base="http://0.0.0.0:4000/",   # set API Base of your Custom OpenAI Endpoint
  input=["good morning from litellm"]
)
```
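The same routing works on the proxy. A sketch of a config.yaml entry following the pattern used elsewhere in this doc (`my-custom-embed` and `CUSTOM_API_KEY` are placeholder names):

```yaml
model_list:
  - model_name: my-custom-embed            # placeholder alias clients will request
    litellm_params:
      model: openai/<your-llm-name>        # keep the `openai/` prefix
      api_base: http://0.0.0.0:8000/       # your OpenAI-compatible server
      api_key: os.environ/CUSTOM_API_KEY   # placeholder env var
```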

## Bedrock Embedding

### API keys

These can be set as environment variables or passed as params to `litellm.embedding()`:

```python
import os
os.environ["AWS_ACCESS_KEY_ID"] = ""      # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = ""  # Secret access key
os.environ["AWS_REGION_NAME"] = ""        # us-east-1, us-east-2, us-west-1, us-west-2
```

### Usage

```python
from litellm import embedding
response = embedding(
    model="amazon.titan-embed-text-v1",
    input=["good morning from litellm"],
)
print(response)
```

| Model Name | Function Call |
|---|---|
| Titan Embeddings - G1 | `embedding(model="amazon.titan-embed-text-v1", input=input)` |
| Cohere Embeddings - English | `embedding(model="cohere.embed-english-v3", input=input)` |
| Cohere Embeddings - Multilingual | `embedding(model="cohere.embed-multilingual-v3", input=input)` |
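If you route Bedrock through the proxy, a sketch of a config.yaml entry following the pattern used elsewhere in this doc — it assumes the `bedrock/` provider prefix, and `bedrock-embed` is a placeholder alias:

```yaml
model_list:
  - model_name: bedrock-embed                      # placeholder alias
    litellm_params:
      model: bedrock/amazon.titan-embed-text-v1    # assumed `bedrock/` provider prefix
      aws_region_name: os.environ/AWS_REGION_NAME
```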

## Cohere Embedding Models

https://docs.cohere.com/reference/embed

### Usage

```python
from litellm import embedding
import os

os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document" # optional param for v3 llms
)
```

| Model Name | Function Call |
|---|---|
| embed-english-v3.0 | `embedding(model="embed-english-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-english-light-v3.0 | `embedding(model="embed-english-light-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-multilingual-v3.0 | `embedding(model="embed-multilingual-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-multilingual-light-v3.0 | `embedding(model="embed-multilingual-light-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-english-v2.0 | `embedding(model="embed-english-v2.0", input=["good morning from litellm", "this is another item"])` |
| embed-english-light-v2.0 | `embedding(model="embed-english-light-v2.0", input=["good morning from litellm", "this is another item"])` |
| embed-multilingual-v2.0 | `embedding(model="embed-multilingual-v2.0", input=["good morning from litellm", "this is another item"])` |

## HuggingFace Embedding Models

LiteLLM supports all Feature-Extraction + Sentence Similarity Embedding models: https://huggingface.co/models?pipeline_tag=feature-extraction

### Usage

```python
from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model='huggingface/microsoft/codebert-base',
    input=["good morning from litellm"]
)
```

### Usage - Set input_type

LiteLLM infers the input type (`feature-extraction` or `sentence-similarity`) by making a GET request to the API base.

Override this by setting `input_type` yourself:

```python
from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model='huggingface/microsoft/codebert-base',
    input=["good morning from litellm", "you are a good bot"],
    api_base = "https://p69xlsj6rpno5drq.us-east-1.aws.endpoints.huggingface.cloud",
    input_type="sentence-similarity"
)
```

### Usage - Custom API Base

```python
from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model='huggingface/microsoft/codebert-base',
    input=["good morning from litellm"],
    api_base = "https://p69xlsj6rpno5drq.us-east-1.aws.endpoints.huggingface.cloud"
)
```

| Model Name | Function Call | Required OS Variables |
|---|---|---|
| microsoft/codebert-base | `embedding('huggingface/microsoft/codebert-base', input=input)` | `os.environ['HUGGINGFACE_API_KEY']` |
| BAAI/bge-large-zh | `embedding('huggingface/BAAI/bge-large-zh', input=input)` | `os.environ['HUGGINGFACE_API_KEY']` |
| any-hf-embedding-model | `embedding('huggingface/hf-embedding-model', input=input)` | `os.environ['HUGGINGFACE_API_KEY']` |

## Mistral AI Embedding Models

All models listed here https://docs.mistral.ai/platform/endpoints are supported.

### Usage

```python
from litellm import embedding
import os

os.environ['MISTRAL_API_KEY'] = ""
response = embedding(
    model="mistral/mistral-embed",
    input=["good morning from litellm"],
)
print(response)
```

| Model Name | Function Call |
|---|---|
| mistral-embed | `embedding(model="mistral/mistral-embed", input)` |

## Vertex AI Embedding Models

### Usage - Embedding

```python
import litellm
from litellm import embedding
litellm.vertex_project = "hardy-device-38811" # Your Project ID
litellm.vertex_location = "us-central1"       # proj location

response = embedding(
    model="vertex_ai/textembedding-gecko",
    input=["good morning from litellm"],
)
print(response)
```

### Supported Models

All models listed here are supported

| Model Name | Function Call |
|---|---|
| textembedding-gecko | `embedding(model="vertex_ai/textembedding-gecko", input)` |
| textembedding-gecko-multilingual | `embedding(model="vertex_ai/textembedding-gecko-multilingual", input)` |
| textembedding-gecko-multilingual@001 | `embedding(model="vertex_ai/textembedding-gecko-multilingual@001", input)` |
| textembedding-gecko@001 | `embedding(model="vertex_ai/textembedding-gecko@001", input)` |
| textembedding-gecko@003 | `embedding(model="vertex_ai/textembedding-gecko@003", input)` |
| text-embedding-preview-0409 | `embedding(model="vertex_ai/text-embedding-preview-0409", input)` |
| text-multilingual-embedding-preview-0409 | `embedding(model="vertex_ai/text-multilingual-embedding-preview-0409", input)` |

## Voyage AI Embedding Models

### Usage - Embedding

```python
from litellm import embedding
import os

os.environ['VOYAGE_API_KEY'] = ""
response = embedding(
    model="voyage/voyage-01",
    input=["good morning from litellm"],
)
print(response)
```

### Supported Models

All models listed here https://docs.voyageai.com/embeddings/#models-and-specifics are supported.

| Model Name | Function Call |
|---|---|
| voyage-01 | `embedding(model="voyage/voyage-01", input)` |
| voyage-lite-01 | `embedding(model="voyage/voyage-lite-01", input)` |
| voyage-lite-01-instruct | `embedding(model="voyage/voyage-lite-01-instruct", input)` |

## Provider-specific Params

:::info

Any non-OpenAI params will be treated as provider-specific params and sent in the request body as kwargs to the provider.

See Reserved Params

:::

### Example

Cohere v3 models have a required parameter, `input_type`; it can be one of the following four values:

- `input_type="search_document"`: (default) Use this for texts (documents) you want to store in your vector database
- `input_type="search_query"`: Use this for search queries to find the most relevant documents in your vector database
- `input_type="classification"`: Use this if you use the embeddings as an input for a classification system
- `input_type="clustering"`: Use this if you use the embeddings for text clustering

https://txt.cohere.com/introducing-embed-v3/

```python
from litellm import embedding
import os

os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document" # 👈 PROVIDER-SPECIFIC PARAM
)
```

**via config**

```yaml
model_list:
  - model_name: "cohere-embed"
    litellm_params:
      model: embed-english-v3.0
      input_type: search_document # 👈 PROVIDER-SPECIFIC PARAM
```

**via request** (`input_type` 👈 is the PROVIDER-SPECIFIC PARAM; JSON cannot carry comments, so it is noted here instead)

```shell
curl -X POST 'http://0.0.0.0:4000/v1/embeddings' \
-H 'Authorization: Bearer sk-54d77cd67b9febbb' \
-H 'Content-Type: application/json' \
-d '{
  "model": "cohere-embed",
  "input": ["Are you authorized to work in United States of America?"],
  "input_type": "search_document"
}'
```