
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Embedding Models

## Quick Start

```python
from litellm import embedding
import os
os.environ['OPENAI_API_KEY'] = ""
response = embedding(model='text-embedding-ada-002', input=["good morning from litellm"])
```

## Proxy Usage

**NOTE** For `vertex_ai`, set your credentials first:

```shell
export GOOGLE_APPLICATION_CREDENTIALS="absolute/path/to/service_account.json"
```

### Add model to config

```yaml
model_list:
- model_name: textembedding-gecko
  litellm_params:
    model: vertex_ai/textembedding-gecko

general_settings:
  master_key: sk-1234
```

### Start proxy

```shell
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000
```

### Test

<Tabs>
<TabItem value="curl" label="Curl">

```shell
curl --location 'http://0.0.0.0:4000/embeddings' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{"input": ["Academia.edu uses"], "model": "textembedding-gecko", "encoding_format": "base64"}'
```

</TabItem>
<TabItem value="openai" label="OpenAI (python)">

```python
from openai import OpenAI
client = OpenAI(
  api_key="sk-1234",
  base_url="http://0.0.0.0:4000"
)

client.embeddings.create(
  model="textembedding-gecko",
  input="The food was delicious and the waiter...",
  encoding_format="float"
)
```

</TabItem>
<TabItem value="langchain" label="Langchain Embeddings">

```python
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="textembedding-gecko", openai_api_base="http://0.0.0.0:4000", openai_api_key="sk-1234")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("VERTEX AI EMBEDDINGS")
print(query_result[:5])
```

</TabItem>
</Tabs>

## Image Embeddings

For models that support image embeddings, you can pass a base64-encoded image string in the `input` param (see the encoding sketch after the examples below).

<Tabs>
<TabItem value="sdk" label="LiteLLM SDK">

```python
from litellm import embedding
import os

# set your api key
os.environ["COHERE_API_KEY"] = ""

response = embedding(model="cohere/embed-english-v3.0", input=["<base64 encoded image>"])
```

</TabItem>
<TabItem value="proxy" label="LiteLLM Proxy">

1. Setup config.yaml

```yaml
model_list:
  - model_name: cohere-embed
    litellm_params:
      model: cohere/embed-english-v3.0
      api_key: os.environ/COHERE_API_KEY
```

2. Start proxy

```shell
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000
```

3. Test it!

```shell
curl -X POST 'http://0.0.0.0:4000/v1/embeddings' \
-H 'Authorization: Bearer sk-54d77cd67b9febbb' \
-H 'Content-Type: application/json' \
-d '{
  "model": "cohere/embed-english-v3.0",
  "input": ["<base64 encoded image>"]
}'
```

</TabItem>
</Tabs>
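If you need to produce the base64 string yourself, a minimal sketch using Python's standard library (`image.jpg` is a placeholder path; check your provider's docs for whether a raw base64 string or a `data:` URI is expected):

```python
import base64

# "image.jpg" is a placeholder - point this at your own image file
with open("image.jpg", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

# pass the encoded string as an element of `input`, e.g.
# response = embedding(model="cohere/embed-english-v3.0", input=[base64_image])
```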

## Input Params for `litellm.embedding()`

:::info

Any non-OpenAI params will be treated as provider-specific params and sent in the request body as kwargs to the provider.

See Reserved Params

See Example

:::

### Required Fields

- `model`: *string* - ID of the model to use. `model='text-embedding-ada-002'`

- `input`: *string or array* - Input text to embed, encoded as a string or array of tokens. To embed multiple inputs in a single request, pass an array of strings or an array of token arrays. The input must not exceed the max input tokens for the model (8192 tokens for `text-embedding-ada-002`), cannot be an empty string, and any array must be 2048 dimensions or less.

```python
input=["good morning from litellm"]
```

### Optional LiteLLM Fields

- `user`: *string (optional)* - A unique identifier representing your end-user.

- `dimensions`: *integer (optional)* - The number of dimensions the resulting output embeddings should have. Only supported in OpenAI/Azure `text-embedding-3` and later models.

- `encoding_format`: *string (optional)* - The format to return the embeddings in. Can be either `"float"` or `"base64"`. Defaults to `encoding_format="float"`.

- `timeout`: *integer (optional)* - The maximum time, in seconds, to wait for the API to respond. Defaults to 600 seconds (10 minutes).

- `api_base`: *string (optional)* - The API endpoint you want to call the model with.

- `api_version`: *string (optional)* - (Azure-specific) The API version for the call.

- `api_key`: *string (optional)* - The API key to authenticate and authorize requests. If not provided, the default API key is used.

- `api_type`: *string (optional)* - The type of API to use.
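For example, a sketch passing several of these optional fields in one call (the values are illustrative):

```python
from litellm import embedding
import os

os.environ["OPENAI_API_KEY"] = ""

response = embedding(
    model="text-embedding-3-small",
    input=["good morning from litellm"],
    dimensions=256,           # only honored by text-embedding-3 and later models
    encoding_format="float",  # or "base64"
    timeout=60,               # give up if the API takes longer than 60 seconds
    user="user-1234",         # a stable identifier for your end-user
)
```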

## Output from `litellm.embedding()`

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        -0.0022326677571982145,
        0.010749882087111473,
        ...
        ...
        ...
      ]
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}
```
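To pull the vector and token counts out of this response, a minimal sketch (it assumes dict-style access on litellm's response object, which mirrors the JSON shape above):

```python
vector = response["data"][0]["embedding"]  # the embedding vector itself
print(len(vector))                         # dimensionality of the returned vector
print(response["usage"]["prompt_tokens"])  # tokens counted for the input
```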

## OpenAI Embedding Models

### Usage

```python
from litellm import embedding
import os
os.environ['OPENAI_API_KEY'] = ""
response = embedding(
    model="text-embedding-3-small",
    input=["good morning from litellm", "this is another item"],
    metadata={"anything": "good day"},
    dimensions=5 # Only supported in text-embedding-3 and later models.
)
```

| Model Name | Function Call | Required OS Variables |
|---|---|---|
| text-embedding-3-small | `embedding('text-embedding-3-small', input)` | `os.environ['OPENAI_API_KEY']` |
| text-embedding-3-large | `embedding('text-embedding-3-large', input)` | `os.environ['OPENAI_API_KEY']` |
| text-embedding-ada-002 | `embedding('text-embedding-ada-002', input)` | `os.environ['OPENAI_API_KEY']` |

## Azure OpenAI Embedding Models

### API keys

These can be set as environment variables or passed as params to `litellm.embedding()`:

```python
import os
os.environ['AZURE_API_KEY'] = ""
os.environ['AZURE_API_BASE'] = ""
os.environ['AZURE_API_VERSION'] = ""
```

### Usage

```python
import os
from litellm import embedding

# read the values set above
api_key = os.environ["AZURE_API_KEY"]
api_base = os.environ["AZURE_API_BASE"]
api_version = os.environ["AZURE_API_VERSION"]

response = embedding(
    model="azure/<your deployment name>",
    input=["good morning from litellm"],
    api_key=api_key,
    api_base=api_base,
    api_version=api_version,
)
print(response)
```

| Model Name | Function Call |
|---|---|
| text-embedding-ada-002 | `embedding(model="azure/<your deployment name>", input=input)` |

h/t to Mikko for this integration

## OpenAI Compatible Embedding Models

Use this for calling `/embeddings` endpoints on OpenAI-compatible servers, e.g. https://github.com/xorbitsai/inference.

**Note: add the `openai/` prefix to the model so litellm knows to route to OpenAI**

### Usage

```python
from litellm import embedding
response = embedding(
  model="openai/<your-llm-name>",    # add `openai/` prefix to model so litellm knows to route to OpenAI
  api_base="http://0.0.0.0:4000/",   # set API Base of your Custom OpenAI Endpoint
  input=["good morning from litellm"]
)
```
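The same routing works on the proxy. A sketch of a config.yaml entry following the pattern used elsewhere in this doc (`my-custom-embed` and `CUSTOM_API_KEY` are placeholder names):

```yaml
model_list:
  - model_name: my-custom-embed            # placeholder alias clients will request
    litellm_params:
      model: openai/<your-llm-name>        # keep the `openai/` prefix
      api_base: http://0.0.0.0:8000/       # your OpenAI-compatible server
      api_key: os.environ/CUSTOM_API_KEY   # placeholder env var
```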

## Bedrock Embedding

### API keys

These can be set as environment variables or passed as params to `litellm.embedding()`:

```python
import os
os.environ["AWS_ACCESS_KEY_ID"] = ""      # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = ""  # Secret access key
os.environ["AWS_REGION_NAME"] = ""        # us-east-1, us-east-2, us-west-1, us-west-2
```

### Usage

```python
from litellm import embedding
response = embedding(
    model="amazon.titan-embed-text-v1",
    input=["good morning from litellm"],
)
print(response)
```

| Model Name | Function Call |
|---|---|
| Titan Embeddings - G1 | `embedding(model="amazon.titan-embed-text-v1", input=input)` |
| Cohere Embeddings - English | `embedding(model="cohere.embed-english-v3", input=input)` |
| Cohere Embeddings - Multilingual | `embedding(model="cohere.embed-multilingual-v3", input=input)` |
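If you route Bedrock through the proxy, a sketch of a config.yaml entry following the pattern used elsewhere in this doc — it assumes the `bedrock/` provider prefix, and `bedrock-embed` is a placeholder alias:

```yaml
model_list:
  - model_name: bedrock-embed                      # placeholder alias
    litellm_params:
      model: bedrock/amazon.titan-embed-text-v1    # assumed `bedrock/` provider prefix
      aws_region_name: os.environ/AWS_REGION_NAME
```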

## Cohere Embedding Models

https://docs.cohere.com/reference/embed

### Usage

```python
from litellm import embedding
import os

os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document" # optional param for v3 llms
)
```

| Model Name | Function Call |
|---|---|
| embed-english-v3.0 | `embedding(model="embed-english-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-english-light-v3.0 | `embedding(model="embed-english-light-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-multilingual-v3.0 | `embedding(model="embed-multilingual-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-multilingual-light-v3.0 | `embedding(model="embed-multilingual-light-v3.0", input=["good morning from litellm", "this is another item"])` |
| embed-english-v2.0 | `embedding(model="embed-english-v2.0", input=["good morning from litellm", "this is another item"])` |
| embed-english-light-v2.0 | `embedding(model="embed-english-light-v2.0", input=["good morning from litellm", "this is another item"])` |
| embed-multilingual-v2.0 | `embedding(model="embed-multilingual-v2.0", input=["good morning from litellm", "this is another item"])` |

## HuggingFace Embedding Models

LiteLLM supports all Feature-Extraction + Sentence Similarity Embedding models: https://huggingface.co/models?pipeline_tag=feature-extraction

### Usage

```python
from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model='huggingface/microsoft/codebert-base',
    input=["good morning from litellm"]
)
```

### Usage - Set input_type

LiteLLM infers the input type (`feature-extraction` or `sentence-similarity`) by making a GET request to the API base.

Override this by setting `input_type` yourself:

```python
from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model='huggingface/microsoft/codebert-base',
    input=["good morning from litellm", "you are a good bot"],
    api_base = "https://p69xlsj6rpno5drq.us-east-1.aws.endpoints.huggingface.cloud",
    input_type="sentence-similarity"
)
```

### Usage - Custom API Base

```python
from litellm import embedding
import os
os.environ['HUGGINGFACE_API_KEY'] = ""
response = embedding(
    model='huggingface/microsoft/codebert-base',
    input=["good morning from litellm"],
    api_base = "https://p69xlsj6rpno5drq.us-east-1.aws.endpoints.huggingface.cloud"
)
```

| Model Name | Function Call | Required OS Variables |
|---|---|---|
| microsoft/codebert-base | `embedding('huggingface/microsoft/codebert-base', input=input)` | `os.environ['HUGGINGFACE_API_KEY']` |
| BAAI/bge-large-zh | `embedding('huggingface/BAAI/bge-large-zh', input=input)` | `os.environ['HUGGINGFACE_API_KEY']` |
| any-hf-embedding-model | `embedding('huggingface/hf-embedding-model', input=input)` | `os.environ['HUGGINGFACE_API_KEY']` |

## Mistral AI Embedding Models

All models listed here https://docs.mistral.ai/platform/endpoints are supported.

### Usage

```python
from litellm import embedding
import os

os.environ['MISTRAL_API_KEY'] = ""
response = embedding(
    model="mistral/mistral-embed",
    input=["good morning from litellm"],
)
print(response)
```

| Model Name | Function Call |
|---|---|
| mistral-embed | `embedding(model="mistral/mistral-embed", input)` |

## Vertex AI Embedding Models

### Usage - Embedding

```python
import litellm
from litellm import embedding
litellm.vertex_project = "hardy-device-38811" # Your Project ID
litellm.vertex_location = "us-central1"       # proj location

response = embedding(
    model="vertex_ai/textembedding-gecko",
    input=["good morning from litellm"],
)
print(response)
```

### Supported Models

All models listed here are supported

| Model Name | Function Call |
|---|---|
| textembedding-gecko | `embedding(model="vertex_ai/textembedding-gecko", input)` |
| textembedding-gecko-multilingual | `embedding(model="vertex_ai/textembedding-gecko-multilingual", input)` |
| textembedding-gecko-multilingual@001 | `embedding(model="vertex_ai/textembedding-gecko-multilingual@001", input)` |
| textembedding-gecko@001 | `embedding(model="vertex_ai/textembedding-gecko@001", input)` |
| textembedding-gecko@003 | `embedding(model="vertex_ai/textembedding-gecko@003", input)` |
| text-embedding-preview-0409 | `embedding(model="vertex_ai/text-embedding-preview-0409", input)` |
| text-multilingual-embedding-preview-0409 | `embedding(model="vertex_ai/text-multilingual-embedding-preview-0409", input)` |

## Voyage AI Embedding Models

### Usage - Embedding

```python
from litellm import embedding
import os

os.environ['VOYAGE_API_KEY'] = ""
response = embedding(
    model="voyage/voyage-01",
    input=["good morning from litellm"],
)
print(response)
```

### Supported Models

All models listed here https://docs.voyageai.com/embeddings/#models-and-specifics are supported.

| Model Name | Function Call |
|---|---|
| voyage-01 | `embedding(model="voyage/voyage-01", input)` |
| voyage-lite-01 | `embedding(model="voyage/voyage-lite-01", input)` |
| voyage-lite-01-instruct | `embedding(model="voyage/voyage-lite-01-instruct", input)` |

## Provider-specific Params

:::info

Any non-OpenAI params will be treated as provider-specific params and sent in the request body as kwargs to the provider.

See Reserved Params

:::

### Example

Cohere v3 models have a required parameter, `input_type`; it can be one of the following four values:

- `input_type="search_document"`: (default) Use this for texts (documents) you want to store in your vector database
- `input_type="search_query"`: Use this for search queries to find the most relevant documents in your vector database
- `input_type="classification"`: Use this if you use the embeddings as an input for a classification system
- `input_type="clustering"`: Use this if you use the embeddings for text clustering

https://txt.cohere.com/introducing-embed-v3/

```python
from litellm import embedding
import os

os.environ["COHERE_API_KEY"] = "cohere key"

# cohere call
response = embedding(
    model="embed-english-v3.0",
    input=["good morning from litellm", "this is another item"],
    input_type="search_document" # 👈 PROVIDER-SPECIFIC PARAM
)
```

**via config**

```yaml
model_list:
  - model_name: "cohere-embed"
    litellm_params:
      model: embed-english-v3.0
      input_type: search_document # 👈 PROVIDER-SPECIFIC PARAM
```

**via request** (`input_type` 👈 is the PROVIDER-SPECIFIC PARAM; JSON cannot carry comments, so it is noted here instead)

```shell
curl -X POST 'http://0.0.0.0:4000/v1/embeddings' \
-H 'Authorization: Bearer sk-54d77cd67b9febbb' \
-H 'Content-Type: application/json' \
-d '{
  "model": "cohere-embed",
  "input": ["Are you authorized to work in United States of America?"],
  "input_type": "search_document"
}'
```