phoenix/litellm


forked from phoenix/litellm-mirror

Krrish Dholakia 6f1eb038bc docs(bedrock.md): adding docs for calling bedrock models on proxy via config.yaml

2024-03-14 14:50:13 -07:00



AWS Bedrock

Anthropic, Amazon Titan, and AI21 LLMs are supported on Bedrock.

LiteLLM requires boto3 to be installed on your system for Bedrock requests.

pip install "boto3>=1.28.57"

Required Environment Variables

import os

os.environ["AWS_ACCESS_KEY_ID"] = ""  # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = "" # Secret access key
os.environ["AWS_REGION_NAME"] = "" # e.g. us-east-1, us-east-2, us-west-1, us-west-2

Usage

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
  model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
  messages=[{ "content": "Hello, how are you?","role": "user"}]
)
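
A call made with empty credentials only fails once the request reaches AWS, so it can help to check the environment first. The following is a minimal sketch, assuming the three environment variables above are the auth method in use (profiles and STS are covered later); `missing_aws_vars` is a hypothetical helper, not part of LiteLLM.

```python
import os

# The three variables named under "Required Environment Variables".
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME")


def missing_aws_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print("Missing AWS settings:", ", ".join(missing))
```

Passing a plain dict instead of `os.environ` makes the helper easy to exercise without touching the real environment.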

OpenAI Proxy Usage

Here's how to call Anthropic models on Bedrock with the LiteLLM Proxy Server.

1. Save your AWS credentials in your environment

export AWS_ACCESS_KEY_ID=""
export AWS_SECRET_ACCESS_KEY=""
export AWS_REGION_NAME=""

2. Start the proxy

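CLI:

$ litellm --model anthropic.claude-3-sonnet-20240229-v1:0

# Server running on http://0.0.0.0:4000

config.yaml:

model_list:
  - model_name: bedrock-claude-v1
    litellm_params:
      model: bedrock/anthropic.claude-instant-v1

Then start the proxy with the config file:

$ litellm --config /path/to/config.yaml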

3. Test it

Curl request:

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
      "model": "bedrock-claude-v1",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'

Using the OpenAI Python SDK:

import openai
client = openai.OpenAI(
    api_key="anything",
    base_url="http://0.0.0.0:4000"
)

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="bedrock-claude-v1", messages = [
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)

Using LangChain:

from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000", # set openai_api_base to the LiteLLM Proxy
    model = "bedrock-claude-v1",
    temperature=0.1
)

messages = [
    SystemMessage(
        content="You are a helpful assistant that I'm using to make a test request to."
    ),
    HumanMessage(
        content="test from litellm. tell me why it's amazing in 1 sentence"
    ),
]
response = chat(messages)

print(response)

Usage - Function Calling

import os
from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather in a given location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                },
                "required": ["location"],
            },
        },
    }
]
messages = [{"role": "user", "content": "What's the weather like in Boston today?"}]

response = completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
# Add any assertions here to check the response args
print(response)
assert isinstance(response.choices[0].message.tool_calls[0].function.name, str)
assert isinstance(
    response.choices[0].message.tool_calls[0].function.arguments, str
)
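
Once the model returns a tool call, the caller still has to run the matching function. The following is a minimal sketch of that dispatch step, assuming the tool call exposes `function.name` and `function.arguments` (a JSON string) as in the assertions above. The stubbed `get_current_weather`, the `AVAILABLE_TOOLS` registry, and the `sample_call` object are hypothetical stand-ins so the example runs without a Bedrock request.

```python
import json
from types import SimpleNamespace


def get_current_weather(location, unit="fahrenheit"):
    """Stub implementation; a real one would query a weather service."""
    return {"location": location, "temperature": "72", "unit": unit}


# Hypothetical registry mapping tool names to local Python callables.
AVAILABLE_TOOLS = {"get_current_weather": get_current_weather}


def run_tool_call(tool_call):
    """Look up the requested function and call it with the decoded arguments."""
    func = AVAILABLE_TOOLS[tool_call.function.name]
    kwargs = json.loads(tool_call.function.arguments)
    return func(**kwargs)


# Stand-in for response.choices[0].message.tool_calls[0]
sample_call = SimpleNamespace(
    function=SimpleNamespace(
        name="get_current_weather",
        arguments='{"location": "Boston, MA"}',
    )
)
result = run_tool_call(sample_call)
print(result)
```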

Usage - Vision

import os
from litellm import completion

# set env
os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""


def encode_image(image_path):
    import base64

    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")


image_path = "../proxy/cached_logo.jpg"
# Getting the base64 string
base64_image = encode_image(image_path)
resp = completion(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "data:image/jpeg;base64," + base64_image
                    },
                },
            ],
        }
    ],
)
print(f"\nResponse: {resp}")
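
The example above hard-codes `image/jpeg` in the data URL. For other image types the MIME type has to match the file. The following is a small sketch, assuming the type can be inferred from the file extension; `to_data_url` is a hypothetical helper, not a LiteLLM function.

```python
import base64
import mimetypes


def to_data_url(image_bytes: bytes, filename: str) -> str:
    """Build a base64 data URL, guessing the MIME type from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"  # fallback when the extension is unknown
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


# Three placeholder bytes stand in for real image data.
url = to_data_url(b"\xff\xd8\xff", "cached_logo.jpg")
print(url)
```

The resulting string can be used as the `url` value of the `image_url` content part.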

Usage - "Assistant Pre-fill"

If you're using Anthropic's Claude with Bedrock, you can "put words in Claude's mouth" by including an assistant role message as the last item in the messages array.

Important

The returned completion will not include your "pre-fill" text, since it is part of the prompt itself. Make sure to prefix Claude's completion with your pre-fill.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

messages = [
    {"role": "user", "content": "How do you say 'Hello' in German? Return your answer as a JSON object, like this:\n\n{ \"Hello\": \"Hallo\" }"},
    {"role": "assistant", "content": "{"},
]
response = completion(model="bedrock/anthropic.claude-v2", messages=messages)

Example prompt sent to Claude


Human: How do you say 'Hello' in German? Return your answer as a JSON object, like this:

{ "Hello": "Hallo" }

Assistant: {
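
Because the returned completion omits the pre-fill, the two pieces have to be joined before the result can be parsed. The following is a minimal sketch, assuming the completion text has already been extracted from the response; `join_prefill` and the sample completion string are hypothetical.

```python
import json


def join_prefill(prefill: str, completion_text: str) -> str:
    """Prepend the assistant pre-fill to the text the model returned."""
    return prefill + completion_text


# Hypothetical completion text; a real one would come from the response object.
completion_text = ' "Hello": "Hallo" }'
full_text = join_prefill("{", completion_text)
parsed = json.loads(full_text)
print(parsed["Hello"])
```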

Usage - "System" messages

If you're using Anthropic's Claude 2.1 with Bedrock, system role messages are properly formatted for you.

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

messages = [
    {"role": "system", "content": "You are a snarky assistant."},
    {"role": "user", "content": "How do I boil water?"},
]
response = completion(model="bedrock/anthropic.claude-v2:1", messages=messages)

Example prompt sent to Claude

You are a snarky assistant.

Human: How do I boil water?

Assistant:
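
The two example prompts above follow the same pattern: system text first, then alternating Human/Assistant turns, ending with an open Assistant turn. The sketch below reproduces that layout for illustration only. It is an assumption about the shape of the output, not LiteLLM's actual formatting code, and `to_claude_prompt` is a hypothetical name.

```python
def to_claude_prompt(messages):
    """Flatten chat messages into a Human/Assistant text prompt."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "system":
            parts.append(content)
        elif role == "user":
            parts.append(f"Human: {content}")
        elif role == "assistant":
            parts.append(f"Assistant: {content}")
    # Leave an open Assistant turn unless the caller supplied a pre-fill.
    if not messages or messages[-1]["role"] != "assistant":
        parts.append("Assistant:")
    return "\n\n".join(parts)


prompt = to_claude_prompt(
    [
        {"role": "system", "content": "You are a snarky assistant."},
        {"role": "user", "content": "How do I boil water?"},
    ]
)
print(prompt)
```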

Usage - Streaming

import os
from litellm import completion

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = completion(
  model="bedrock/anthropic.claude-instant-v1",
  messages=[{ "content": "Hello, how are you?","role": "user"}],
  stream=True
)
for chunk in response:
  print(chunk)

Example Streaming Output Chunk

{
  "choices": [
    {
      "finish_reason": null,
      "index": 0,
      "delta": {
        "content": "ase can appeal the case to a higher federal court. If a higher federal court rules in a way that conflicts with a ruling from a lower federal court or conflicts with a ruling from a higher state court, the parties involved in the case can appeal the case to the Supreme Court. In order to appeal a case to the Sup"
      }
    }
  ],
  "created": null,
  "model": "anthropic.claude-instant-v1",
  "usage": {
    "prompt_tokens": null,
    "completion_tokens": null,
    "total_tokens": null
  }
}
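
Each chunk carries only a fragment of the reply in `delta.content`, so the full text has to be assembled by the caller. The following is a minimal sketch, assuming the chunks have been converted to plain dicts with the shape shown above; `collect_stream_text` and the sample chunks are hypothetical.

```python
def collect_stream_text(chunks):
    """Concatenate the delta content of every chunk into one string."""
    pieces = []
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:  # the final chunk may carry no content
                pieces.append(content)
    return "".join(pieces)


# Hypothetical chunks shaped like the example output above.
sample_chunks = [
    {"choices": [{"finish_reason": None, "index": 0, "delta": {"content": "Hello"}}]},
    {"choices": [{"finish_reason": None, "index": 0, "delta": {"content": ", world"}}]},
    {"choices": [{"finish_reason": "stop", "index": 0, "delta": {}}]},
]
text = collect_stream_text(sample_chunks)
print(text)
```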

Boto3 - Authentication

Passing credentials as parameters - Completion()

Pass AWS credentials as parameters to litellm.completion

import os
from litellm import completion

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=[{ "content": "Hello, how are you?","role": "user"}],
            aws_access_key_id="",
            aws_secret_access_key="",
            aws_region_name="",
)

Passing an external BedrockRuntime.Client as a parameter - Completion()

Pass an external BedrockRuntime.Client object as a parameter to litellm.completion. Useful when using an AWS credentials profile, SSO session, assumed role session, or if environment variables are not available for auth.

Create a client from session credentials:

import boto3
from litellm import completion

bedrock = boto3.client(
            service_name="bedrock-runtime",
            region_name="us-east-1",
            aws_access_key_id="",
            aws_secret_access_key="",
            aws_session_token="",
)

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=[{ "content": "Hello, how are you?","role": "user"}],
            aws_bedrock_client=bedrock,
)

Create a client from AWS profile in ~/.aws/config:

import boto3
from litellm import completion

dev_session = boto3.Session(profile_name="dev-profile")
bedrock = dev_session.client(
            service_name="bedrock-runtime",
            region_name="us-east-1",
)

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=[{ "content": "Hello, how are you?","role": "user"}],
            aws_bedrock_client=bedrock,
)

SSO Login (AWS Profile)

1. Set the AWS_PROFILE environment variable.
2. Make the Bedrock completion call.

import os
from litellm import completion

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=[{ "content": "Hello, how are you?","role": "user"}]
)

or pass aws_profile_name:

import os
from litellm import completion

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=[{ "content": "Hello, how are you?","role": "user"}],
            aws_profile_name="dev-profile",
)

STS based Auth

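1. Set aws_role_name and aws_session_name in the completion() / embedding() call.
2. Make the Bedrock completion call.

from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]
aws_role_name = ""  # role to assume via STS (placeholder)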

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=messages,
            max_tokens=10,
            temperature=0.1,
            aws_role_name=aws_role_name,
            aws_session_name="my-test-session",
        )

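If you also need to dynamically set the AWS user that accesses the role, pass the additional credential arguments to completion() / embedding():

from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]
# Placeholders: fill in the credentials of the user that accesses the role
aws_region_name = ""
aws_access_key_id = ""
aws_secret_access_key = ""
aws_role_name = ""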

response = completion(
            model="bedrock/anthropic.claude-instant-v1",
            messages=messages,
            max_tokens=10,
            temperature=0.1,
            aws_region_name=aws_region_name,
            aws_access_key_id=aws_access_key_id,
            aws_secret_access_key=aws_secret_access_key,
            aws_role_name=aws_role_name,
            aws_session_name="my-test-session",
        )
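
When the same code has to support both variants (role only, or role plus explicit user credentials), the keyword arguments can be built once and unpacked into the call. The following is a sketch under that assumption, using only the parameter names shown in this section; `sts_auth_kwargs` is a hypothetical helper, not part of LiteLLM.

```python
def sts_auth_kwargs(
    role_name,
    session_name,
    region_name=None,
    access_key_id=None,
    secret_access_key=None,
):
    """Collect the STS-related keyword arguments, skipping unset optional ones."""
    kwargs = {"aws_role_name": role_name, "aws_session_name": session_name}
    optional = {
        "aws_region_name": region_name,
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    kwargs.update({key: value for key, value in optional.items() if value})
    return kwargs


# "my-role" is a placeholder value, as is the session name.
auth = sts_auth_kwargs("my-role", "my-test-session", region_name="us-east-1")
print(auth)
```

The result would then be passed as `completion(model=..., messages=messages, **auth)`.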

Provisioned throughput models

To use provisioned throughput Bedrock models, pass both of the following:

model=bedrock/<base-model>, for example model=bedrock/anthropic.claude-v2. Set model to any of the supported AWS Bedrock models.
model_id=provisioned-model-arn

Completion

import litellm
response = litellm.completion(
    model="bedrock/anthropic.claude-instant-v1",
    model_id="provisioned-model-arn",
    messages=[{"content": "Hello, how are you?", "role": "user"}]
)

Embedding

import litellm
response = litellm.embedding(
    model="bedrock/amazon.titan-embed-text-v1",
    model_id="provisioned-model-arn",
    input=["hi"],
)

Supported AWS Bedrock Models

Here's an example of using a bedrock model with LiteLLM

Model Name	Command
Anthropic Claude-V3 sonnet	`completion(model='bedrock/anthropic.claude-3-sonnet-20240229-v1:0', messages=messages)`
Anthropic Claude-V3 Haiku	`completion(model='bedrock/anthropic.claude-3-haiku-20240307-v1:0', messages=messages)`
Anthropic Claude-V2.1	`completion(model='bedrock/anthropic.claude-v2:1', messages=messages)`
Anthropic Claude-V2	`completion(model='bedrock/anthropic.claude-v2', messages=messages)`
Anthropic Claude-Instant V1	`completion(model='bedrock/anthropic.claude-instant-v1', messages=messages)`
Amazon Titan Lite	`completion(model='bedrock/amazon.titan-text-lite-v1', messages=messages)`
Amazon Titan Express	`completion(model='bedrock/amazon.titan-text-express-v1', messages=messages)`
Cohere Command	`completion(model='bedrock/cohere.command-text-v14', messages=messages)`
AI21 J2-Mid	`completion(model='bedrock/ai21.j2-mid-v1', messages=messages)`
AI21 J2-Ultra	`completion(model='bedrock/ai21.j2-ultra-v1', messages=messages)`
Meta Llama 2 Chat 13b	`completion(model='bedrock/meta.llama2-13b-chat-v1', messages=messages)`
Meta Llama 2 Chat 70b	`completion(model='bedrock/meta.llama2-70b-chat-v1', messages=messages)`
Mistral 7B Instruct	`completion(model='bedrock/mistral.mistral-7b-instruct-v0:2', messages=messages)`
Mixtral 8x7B Instruct	`completion(model='bedrock/mistral.mixtral-8x7b-instruct-v0:1', messages=messages)`

Bedrock Embedding

API keys

These can be set as environment variables or passed as params to litellm.embedding().

import os
os.environ["AWS_ACCESS_KEY_ID"] = ""        # Access key
os.environ["AWS_SECRET_ACCESS_KEY"] = ""    # Secret access key
os.environ["AWS_REGION_NAME"] = ""           # us-east-1, us-east-2, us-west-1, us-west-2

Usage

from litellm import embedding
response = embedding(
    model="bedrock/amazon.titan-embed-text-v1",
    input=["good morning from litellm"],
)
print(response)
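
Embedding vectors are typically compared with cosine similarity. The following is a minimal sketch, assuming the vectors have already been pulled out of the embedding response as plain lists of floats. The two short vectors are made-up values for illustration, not real Titan output.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical vectors standing in for two embedding results.
vec_a = [1.0, 0.0, 1.0]
vec_b = [1.0, 0.0, 0.0]
print(cosine_similarity(vec_a, vec_b))
```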

Supported AWS Bedrock Embedding Models

Model Name	Function Call
Titan Embeddings - G1	`embedding(model="bedrock/amazon.titan-embed-text-v1", input=input)`
Cohere Embeddings - English	`embedding(model="bedrock/cohere.embed-english-v3", input=input)`
Cohere Embeddings - Multilingual	`embedding(model="bedrock/cohere.embed-multilingual-v3", input=input)`

Image Generation

Use this for Stable Diffusion on Bedrock.

Usage

import os
from litellm import image_generation

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = image_generation(
            prompt="A cute baby sea otter",
            model="bedrock/stability.stable-diffusion-xl-v0",
        )
print(f"response: {response}")

Set optional params

import os
from litellm import image_generation

os.environ["AWS_ACCESS_KEY_ID"] = ""
os.environ["AWS_SECRET_ACCESS_KEY"] = ""
os.environ["AWS_REGION_NAME"] = ""

response = image_generation(
            prompt="A cute baby sea otter",
            model="bedrock/stability.stable-diffusion-xl-v0",
            ### OPENAI-COMPATIBLE ###
            size="128x512", # width=128, height=512
            ### PROVIDER-SPECIFIC ### see `AmazonStabilityConfig` in bedrock.py for all params
            seed=30
        )
print(f"response: {response}")
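
The OpenAI-style `size` string packs width and height into one value, as the comment in the example notes. The sketch below shows that mapping for illustration; `parse_size` is a hypothetical helper and is not taken from LiteLLM's implementation.

```python
def parse_size(size: str):
    """Split a "WIDTHxHEIGHT" string into integer width and height."""
    width, height = size.lower().split("x")
    return int(width), int(height)


width, height = parse_size("128x512")
print(width, height)
```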

Supported AWS Bedrock Image Generation Models

Model Name	Function Call
Stable Diffusion - v0	`image_generation(model="bedrock/stability.stable-diffusion-xl-v0", prompt=prompt)`
Stable Diffusion - v1	`image_generation(model="bedrock/stability.stable-diffusion-xl-v1", prompt=prompt)`
