docs: add docs on bedrock + cohere pass through endpoints

This commit is contained in:
Krrish Dholakia 2024-08-19 14:21:56 -07:00
parent 417547b6f9
commit a494b5b2f3
3 changed files with 492 additions and 1 deletions

View file

@ -0,0 +1,236 @@
# Bedrock (Pass-Through)
Pass-through endpoints for Bedrock - call provider-specific endpoint, in native format (no translation).
Just replace `https://bedrock-runtime.{aws_region_name}.amazonaws.com` with `LITELLM_PROXY_BASE_URL/bedrock` 🚀
#### **Example Usage**
```bash
curl -X POST 'http://0.0.0.0:4000/bedrock/model/cohere.command-r-v1:0/converse' \
-H 'Authorization: Bearer anything' \
-H 'Content-Type: application/json' \
-d '{
"messages": [
{"role": "user",
"content": [{"text": "Hello"}]
}
]
}'
```
Supports **ALL** Bedrock Endpoints (including streaming).
[**See All Bedrock Endpoints**](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
## Quick Start
Let's call the Bedrock [`/converse` endpoint](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
1. Add AWS Keyss to your environment
```bash
export AWS_ACCESS_KEY_ID="" # Access key
export AWS_SECRET_ACCESS_KEY="" # Secret access key
export AWS_REGION_NAME="" # us-east-1, us-east-2, us-west-1, us-west-2
```
2. Start LiteLLM Proxy
```bash
litellm
# RUNNING on http://0.0.0.0:4000
```
3. Test it!
Let's call the Bedrock converse endpoint
```bash
curl -X POST 'http://0.0.0.0:4000/bedrock/model/cohere.command-r-v1:0/converse' \
-H 'Authorization: Bearer anything' \
-H 'Content-Type: application/json' \
-d '{
"messages": [
{"role": "user",
"content": [{"text": "Hello"}]
}
]
}'
```
## Examples
Anything after `http://0.0.0.0:4000/bedrock` is treated as a provider-specific route, and handled accordingly.
Key Changes:
| **Original Endpoint** | **Replace With** |
|------------------------------------------------------|-----------------------------------|
| `https://bedrock-runtime.{aws_region_name}.amazonaws.com` | `http://0.0.0.0:4000/bedrock` (LITELLM_PROXY_BASE_URL="http://0.0.0.0:4000") |
| `AWS4-HMAC-SHA256..` | `Bearer anything` (use `Bearer LITELLM_VIRTUAL_KEY` if Virtual Keys are setup on proxy) |
### **Example 1: Converse API**
#### LiteLLM Proxy Call
```bash
curl -X POST 'http://0.0.0.0:4000/bedrock/model/cohere.command-r-v1:0/converse' \
-H 'Authorization: Bearer sk-anything' \
-H 'Content-Type: application/json' \
-d '{
"messages": [
{"role": "user",
"content": [{"text": "Hello"}]
}
]
}'
```
#### Direct Bedrock API Call
```bash
curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/model/cohere.command-r-v1:0/converse' \
-H 'Authorization: AWS4-HMAC-SHA256..' \
-H 'Content-Type: application/json' \
-d '{
"messages": [
{"role": "user",
"content": [{"text": "Hello"}]
}
]
}'
```
### **Example 2: Apply Guardrail**
#### LiteLLM Proxy Call
```bash
curl "http://0.0.0.0:4000/bedrock/guardrail/guardrailIdentifier/version/guardrailVersion/apply" \
-H 'Authorization: Bearer sk-anything' \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"text": {"text": "Hello world"}}],
"source": "INPUT"
}'
```
#### Direct Bedrock API Call
```bash
curl "https://bedrock-runtime.us-west-2.amazonaws.com/guardrail/guardrailIdentifier/version/guardrailVersion/apply" \
-H 'Authorization: AWS4-HMAC-SHA256..' \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"text": {"text": "Hello world"}}],
"source": "INPUT"
}'
```
### **Example 3: Query Knowledge Base**
```bash
curl -X POST "http://0.0.0.0:4000/bedrock/knowledgebases/{knowledgeBaseId}/retrieve" \
-H 'Authorization: Bearer sk-anything' \
-H 'Content-Type: application/json' \
-d '{
"nextToken": "string",
"retrievalConfiguration": {
"vectorSearchConfiguration": {
"filter": { ... },
"numberOfResults": number,
"overrideSearchType": "string"
}
},
"retrievalQuery": {
"text": "string"
}
}'
```
#### Direct Bedrock API Call
```bash
curl -X POST "https://bedrock-runtime.us-west-2.amazonaws.com/knowledgebases/{knowledgeBaseId}/retrieve" \
-H 'Authorization: AWS4-HMAC-SHA256..' \
-H 'Content-Type: application/json' \
-d '{
"nextToken": "string",
"retrievalConfiguration": {
"vectorSearchConfiguration": {
"filter": { ... },
"numberOfResults": number,
"overrideSearchType": "string"
}
},
"retrievalQuery": {
"text": "string"
}
}'
```
## Advanced - Use with Virtual Keys
Pre-requisites
- [Setup proxy with DB](../proxy/virtual_keys.md#setup)
Use this, to avoid giving developers the raw AWS Keys, but still letting them use AWS Bedrock endpoints.
### Usage
1. Setup environment
```bash
export DATABASE_URL=""
export LITELLM_MASTER_KEY=""
export AWS_ACCESS_KEY_ID="" # Access key
export AWS_SECRET_ACCESS_KEY="" # Secret access key
export AWS_REGION_NAME="" # us-east-1, us-east-2, us-west-1, us-west-2
```
```bash
litellm
# RUNNING on http://0.0.0.0:4000
```
2. Generate virtual key
```bash
curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'
```
Expected Response
```bash
{
...
"key": "sk-1234ewknldferwedojwojw"
}
```
3. Test it!
```bash
curl -X POST 'http://0.0.0.0:4000/bedrock/model/cohere.command-r-v1:0/converse' \
-H 'Authorization: Bearer sk-1234ewknldferwedojwojw' \
-H 'Content-Type: application/json' \
-d '{
"messages": [
{"role": "user",
"content": [{"text": "Hello"}]
}
]
}'
```

View file

@ -0,0 +1,253 @@
# Cohere API (Pass-Through)
Pass-through endpoints for Cohere - call provider-specific endpoint, in native format (no translation).
Just replace `https://api.cohere.com` with `LITELLM_PROXY_BASE_URL/cohere` 🚀
#### **Example Usage**
```bash
curl --request POST \
--url http://0.0.0.0:4000/cohere/v1/chat \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer sk-anything" \
--data '{
"chat_history": [
{"role": "USER", "message": "Who discovered gravity?"},
{"role": "CHATBOT", "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton"}
],
"message": "What year was he born?",
"connectors": [{"id": "web-search"}]
}'
```
Supports **ALL** Cohere Endpoints (including streaming).
[**See All Cohere Endpoints**](https://docs.cohere.com/reference/chat)
## Quick Start
Let's call the Cohere [`/rerank` endpoint](https://docs.cohere.com/reference/rerank)
1. Add Cohere API Key to your environment
```bash
export COHERE_API_KEY=""
```
2. Start LiteLLM Proxy
```bash
litellm
# RUNNING on http://0.0.0.0:4000
```
3. Test it!
Let's call the Cohere /rerank endpoint
```bash
curl --request POST \
--url http://0.0.0.0:4000/cohere/v1/rerank \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer sk-anything" \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
"Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."]
}'
```
## Examples
Anything after `http://0.0.0.0:4000/cohere` is treated as a provider-specific route, and handled accordingly.
Key Changes:
| **Original Endpoint** | **Replace With** |
|------------------------------------------------------|-----------------------------------|
| `https://api.cohere.com` | `http://0.0.0.0:4000/cohere` (LITELLM_PROXY_BASE_URL="http://0.0.0.0:4000") |
| `bearer $CO_API_KEY` | `bearer anything` (use `bearer LITELLM_VIRTUAL_KEY` if Virtual Keys are setup on proxy) |
### **Example 1: Rerank endpoint**
#### LiteLLM Proxy Call
```bash
curl --request POST \
--url http://0.0.0.0:4000/cohere/v1/rerank \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer sk-anything" \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
"Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."]
}'
```
#### Direct Cohere API Call
```bash
curl --request POST \
--url https://api.cohere.com/v1/rerank \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer $CO_API_KEY" \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
"Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."]
}'
```
### **Example 2: Chat API**
#### LiteLLM Proxy Call
```bash
curl --request POST \
--url http://0.0.0.0:4000/cohere/v1/chat \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer sk-anything" \
--data '{
"chat_history": [
{"role": "USER", "message": "Who discovered gravity?"},
{"role": "CHATBOT", "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton"}
],
"message": "What year was he born?",
"connectors": [{"id": "web-search"}]
}'
```
#### Direct Cohere API Call
```bash
curl --request POST \
--url https://api.cohere.com/v1/chat \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer $CO_API_KEY" \
--data '{
"chat_history": [
{"role": "USER", "message": "Who discovered gravity?"},
{"role": "CHATBOT", "message": "The man who is widely credited with discovering gravity is Sir Isaac Newton"}
],
"message": "What year was he born?",
"connectors": [{"id": "web-search"}]
}'
```
### **Example 3: Embedding**
```bash
curl --request POST \
--url https://api.cohere.com/v1/embed \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer sk-anything" \
--data '{
"model": "embed-english-v3.0",
"texts": ["hello", "goodbye"],
"input_type": "classification"
}'
```
#### Direct Cohere API Call
```bash
curl --request POST \
--url https://api.cohere.com/v1/embed \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer $CO_API_KEY" \
--data '{
"model": "embed-english-v3.0",
"texts": ["hello", "goodbye"],
"input_type": "classification"
}'
```
## Advanced - Use with Virtual Keys
Pre-requisites
- [Setup proxy with DB](../proxy/virtual_keys.md#setup)
Use this, to avoid giving developers the raw Cohere API key, but still letting them use Cohere endpoints.
### Usage
1. Setup environment
```bash
export DATABASE_URL=""
export LITELLM_MASTER_KEY=""
export COHERE_API_KEY=""
```
```bash
litellm
# RUNNING on http://0.0.0.0:4000
```
2. Generate virtual key
```bash
curl -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{}'
```
Expected Response
```bash
{
...
"key": "sk-1234ewknldferwedojwojw"
}
```
3. Test it!
```bash
curl --request POST \
--url http://0.0.0.0:4000/cohere/v1/rerank \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--header "Authorization: bearer sk-1234ewknldferwedojwojw" \
--data '{
"model": "rerank-english-v3.0",
"query": "What is the capital of the United States?",
"top_n": 3,
"documents": ["Carson City is the capital city of the American state of Nevada.",
"The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
"Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
"Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
"Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."]
}'
```

View file

@ -193,7 +193,9 @@ const sidebars = {
"fine_tuning",
"anthropic_completion",
"pass_through/vertex_ai",
"pass_through/google_ai_studio"
"pass_through/google_ai_studio",
"pass_through/cohere",
"pass_through/bedrock"
],
},
"scheduler",