diff --git a/docs/my-website/docs/proxy/configs.md b/docs/my-website/docs/proxy/configs.md
index eea8e7681..affe666ca 100644
--- a/docs/my-website/docs/proxy/configs.md
+++ b/docs/my-website/docs/proxy/configs.md
@@ -307,6 +307,126 @@ model_list:
 $ litellm --config /path/to/config.yaml
 ```
 
+## Setting Embedding Models
+
+See supported Embedding Providers & Models [here](https://docs.litellm.ai/docs/embedding/supported_embedding)
+
+### Use Sagemaker, Bedrock, Azure, OpenAI, XInference
+#### Create Config.yaml
+
+<Tabs>
+
+<TabItem value="sagemaker" label="Sagemaker, Bedrock, Azure Embeddings">
+
+Here's how to route between GPT-J embedding (SageMaker endpoint), Amazon Titan embedding (Bedrock), and Azure OpenAI embedding on the proxy server:
+
+```yaml
+model_list:
+  - model_name: sagemaker-embeddings
+    litellm_params:
+      model: "sagemaker/berri-benchmarking-gpt-j-6b-fp16"
+  - model_name: amazon-embeddings
+    litellm_params:
+      model: "bedrock/amazon.titan-embed-text-v1"
+  - model_name: azure-embeddings
+    litellm_params:
+      model: "azure/azure-embedding-model"
+      api_base: "os.environ/AZURE_API_BASE" # os.getenv("AZURE_API_BASE")
+      api_key: "os.environ/AZURE_API_KEY" # os.getenv("AZURE_API_KEY")
+      api_version: "2023-07-01-preview"
+
+general_settings:
+  master_key: sk-1234 # [OPTIONAL] if set, all calls to the proxy require either this key or a valid generated token
+```
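+
+Once the proxy is running, you can call any of these model groups by name. Here's a sketch with the Bedrock deployment, assuming the proxy is on its default port and `master_key` is set as above (the master key is sent as a Bearer token):
+
+```shell
+curl --location 'http://0.0.0.0:8000/embeddings' \
+  --header 'Authorization: Bearer sk-1234' \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "model": "amazon-embeddings",
+    "input": ["hello from litellm"]
+  }'
+```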
+
+</TabItem>
+
+<TabItem value="hugging-face-emb" label="Hugging Face Embeddings">
+
+LiteLLM Proxy supports all Feature-Extraction Embedding models.
+
+```yaml
+model_list:
+  - model_name: deployed-codebert-base
+    litellm_params:
+      # send request to deployed hugging face inference endpoint
+      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
+      api_key: hf_LdS # api key for hugging face inference endpoint
+      api_base: https://uysneno1wv2wd4lw.us-east-1.aws.endpoints.huggingface.cloud # your hf inference endpoint
+  - model_name: codebert-base
+    litellm_params:
+      # no api_base set, sends request to hugging face free inference api https://api-inference.huggingface.co/models/
+      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
+      api_key: hf_LdS # api key for hugging face
+```
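+
+The second entry sets no `api_base`, so requests to the `codebert-base` group go to the free Hugging Face Inference API instead of the dedicated endpoint. The request shape is the same either way (a sketch, assuming default proxy host/port):
+
+```shell
+curl --location 'http://0.0.0.0:8000/embeddings' \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "model": "codebert-base",
+    "input": ["hello from litellm"]
+  }'
+```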
+
+</TabItem>
+
+<TabItem value="azure" label="Azure OpenAI Embeddings">
+
+```yaml
+model_list:
+  - model_name: azure-embedding-model # model group
+    litellm_params:
+      model: azure/azure-embedding-model # model name for litellm.embedding(model=azure/azure-embedding-model) call
+      api_base: your-azure-api-base
+      api_key: your-api-key
+      api_version: 2023-07-01-preview
+```
+
+</TabItem>
+
+<TabItem value="openai" label="OpenAI Embeddings">
+
+```yaml
+model_list:
+- model_name: text-embedding-ada-002 # model group
+  litellm_params:
+    model: text-embedding-ada-002 # model name for litellm.embedding(model=text-embedding-ada-002)
+    api_key: your-api-key-1
+- model_name: text-embedding-ada-002
+  litellm_params:
+    model: text-embedding-ada-002
+    api_key: your-api-key-2
+```
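+
+Since both deployments share the `text-embedding-ada-002` model group name, the proxy can load-balance requests across the two API keys. A quick way to try this, assuming default proxy host/port:
+
+```shell
+# send a few requests; the proxy routes them across both deployments
+for i in 1 2 3; do
+  curl -s --location 'http://0.0.0.0:8000/embeddings' \
+    --header 'Content-Type: application/json' \
+    --data '{"model": "text-embedding-ada-002", "input": ["hello from litellm"]}'
+done
+```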
+
+</TabItem>
+
+<TabItem value="openai-compatible" label="OpenAI Compatible Embeddings">
+
+Use this for calling /embedding endpoints on OpenAI Compatible Servers.
+
+**Note: add the `openai/` prefix to the `litellm_params` `model`, so LiteLLM knows to route this request to an OpenAI-compatible endpoint.**
+
+```yaml
+model_list:
+- model_name: text-embedding-ada-002 # model group
+  litellm_params:
+    model: openai/<your-model-name> # model name for litellm.embedding(model=text-embedding-ada-002)
+    api_base: <your-api-base> # base URL of your OpenAI-compatible server
+```
+
+</TabItem>
+
+</Tabs>
+
+#### Start Proxy
+```shell
+litellm --config config.yaml
+```
+
+#### Make Request
+Sends a request to `deployed-codebert-base`:
+
+```shell
+curl --location 'http://0.0.0.0:8000/embeddings' \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "model": "deployed-codebert-base",
+    "input": ["write a litellm poem"]
+  }'
+```
+
 ## Router Settings
 
 Use this to configure things like routing strategy.
diff --git a/docs/my-website/docs/proxy/embedding.md b/docs/my-website/docs/proxy/embedding.md
index e1a7677f9..0f3a01a90 100644
--- a/docs/my-website/docs/proxy/embedding.md
+++ b/docs/my-website/docs/proxy/embedding.md
@@ -47,196 +47,9 @@ curl --location 'http://0.0.0.0:8000/v1/embeddings' \
   }'
 ```
 
-## `/embeddings` Request Format
-Input, Output and Exceptions are mapped to the OpenAI format for all supported models
-
-<Tabs>
-
-<TabItem value="Curl" label="Curl Request">
-
-```shell
-curl --location 'http://0.0.0.0:8000/embeddings' \
-  --header 'Content-Type: application/json' \
-  --data ' {
-  "model": "text-embedding-ada-002",
-  "input": ["write a litellm poem"]
-  }'
-```
-
-</TabItem>
-
-<TabItem value="openai" label="OpenAI Python v1.0.0+">
-
-```python
-import openai
-from openai import OpenAI
-
-# set base_url to your proxy server
-# set api_key to send to proxy server
-client = OpenAI(api_key="", base_url="http://0.0.0.0:8000")
-
-response = openai.embeddings.create(
-    input=["hello from litellm"],
-    model="text-embedding-ada-002"
-)
-
-print(response)
-
-```
-
-</TabItem>
-
-<TabItem value="langchain-embedding" label="Langchain Embeddings">
-
-```python
-from langchain.embeddings import OpenAIEmbeddings
-
-embeddings = OpenAIEmbeddings(model="sagemaker-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")
-
-
-text = "This is a test document."
-
-query_result = embeddings.embed_query(text)
-
-print(f"SAGEMAKER EMBEDDINGS")
-print(query_result[:5])
-
-embeddings = OpenAIEmbeddings(model="bedrock-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")
-
-text = "This is a test document."
-
-query_result = embeddings.embed_query(text)
-
-print(f"BEDROCK EMBEDDINGS")
-print(query_result[:5])
-
-embeddings = OpenAIEmbeddings(model="bedrock-titan-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")
-
-text = "This is a test document."
-
-query_result = embeddings.embed_query(text)
-
-print(f"TITAN EMBEDDINGS")
-print(query_result[:5])
-```
-
-</TabItem>
-
-</Tabs>
-
-## `/embeddings` Response Format
-
-```json
-{
-  "object": "list",
-  "data": [
-    {
-      "object": "embedding",
-      "embedding": [
-        0.0023064255,
-        -0.009327292,
-        ....
-        -0.0028842222,
-      ],
-      "index": 0
-    }
-  ],
-  "model": "text-embedding-ada-002",
-  "usage": {
-    "prompt_tokens": 8,
-    "total_tokens": 8
-  }
-}
-
-```
-
-## Supported Models
-
-See supported Embedding Providers & Models [here](https://docs.litellm.ai/docs/embedding/supported_embedding)
-
-#### Create Config.yaml
-
-<Tabs>
-
-<TabItem value="hugging-face-emb" label="Hugging Face Embeddings">
-
-LiteLLM Proxy supports all Feature-Extraction Embedding models.
-
-```yaml
-model_list:
-  - model_name: deployed-codebert-base
-    litellm_params:
-      # send request to deployed hugging face inference endpoint
-      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
-      api_key: hf_LdS # api key for hugging face inference endpoint
-      api_base: https://uysneno1wv2wd4lw.us-east-1.aws.endpoints.huggingface.cloud # your hf inference endpoint
-  - model_name: codebert-base
-    litellm_params:
-      # no api_base set, sends request to hugging face free inference api https://api-inference.huggingface.co/models/
-      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
-      api_key: hf_LdS # api key for hugging face
-```
-
-</TabItem>
-
-<TabItem value="azure" label="Azure OpenAI Embeddings">
-
-```yaml
-model_list:
-  - model_name: azure-embedding-model # model group
-    litellm_params:
-      model: azure/azure-embedding-model # model name for litellm.embedding(model=azure/azure-embedding-model) call
-      api_base: your-azure-api-base
-      api_key: your-api-key
-      api_version: 2023-07-01-preview
-```
-
-</TabItem>
-
-<TabItem value="openai" label="OpenAI Embeddings">
-
-```yaml
-model_list:
-- model_name: text-embedding-ada-002 # model group
-  litellm_params:
-    model: text-embedding-ada-002 # model name for litellm.embedding(model=text-embedding-ada-002)
-    api_key: your-api-key-1
-- model_name: text-embedding-ada-002
-  litellm_params:
-    model: text-embedding-ada-002
-    api_key: your-api-key-2
-```
-
-</TabItem>
-
-<TabItem value="openai-compatible" label="OpenAI Compatible Embeddings">
-
-Use this for calling /embedding endpoints on OpenAI Compatible Servers.
-
-**Note add `openai/` prefix to `litellm_params`: `model` so litellm knows to route to OpenAI**
-
-```yaml
-model_list:
-- model_name: text-embedding-ada-002 # model group
-  litellm_params:
-    model: openai/<your-model-name> # model name for litellm.embedding(model=text-embedding-ada-002)
-    api_base: <your-api-base>
-```
-
-</TabItem>
-
-</Tabs>
-
-#### Start Proxy
-```shell
-litellm --config config.yaml
-```
-
-#### Make Request
-Sends Request to `deployed-codebert-base`
-
-```shell
-curl --location 'http://0.0.0.0:8000/embeddings' \
-  --header 'Content-Type: application/json' \
-  --data ' {
-  "model": "deployed-codebert-base",
-  "input": ["write a litellm poem"]
-  }'
-```
+
+
+
diff --git a/docs/my-website/docs/proxy/user_keys.md b/docs/my-website/docs/proxy/user_keys.md
index 9136452f5..12e1c766f 100644
--- a/docs/my-website/docs/proxy/user_keys.md
+++ b/docs/my-website/docs/proxy/user_keys.md
@@ -3,6 +3,12 @@ import TabItem from '@theme/TabItem';
 
 # Use with Langchain, OpenAI SDK, Curl
 
+:::info
+
+**Input, Output, Exceptions are mapped to the OpenAI format for all supported models**
+
+:::
+
 How to send requests to the proxy, pass metadata, allow users to pass in their OpenAI API key
 
 ## `/chat/completions`
@@ -139,7 +145,109 @@ print(response)
 ```
 
-## Pass User LLM API Keys
+## `/embeddings`
+
+### Request Format
+Input, Output and Exceptions are mapped to the OpenAI format for all supported models
+
+<Tabs>
+
+<TabItem value="openai" label="OpenAI Python v1.0.0+">
+
+```python
+from openai import OpenAI
+
+# set base_url to your proxy server
+# set api_key to send to proxy server
+client = OpenAI(api_key="", base_url="http://0.0.0.0:8000")
+
+response = client.embeddings.create(
+    input=["hello from litellm"],
+    model="text-embedding-ada-002"
+)
+
+print(response)
+```
+
+</TabItem>
+
+<TabItem value="Curl" label="Curl Request">
+
+```shell
+curl --location 'http://0.0.0.0:8000/embeddings' \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "model": "text-embedding-ada-002",
+    "input": ["write a litellm poem"]
+  }'
+```
+
+</TabItem>
+
+<TabItem value="langchain-embedding" label="Langchain Embeddings">
+
+```python
+from langchain.embeddings import OpenAIEmbeddings
+
+embeddings = OpenAIEmbeddings(model="sagemaker-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")
+
+text = "This is a test document."
+
+query_result = embeddings.embed_query(text)
+
+print("SAGEMAKER EMBEDDINGS")
+print(query_result[:5])
+
+embeddings = OpenAIEmbeddings(model="bedrock-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")
+
+text = "This is a test document."
+
+query_result = embeddings.embed_query(text)
+
+print("BEDROCK EMBEDDINGS")
+print(query_result[:5])
+
+embeddings = OpenAIEmbeddings(model="bedrock-titan-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")
+
+text = "This is a test document."
+
+query_result = embeddings.embed_query(text)
+
+print("TITAN EMBEDDINGS")
+print(query_result[:5])
+```
+
+</TabItem>
+
+</Tabs>
+
+### Response Format
+
+```json
+{
+  "object": "list",
+  "data": [
+    {
+      "object": "embedding",
+      "embedding": [
+        0.0023064255,
+        -0.009327292,
+        ....
+        -0.0028842222
+      ],
+      "index": 0
+    }
+  ],
+  "model": "text-embedding-ada-002",
+  "usage": {
+    "prompt_tokens": 8,
+    "total_tokens": 8
+  }
+}
+```
+
+## Advanced
+### Pass User LLM API Keys
 Allows your users to pass in their OpenAI API key (any LiteLLM supported provider) to make requests
 
 Here's how to do it: