(docs) using /embeddings with Proxy
commit cf902a53b4 (parent c8f8bd9e57)
3 changed files with 232 additions and 191 deletions
@ -307,6 +307,126 @@ model_list:
$ litellm --config /path/to/config.yaml
```

## Setting Embedding Models

See supported Embedding Providers & Models [here](https://docs.litellm.ai/docs/embedding/supported_embedding)

### Use Sagemaker, Bedrock, Azure, OpenAI, XInference

#### Create Config.yaml

<Tabs>

<TabItem value="sagemaker" label="Sagemaker, Bedrock Embeddings">

Here's how to route between GPT-J embeddings (a Sagemaker endpoint), Amazon Titan embeddings (Bedrock), and Azure OpenAI embeddings on the proxy server:

```yaml
model_list:
  - model_name: sagemaker-embeddings
    litellm_params:
      model: "sagemaker/berri-benchmarking-gpt-j-6b-fp16"
  - model_name: amazon-embeddings
    litellm_params:
      model: "bedrock/amazon.titan-embed-text-v1"
  - model_name: azure-embeddings
    litellm_params:
      model: "azure/azure-embedding-model"
      api_base: "os.environ/AZURE_API_BASE" # os.getenv("AZURE_API_BASE")
      api_key: "os.environ/AZURE_API_KEY" # os.getenv("AZURE_API_KEY")
      api_version: "2023-07-01-preview"

general_settings:
  master_key: sk-1234 # [OPTIONAL] if set, all calls to the proxy require either this key or a valid generated token
```
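
Once the proxy is running with this config, you can hit any of the three model groups by name. A minimal sketch of a request (the `Authorization` header is only needed because `master_key` is set above):

```shell
curl --location 'http://0.0.0.0:8000/embeddings' \
--header 'Authorization: Bearer sk-1234' \
--header 'Content-Type: application/json' \
--data '{
    "model": "amazon-embeddings",
    "input": ["write a litellm poem"]
  }'
```

Swap `"model"` for `sagemaker-embeddings` or `azure-embeddings` to send the same request to the other deployments.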

</TabItem>

<TabItem value="Hugging Face emb" label="Hugging Face Embeddings">

LiteLLM Proxy supports all <a href="https://huggingface.co/models?pipeline_tag=feature-extraction">Feature-Extraction Embedding models</a>.

```yaml
model_list:
  - model_name: deployed-codebert-base
    litellm_params:
      # send request to deployed hugging face inference endpoint
      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
      api_key: hf_LdS # api key for hugging face inference endpoint
      api_base: https://uysneno1wv2wd4lw.us-east-1.aws.endpoints.huggingface.cloud # your hf inference endpoint
  - model_name: codebert-base
    litellm_params:
      # no api_base set, sends request to hugging face free inference api https://api-inference.huggingface.co/models/
      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
      api_key: hf_LdS # api key for hugging face
```

</TabItem>

<TabItem value="azure" label="Azure OpenAI Embeddings">

```yaml
model_list:
  - model_name: azure-embedding-model # model group
    litellm_params:
      model: azure/azure-embedding-model # model name for litellm.embedding(model="azure/azure-embedding-model") call
      api_base: your-azure-api-base
      api_key: your-api-key
      api_version: 2023-07-01-preview
```

</TabItem>

<TabItem value="openai" label="OpenAI Embeddings">

```yaml
model_list:
  - model_name: text-embedding-ada-002 # model group
    litellm_params:
      model: text-embedding-ada-002 # model name for litellm.embedding(model="text-embedding-ada-002")
      api_key: your-api-key-1
  - model_name: text-embedding-ada-002
    litellm_params:
      model: text-embedding-ada-002
      api_key: your-api-key-2
```
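
Because both entries share the same `model_name`, they form a single model group: requests to `text-embedding-ada-002` are load-balanced across the two API keys by the proxy's router.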

</TabItem>

<TabItem value="openai emb" label="OpenAI Compatible Embeddings">

<p>Use this for calling /embedding endpoints on <a href="https://github.com/xorbitsai/inference">OpenAI Compatible Servers</a>.</p>

**Note: add the `openai/` prefix to the `litellm_params` `model`, so LiteLLM knows to route this as an OpenAI-format request.**

```yaml
model_list:
  - model_name: text-embedding-ada-002 # model group
    litellm_params:
      model: openai/<your-model-name> # model name for litellm.embedding(model="text-embedding-ada-002")
      api_base: <model-api-base>
```
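
For example, for an XInference server this might look like the following sketch (the group name, model name, and default `9997` port are assumptions about your deployment):

```yaml
model_list:
  - model_name: xinference-embeddings # hypothetical group name
    litellm_params:
      model: openai/bge-base-en # whatever embedding model your xinference instance serves
      api_base: http://localhost:9997/v1 # xinference's OpenAI-compatible endpoint
```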

</TabItem>
</Tabs>

#### Start Proxy

```shell
litellm --config config.yaml
```

#### Make Request

Sends a request to the `deployed-codebert-base` model group:

```shell
curl --location 'http://0.0.0.0:8000/embeddings' \
--header 'Content-Type: application/json' \
--data '{
    "model": "deployed-codebert-base",
    "input": ["write a litellm poem"]
  }'
```
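
The same request through the OpenAI Python SDK (v1.0.0+), pointed at the proxy. A sketch assuming no `master_key` is set, so any placeholder key works:

```python
from openai import OpenAI

# point the client at the LiteLLM proxy instead of api.openai.com
client = OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

response = client.embeddings.create(
    model="deployed-codebert-base",
    input=["write a litellm poem"],
)
print(response.data[0].embedding[:5])  # first 5 dimensions of the vector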

## Router Settings

Use this to configure things like the routing strategy.
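
A minimal sketch, assuming you want to change the routing strategy (the full list of strategies the Router supports is documented separately; `simple-shuffle` is shown here as an example):

```yaml
router_settings:
  routing_strategy: simple-shuffle # e.g. shuffle requests across deployments in a model group
```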
@ -47,196 +47,9 @@ curl --location 'http://0.0.0.0:8000/v1/embeddings' \
}'
```

## `/embeddings` Request Format
Input, Output, and Exceptions are mapped to the OpenAI format for all supported models

<Tabs>
<TabItem value="Curl" label="Curl Request">

```shell
curl --location 'http://0.0.0.0:8000/embeddings' \
--header 'Content-Type: application/json' \
--data '{
    "model": "text-embedding-ada-002",
    "input": ["write a litellm poem"]
  }'
```
</TabItem>
<TabItem value="openai" label="OpenAI v1.0.0+">

```python
from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")

response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)

print(response)
```
</TabItem>

<TabItem value="langchain-embedding" label="Langchain Embeddings">

```python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="sagemaker-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("SAGEMAKER EMBEDDINGS")
print(query_result[:5])

embeddings = OpenAIEmbeddings(model="bedrock-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("BEDROCK EMBEDDINGS")
print(query_result[:5])

embeddings = OpenAIEmbeddings(model="bedrock-titan-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("TITAN EMBEDDINGS")
print(query_result[:5])
```
</TabItem>
</Tabs>

## `/embeddings` Response Format

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        ....
        -0.0028842222,
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```

## Supported Models

See supported Embedding Providers & Models [here](https://docs.litellm.ai/docs/embedding/supported_embedding)

#### Create Config.yaml

<Tabs>
<TabItem value="Hugging Face emb" label="Hugging Face Embeddings">

LiteLLM Proxy supports all <a href="https://huggingface.co/models?pipeline_tag=feature-extraction">Feature-Extraction Embedding models</a>.

```yaml
model_list:
  - model_name: deployed-codebert-base
    litellm_params:
      # send request to deployed hugging face inference endpoint
      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
      api_key: hf_LdS # api key for hugging face inference endpoint
      api_base: https://uysneno1wv2wd4lw.us-east-1.aws.endpoints.huggingface.cloud # your hf inference endpoint
  - model_name: codebert-base
    litellm_params:
      # no api_base set, sends request to hugging face free inference api https://api-inference.huggingface.co/models/
      model: huggingface/microsoft/codebert-base # add huggingface prefix so it routes to hugging face
      api_key: hf_LdS # api key for hugging face
```

</TabItem>

<TabItem value="azure" label="Azure OpenAI Embeddings">

```yaml
model_list:
  - model_name: azure-embedding-model # model group
    litellm_params:
      model: azure/azure-embedding-model # model name for litellm.embedding(model="azure/azure-embedding-model") call
      api_base: your-azure-api-base
      api_key: your-api-key
      api_version: 2023-07-01-preview
```

</TabItem>

<TabItem value="openai" label="OpenAI Embeddings">

```yaml
model_list:
  - model_name: text-embedding-ada-002 # model group
    litellm_params:
      model: text-embedding-ada-002 # model name for litellm.embedding(model="text-embedding-ada-002")
      api_key: your-api-key-1
  - model_name: text-embedding-ada-002
    litellm_params:
      model: text-embedding-ada-002
      api_key: your-api-key-2
```

</TabItem>

<TabItem value="openai emb" label="OpenAI Compatible Embeddings">

<p>Use this for calling /embedding endpoints on <a href="https://github.com/xorbitsai/inference">OpenAI Compatible Servers</a>.</p>

**Note: add the `openai/` prefix to the `litellm_params` `model`, so LiteLLM knows to route this as an OpenAI-format request.**

```yaml
model_list:
  - model_name: text-embedding-ada-002 # model group
    litellm_params:
      model: openai/<your-model-name> # model name for litellm.embedding(model="text-embedding-ada-002")
      api_base: <model-api-base>
```

</TabItem>
</Tabs>

#### Start Proxy

```shell
litellm --config config.yaml
```

#### Make Request

Sends a request to the `deployed-codebert-base` model group:

```shell
curl --location 'http://0.0.0.0:8000/embeddings' \
--header 'Content-Type: application/json' \
--data '{
    "model": "deployed-codebert-base",
    "input": ["write a litellm poem"]
  }'
```
@ -3,6 +3,12 @@ import TabItem from '@theme/TabItem';

# Use with Langchain, OpenAI SDK, Curl

:::info

**Input, Output, Exceptions are mapped to the OpenAI format for all supported models**

:::

How to send requests to the proxy, pass metadata, and allow users to pass in their own OpenAI API key.

## `/chat/completions`
@ -139,7 +145,109 @@ print(response)
```

## Pass User LLM API Keys
## `/embeddings`

### Request Format
Input, Output, and Exceptions are mapped to the OpenAI format for all supported models

<Tabs>
<TabItem value="openai" label="OpenAI Python v1.0.0+">

```python
from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:8000")

response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)

print(response)
```
</TabItem>
<TabItem value="Curl" label="Curl Request">

```shell
curl --location 'http://0.0.0.0:8000/embeddings' \
--header 'Content-Type: application/json' \
--data '{
    "model": "text-embedding-ada-002",
    "input": ["write a litellm poem"]
  }'
```
</TabItem>

<TabItem value="langchain-embedding" label="Langchain Embeddings">

```python
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="sagemaker-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("SAGEMAKER EMBEDDINGS")
print(query_result[:5])

embeddings = OpenAIEmbeddings(model="bedrock-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("BEDROCK EMBEDDINGS")
print(query_result[:5])

embeddings = OpenAIEmbeddings(model="bedrock-titan-embeddings", openai_api_base="http://0.0.0.0:8000", openai_api_key="temp-key")

text = "This is a test document."

query_result = embeddings.embed_query(text)

print("TITAN EMBEDDINGS")
print(query_result[:5])
```
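
Note: the model names used here (`sagemaker-embeddings`, `bedrock-embeddings`, `bedrock-titan-embeddings`) are assumed to match `model_name` groups defined in your proxy config; Langchain just passes them through as the OpenAI `model` parameter.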
</TabItem>
</Tabs>

### Response Format

```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [
        0.0023064255,
        -0.009327292,
        ....
        -0.0028842222,
      ],
      "index": 0
    }
  ],
  "model": "text-embedding-ada-002",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
```
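
To pull the vector and token counts out of this response with the OpenAI SDK, reusing the `client` from the request example above (a small sketch):

```python
response = client.embeddings.create(
    model="text-embedding-ada-002",
    input=["write a litellm poem"],
)

vector = response.data[0].embedding  # list of floats
print(len(vector), response.usage.total_tokens)
```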

## Advanced
### Pass User LLM API Keys

Allows your users to pass in their own OpenAI API key (or a key for any LiteLLM-supported provider) to make requests.

Here's how to do it: