docs

2023-09-30 16:43:10 -07:00 · 45983a3074
commit 45983a3074
parent a68e975f06
1 changed file with 149 additions and 17 deletions
--- a/docs/my-website/docs/proxy_server.md
+++ b/docs/my-website/docs/proxy_server.md
@@ -3,24 +3,16 @@ import TabItem from '@theme/TabItem';

 # OpenAI Proxy Server

-Use this to spin up a proxy api to translate openai api calls to any non-openai model (e.g. Huggingface, TogetherAI, Ollama, etc.)
+A CLI tool that creates an LLM proxy server to translate OpenAI API calls to any non-OpenAI model (e.g. Huggingface, TogetherAI, Ollama). It supports 100+ models; see the [Provider List](https://docs.litellm.ai/docs/providers).

-This works for async + streaming as well. 
-
-Works with **ALL MODELS** supported by LiteLLM. To see supported providers check out this list - [Provider List](https://docs.litellm.ai/docs/providers).
-
-**Requirements** Make sure relevant keys are set in the local .env. 
-
-[**Jump to tutorial**](#tutorial---using-with-aider)
-## quick start
+## Quick start
 Call Huggingface models through your OpenAI proxy.

-**Start Proxy**  
-Run this in your CLI.
-```python
+### Start Proxy
+```shell
 $ pip install litellm
 ```
-```python 
+```shell 
 $ litellm --model huggingface/bigcode/starcoder

 #INFO:     Uvicorn running on http://0.0.0.0:8000
@@ -28,8 +20,16 @@ $ litellm --model huggingface/bigcode/starcoder

 This will host a local proxy api at: **http://0.0.0.0:8000**

-**Test it**
+### Test Proxy
+Make a test ChatCompletion request to your proxy:
 <Tabs>
+<TabItem value="litellm" label="litellm cli">
+
+```shell
+litellm --test http://0.0.0.0:8000
+```
+
+</TabItem>
 <TabItem value="openai" label="OpenAI">

 ```python
@@ -59,7 +59,7 @@ curl --location 'http://0.0.0.0:8000/chat/completions' \
 </TabItem>
 </Tabs>

-Other supported models:
+#### Other supported models:
 <Tabs>
 <TabItem value="anthropic" label="Anthropic">

@@ -149,7 +149,139 @@ $ litellm --model command-nightly

 [**Jump to Code**](https://github.com/BerriAI/litellm/blob/fef4146396d5d87006259e00095a62e3900d6bb4/litellm/proxy.py#L36)

-## setting api base, temperature, max tokens
+
+### Deploy Proxy
+Deploy the proxy to https://api.litellm.ai
+
+```shell 
+$ export ANTHROPIC_API_KEY=sk-ant-api03-1..
+$ litellm --model claude-instant-1 --deploy
+
+#INFO:     Uvicorn running on https://api.litellm.ai/44508ad4
+```
+
+This will host a ChatCompletions API at: https://api.litellm.ai/44508ad4
+
+#### Other supported models:
+<Tabs>
+<TabItem value="anthropic" label="Anthropic">
+
+```shell
+$ export ANTHROPIC_API_KEY=my-api-key
+$ litellm --model claude-instant-1 --deploy
+```
+
+</TabItem>
+
+<TabItem value="together_ai" label="TogetherAI">
+
+```shell
+$ export TOGETHERAI_API_KEY=my-api-key
+$ litellm --model together_ai/lmsys/vicuna-13b-v1.5-16k --deploy
+```
+
+</TabItem>
+
+<TabItem value="replicate" label="Replicate">
+
+```shell
+$ export REPLICATE_API_KEY=my-api-key
+$ litellm \
+  --model replicate/meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3 \
+  --deploy
+```
+
+</TabItem>
+
+<TabItem value="petals" label="Petals">
+
+```shell
+$ litellm --model petals/meta-llama/Llama-2-70b-chat-hf --deploy
+```
+
+</TabItem>
+
+<TabItem value="palm" label="Palm">
+
+```shell
+$ export PALM_API_KEY=my-palm-key
+$ litellm --model palm/chat-bison --deploy
+```
+
+</TabItem>
+
+<TabItem value="azure" label="Azure OpenAI">
+
+```shell
+$ export AZURE_API_KEY=my-api-key
+$ export AZURE_API_BASE=my-api-base
+$ export AZURE_API_VERSION=my-api-version
+
+$ litellm --model azure/my-deployment-id --deploy
+```
+
+</TabItem>
+
+<TabItem value="ai21" label="AI21">
+
+```shell
+$ export AI21_API_KEY=my-api-key
+$ litellm --model j2-light --deploy
+```
+
+</TabItem>
+
+<TabItem value="cohere" label="Cohere">
+
+```shell
+$ export COHERE_API_KEY=my-api-key
+$ litellm --model command-nightly --deploy
+```
+
+</TabItem>
+
+</Tabs>
+
+### Test Deployed Proxy
+Make a test ChatCompletion request to your deployed proxy:
+<Tabs>
+<TabItem value="litellm" label="litellm cli">
+
+```shell
+litellm --test https://api.litellm.ai/44508ad4
+```
+
+</TabItem>
+<TabItem value="openai" label="OpenAI">
+
+```python
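+import openai 
+
+openai.api_base = "https://api.litellm.ai/44508ad4"
+# Assumption: the proxy ignores this value, but the pre-1.0 openai client raises an error if no key is set.
+openai.api_key = "placeholder"
+
+print(openai.ChatCompletion.create(model="test", messages=[{"role":"user", "content":"Hey!"}]))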
+```
+
+</TabItem>
+
+<TabItem value="curl" label="curl">
+
+```shell
+curl --location 'https://api.litellm.ai/44508ad4/chat/completions' \
+--header 'Content-Type: application/json' \
+--data '{
+  "messages": [
+    {
+      "role": "user", 
+      "content": "what do you know?"
+    }
+  ]
+}'
+```
+</TabItem>
+</Tabs>
+
+## Setting api base, temperature, max tokens

 ```shell
 litellm --model huggingface/bigcode/starcoder \
@@ -164,7 +296,7 @@ litellm --model huggingface/bigcode/starcoder \
 $ litellm --model ollama/llama2 --api_base http://localhost:11434
 ```

-## tutorial - using with aider 
+## Tutorial - using HuggingFace LLMs with aider 
 [Aider](https://github.com/paul-gauthier/aider) is an AI pair programming tool that runs in your terminal.

 But it only accepts OpenAI API Calls.