(docs) proxy

2023-11-07 09:48:58 -08:00
commit 87d0b72a4a
parent 50564ab38e
1 changed file with 3 additions and 168 deletions
--- a/docs/my-website/docs/simple_proxy.md
+++ b/docs/my-website/docs/simple_proxy.md
@@ -4,20 +4,11 @@ import TabItem from '@theme/TabItem';

 # 💥 Evaluate LLMs - OpenAI Proxy Server

-A simple, fast, and lightweight **OpenAI-compatible server** to call 100+ LLM APIs.
-
 LiteLLM Server supports:

-* Call [Huggingface/Bedrock/TogetherAI/etc.](#other-supported-models) in the OpenAI ChatCompletions format
-* Set custom prompt templates + model-specific configs (temperature, max_tokens, etc.)
-* Caching (In-memory + Redis)
-
-[**See Code**](https://github.com/BerriAI/litellm/tree/main/litellm_server)
-
-:::info
-We want to learn how we can make the server better! Meet the [founders](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version) or
-join our [discord](https://discord.gg/wuPM9dRgDw)
-::: 
+* Call 100+ LLMs [Huggingface/Bedrock/TogetherAI/etc.](#other-supported-models) in the OpenAI `ChatCompletions` & `Completions` format
+* Set custom prompt templates + model-specific configs (`temperature`, `max_tokens`, etc.)
+* Cache responses

 ## Quick Start 

@@ -347,162 +338,6 @@ $ cd ./litellm/litellm_server
 $ uvicorn main:app --host 0.0.0.0 --port 8000
 ```

-## Setting LLM API keys
-This server supports two ways of passing API keys to litellm:
-- Environment variables: by default, the server assumes the LLM API keys are stored in environment variables
-- Dynamic variables passed to `/chat/completions`
-  - Set `AUTH_STRATEGY=DYNAMIC` in the Environment 
-  - Pass required auth params `api_key`,`api_base`, `api_version` with the request params
-
-
-<Tabs>
-<TabItem value="gcp-run" label="Google Cloud Run">
-
-#### Deploy on Google Cloud Run
-**Click the button** to deploy to Google Cloud Run
-
-[![Deploy](https://deploy.cloud.run/button.svg)](https://l.linklyhq.com/l/1uHtX)
-
-On a successful deploy, your Cloud Run Shell will show this output
-<Image img={require('../img/cloud_run0.png')} />
-
-### Testing your deployed server
-**Assuming the required keys are set as Environment Variables**
-
-https://litellm-7yjrj3ha2q-uc.a.run.app is our example server; substitute it with your deployed Cloud Run app
-
-<Tabs>
-<TabItem value="openai" label="OpenAI">
-
-```shell
-curl https://litellm-7yjrj3ha2q-uc.a.run.app/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-     "model": "gpt-3.5-turbo",
-     "messages": [{"role": "user", "content": "Say this is a test!"}],
-     "temperature": 0.7
-   }'
-```
-
-</TabItem>
-<TabItem value="azure" label="Azure">
-
-```shell
-curl https://litellm-7yjrj3ha2q-uc.a.run.app/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-     "model": "azure/<your-deployment-name>",
-     "messages": [{"role": "user", "content": "Say this is a test!"}],
-     "temperature": 0.7
-   }'
-```
-
-</TabItem>
-
-<TabItem value="anthropic" label="Anthropic">
-
-```shell
-curl https://litellm-7yjrj3ha2q-uc.a.run.app/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-     "model": "claude-2",
-     "messages": [{"role": "user", "content": "Say this is a test!"}],
-     "temperature": 0.7
-   }'
-```
-</TabItem>
-
-</Tabs>
-
-### Set LLM API Keys
-#### Environment Variables 
-More info [here](https://cloud.google.com/run/docs/configuring/services/environment-variables#console)
-
-1. In the Google Cloud console, go to Cloud Run: [Go to Cloud Run](https://console.cloud.google.com/run)
-
-2. Click on the **litellm** service
-<Image img={require('../img/cloud_run1.png')} />
-
-3. Click **Edit and Deploy New Revision**
-<Image img={require('../img/cloud_run2.png')} />
-
-4. Enter your Environment Variables
-Example `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`
-<Image img={require('../img/cloud_run3.png')} />
-
-</TabItem>
-<TabItem value="render" label="Render">
-
-#### Deploy on Render
-**Click the button** to deploy to Render
-
-[![Deploy](https://render.com/images/deploy-to-render-button.svg)](https://l.linklyhq.com/l/1uHsr)
-
-On a successful deploy, https://dashboard.render.com/ should display the following
-<Image img={require('../img/render1.png')} />
-
-<Image img={require('../img/render2.png')} />
-</TabItem>
-<TabItem value="aws-apprunner" label="AWS Apprunner">
-
-#### Deploy on AWS Apprunner
-1. Fork LiteLLM https://github.com/BerriAI/litellm 
-2. Navigate to App Runner on the AWS Console: https://console.aws.amazon.com/apprunner/home#/services
-3. Follow the steps in the video below
-<iframe width="800" height="450" src="https://www.loom.com/embed/5fccced4dde8461a8caeee97addb2231?sid=eac60660-073e-455e-a737-b3d05a5a756a" frameborder="0" webkitallowfullscreen mozallowfullscreen allowfullscreen></iframe>
-
-4. Testing your deployed endpoint
-
-  **Assuming the required keys are set as Environment Variables** Example: `OPENAI_API_KEY`
-
-  https://b2w6emmkzp.us-east-1.awsapprunner.com is our example server; substitute it with your deployed App Runner endpoint
-
-  <Tabs>
-  <TabItem value="openai" label="OpenAI">
-
-  ```shell
-  curl https://b2w6emmkzp.us-east-1.awsapprunner.com/v1/chat/completions \
-    -H "Content-Type: application/json" \
-    -d '{
-      "model": "gpt-3.5-turbo",
-      "messages": [{"role": "user", "content": "Say this is a test!"}],
-      "temperature": 0.7
-    }'
-  ```
-
-  </TabItem>
-  <TabItem value="azure" label="Azure">
-
-  ```shell
-  curl https://b2w6emmkzp.us-east-1.awsapprunner.com/v1/chat/completions \
-    -H "Content-Type: application/json" \
-    -d '{
-      "model": "azure/<your-deployment-name>",
-      "messages": [{"role": "user", "content": "Say this is a test!"}],
-      "temperature": 0.7
-    }'
-  ```
-
-  </TabItem>
-
-  <TabItem value="anthropic" label="Anthropic">
-
-  ```shell
-  curl https://b2w6emmkzp.us-east-1.awsapprunner.com/v1/chat/completions \
-    -H "Content-Type: application/json" \
-    -d '{
-      "model": "claude-2",
-      "messages": [{"role": "user", "content": "Say this is a test!"}],
-      "temperature": 0.7
-    }'
-  ```
-  </TabItem>
-
-  </Tabs>
-
-</TabItem>
-</Tabs>
-
 ## Advanced
 ### Caching - Completion() and Embedding() Responses