Update README.md

2023-12-25 06:09:55 +05:30 · 0e08a0082b
commit 0e08a0082b
parent 1262d89ab3
1 changed file with 26 additions and 29 deletions
--- a/README.md
+++ b/README.md
@@ -30,6 +30,32 @@ LiteLLM manages
 - Exception mapping - common exceptions across providers are mapped to the OpenAI exception types.
 - Load-balance across multiple deployments (e.g. Azure/OpenAI) - `Router` **1k+ requests/second**

+# OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))
+
+Track spend across multiple projects/people. 
+
+### Step 1: Start litellm proxy
+```shell
+$ litellm --model huggingface/bigcode/starcoder
+
+#INFO: Proxy running on http://0.0.0.0:8000
+```
+
+### Step 2: Replace openai base
+```python
+import openai # openai v1.0.0+
+client = openai.OpenAI(api_key="anything",base_url="http://0.0.0.0:8000") # set proxy to base_url
+# request sent to model set on litellm proxy, `litellm --model`
+response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
+    {
+        "role": "user",
+        "content": "this is a test request, write a short poem"
+    }
+])
+
+print(response)
+```
+
 # Usage ([**Docs**](https://docs.litellm.ai/docs/))

 > [!IMPORTANT]
@@ -93,35 +119,6 @@ for part in response:
    print(part.choices[0].delta.content or "")
 ```

-## OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))
-
-LiteLLM Proxy manages:
-* Calling 100+ LLMs Huggingface/Bedrock/TogetherAI/etc. in the OpenAI ChatCompletions & Completions format
-* Load balancing - between Multiple Models + Deployments of the same model LiteLLM proxy can handle 1k+ requests/second during load tests
-* Authentication & Spend Tracking Virtual Keys
-
-### Step 1: Start litellm proxy
-```shell
-$ litellm --model huggingface/bigcode/starcoder
-
-#INFO: Proxy running on http://0.0.0.0:8000
-```
-
-### Step 2: Replace openai base
-```python
-import openai # openai v1.0.0+
-client = openai.OpenAI(api_key="anything",base_url="http://0.0.0.0:8000") # set proxy to base_url
-# request sent to model set on litellm proxy, `litellm --model`
-response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
-    {
-        "role": "user",
-        "content": "this is a test request, write a short poem"
-    }
-])
-
-print(response)
-```
-
 ## Logging Observability ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
 LiteLLM exposes pre defined callbacks to send data to Langfuse, LLMonitor, Helicone, Promptlayer, Traceloop, Slack
 ```python