From 0e08a0082bccbccd6ad25f93e6fed82668ab4e78 Mon Sep 17 00:00:00 2001
From: Krish Dholakia
Date: Mon, 25 Dec 2023 06:09:55 +0530
Subject: [PATCH] Update README.md

---
 README.md | 55 ++++++++++++++++++++++++++-----------------------------
 1 file changed, 26 insertions(+), 29 deletions(-)

diff --git a/README.md b/README.md
index 758ed171c..33e4d2137 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,32 @@ LiteLLM manages
 - Exception mapping - common exceptions across providers are mapped to the OpenAI exception types.
 - Load-balance across multiple deployments (e.g. Azure/OpenAI) - `Router` **1k+ requests/second**
+# OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))
+
+Track spend across multiple projects/people.
+
+### Step 1: Start the litellm proxy
+```shell
+$ litellm --model huggingface/bigcode/starcoder
+
+#INFO: Proxy running on http://0.0.0.0:8000
+```
+
+### Step 2: Replace the openai base
+```python
+import openai  # openai v1.0.0+
+client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")  # point the client at the proxy
+# the request is routed to the model set on the litellm proxy via `litellm --model`
+response = client.chat.completions.create(model="gpt-3.5-turbo", messages=[
+    {
+        "role": "user",
+        "content": "this is a test request, write a short poem"
+    }
+])
+
+print(response)
+```
+
 
 # Usage ([**Docs**](https://docs.litellm.ai/docs/))
 
 > [!IMPORTANT]
@@ -93,35 +119,6 @@ for part in response:
     print(part.choices[0].delta.content or "")
 ```
 
-## OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))
-
-LiteLLM Proxy manages:
-* Calling 100+ LLMs Huggingface/Bedrock/TogetherAI/etc. in the OpenAI ChatCompletions & Completions format
-* Load balancing - between Multiple Models + Deployments of the same model LiteLLM proxy can handle 1k+ requests/second during load tests
-* Authentication & Spend Tracking Virtual Keys
-
-### Step 1: Start litellm proxy
-```shell
-$ litellm --model huggingface/bigcode/starcoder
-
-#INFO: Proxy running on http://0.0.0.0:8000
-```
-
-### Step 2: Replace openai base
-```python
-import openai # openai v1.0.0+
-client = openai.OpenAI(api_key="anything",base_url="http://0.0.0.0:8000") # set proxy to base_url
-# request sent to model set on litellm proxy, `litellm --model`
-response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
-    {
-        "role": "user",
-        "content": "this is a test request, write a short poem"
-    }
-])
-
-print(response)
-```
-
 ## Logging Observability ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
 LiteLLM exposes pre defined callbacks to send data to Langfuse, LLMonitor, Helicone, Promptlayer, Traceloop, Slack
 ```python
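
A note on the quickstart this patch moves above the fold: because the proxy exposes an OpenAI-compatible API, the Step 2 client also streams, mirroring the `for part in response` loop visible in the second hunk's context. A minimal sketch, assuming the proxy from Step 1 is still running on http://0.0.0.0:8000:

```python
import openai  # openai v1.0.0+

# Point the client at the LiteLLM proxy started in Step 1.
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# stream=True returns an iterator of chunks instead of a single response.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # routed to whatever `litellm --model` was started with
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
    stream=True,
)

for part in response:
    print(part.choices[0].delta.content or "", end="")
```

Since the proxy mimics the OpenAI chat completions endpoint, `stream=True` should pass through without any proxy-side changes.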
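The second hunk's trailing context opens the Logging Observability example that stays in the README. For reference, a minimal sketch of that callback setup, assuming litellm's `success_callback` list API and Langfuse's standard environment variables; the credential values are hypothetical placeholders:

```python
import os
import litellm
from litellm import completion

# Hypothetical placeholder credentials for the logging tool (Langfuse shown here).
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-..."
os.environ["OPENAI_API_KEY"] = "sk-..."

# Register a success callback: litellm forwards each call's input/output to the tool.
litellm.success_callback = ["langfuse"]

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
)
print(response)
```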