forked from phoenix/litellm-mirror
Update README.md
parent 1262d89ab3
commit 0e08a0082b

1 changed file with 26 additions and 29 deletions
README.md

@@ -30,6 +30,32 @@ LiteLLM manages
- Exception mapping - common exceptions across providers are mapped to the OpenAI exception types.
- Load-balance across multiple deployments (e.g. Azure/OpenAI) - `Router` **1k+ requests/second**
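
Both bullets above can be exercised from Python. Below is a minimal sketch of the `Router` spreading traffic across two deployments of the same model, with a single provider-agnostic `except` clause to illustrate exception mapping. All deployment names, keys, and endpoints are placeholders, and whether litellm's mapped errors are caught exactly this way can vary by version; treat this as a sketch, not the project's canonical example.

```python
from openai import RateLimitError  # litellm maps provider errors to OpenAI exception types
from litellm import Router

model_list = [
    {   # an Azure deployment of gpt-3.5-turbo (placeholder credentials)
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "azure/my-gpt35-deployment",
            "api_key": "azure-key",
            "api_base": "https://my-endpoint.openai.azure.com",
        },
    },
    {   # the same model served directly by OpenAI (placeholder key)
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "openai-key"},
    },
]

router = Router(model_list=model_list)  # load-balances requests across the deployments

try:
    response = router.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hello"}],
    )
    print(response)
except RateLimitError:
    # the same except clause applies regardless of which underlying provider rate-limited us
    print("rate limited, back off and retry")
```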

# OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))

Track spend across multiple projects/people.
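
Spend tracking on the proxy is typically done by minting a virtual key per project or person and letting the proxy attribute usage to that key. A rough sketch against the proxy's key-generation endpoint follows; the `/key/generate` path, the master key, and the request fields are assumptions based on the proxy docs rather than this README.

```python
import requests

PROXY_URL = "http://0.0.0.0:8000"
MASTER_KEY = "sk-1234"  # placeholder; assumes the proxy was started with a master key configured

# mint a virtual key for one project/person; the proxy tracks spend per key
resp = requests.post(
    f"{PROXY_URL}/key/generate",
    headers={"Authorization": f"Bearer {MASTER_KEY}"},
    json={"models": ["gpt-3.5-turbo"], "duration": "30d"},
)
print(resp.json())  # contains the generated virtual key
```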

### Step 1: Start litellm proxy

```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```

### Step 2: Replace openai base

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")  # set proxy to base_url

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)
```
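
Because the proxy speaks the standard OpenAI API, streaming also works through the same client with no proxy-specific code. A minimal sketch, with an arbitrary prompt:

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# stream tokens back through the proxy as they arrive
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```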

# Usage ([**Docs**](https://docs.litellm.ai/docs/))

> [!IMPORTANT]
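
The body of the Usage section falls outside this hunk, but the basic call it documents is litellm's `completion()` entry point. A minimal sketch, with the model name and API key as placeholder assumptions:

```python
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder key

# the call shape stays the same for any supported provider; only the model string changes
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```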

@@ -93,35 +119,6 @@ for part in response:
    print(part.choices[0].delta.content or "")
```
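
Those lines are only the tail of the README's streaming example (the hunk starts mid-snippet). For context, a minimal litellm streaming call looks roughly like this; the model choice is arbitrary:

```python
from litellm import completion

# stream=True returns an iterator of chunks shaped like OpenAI's streaming deltas
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)
for part in response:
    print(part.choices[0].delta.content or "")
```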

## OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))

LiteLLM Proxy manages:

* Calling 100+ LLMs (Huggingface/Bedrock/TogetherAI/etc.) in the OpenAI ChatCompletions & Completions format
* Load balancing - across multiple models and deployments of the same model; the LiteLLM proxy can handle 1k+ requests/second during load tests
* Authentication & spend tracking via virtual keys

## Logging Observability ([Docs](https://docs.litellm.ai/docs/observability/callbacks))

LiteLLM exposes pre-defined callbacks to send data to Langfuse, LLMonitor, Helicone, Promptlayer, Traceloop, Slack
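
The code block that follows this sentence in the README is cut off in this diff view. As a placeholder, here is a minimal sketch of what such a callback configuration looks like, built on litellm's `success_callback` setting; the environment-variable names and model are assumptions, not taken from this diff.

```python
import os
import litellm
from litellm import completion

# placeholder credentials for the logging integrations (assumed env var names)
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-..."
os.environ["OPENAI_API_KEY"] = "sk-..."

# send a record of every successful call to these integrations
litellm.success_callback = ["langfuse", "llmonitor"]

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi - this call will be logged"}],
)
```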