Update README.md
parent 1262d89ab3
commit 0e08a0082b
1 changed file with 26 additions and 29 deletions
README.md
@@ -30,6 +30,32 @@ LiteLLM manages
- Exception mapping - common exceptions across providers are mapped to the OpenAI exception types.
- Load-balance across multiple deployments (e.g. Azure/OpenAI) - `Router` handles **1k+ requests/second** (see the sketch after this list).
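To make these two points concrete, here is a minimal sketch, not taken from this README: the deployment names, keys, and endpoint are placeholders. It balances two deployments of the same model group with `Router` and catches failures through the mapped OpenAI exception types.

```python
import openai  # provider errors surface as OpenAI exception types
from litellm import Router

# two interchangeable deployments behind one model-group alias (all values are placeholders)
router = Router(model_list=[
    {
        "model_name": "gpt-3.5-turbo",  # the alias callers use
        "litellm_params": {
            "model": "azure/my-deployment",  # hypothetical Azure deployment
            "api_key": "azure-key",
            "api_base": "https://my-endpoint.openai.azure.com",
        },
    },
    {
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "openai-key"},
    },
])

try:
    # the Router picks a deployment for each request and can fail over
    response = router.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hello"}],
    )
    print(response.choices[0].message.content)
except openai.APIError as e:
    # whichever provider actually failed, the error arrives as an OpenAI type
    print(f"request failed: {e}")
```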
# OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))
Track spend across multiple projects/people.
### Step 1: Start litellm proxy
```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```

### Step 2: Replace openai base

```python
import openai # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000") # set proxy to base_url

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=[
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)
```
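Since the proxy speaks the OpenAI API, streaming works through the same client. A small sketch, reusing the `client` from the step above; `stream=True` is standard OpenAI v1 client behavior rather than anything specific to this README:

```python
# stream the reply chunk-by-chunk instead of waiting for the full response
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)

for part in stream:
    # each chunk carries a delta; it can be None on the final chunk
    print(part.choices[0].delta.content or "", end="")
```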
# Usage ([**Docs**](https://docs.litellm.ai/docs/))
> [!IMPORTANT]
@@ -93,35 +119,6 @@ for part in response:
    print(part.choices[0].delta.content or "")
```
## OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))
LiteLLM Proxy manages:
* Calling 100+ LLMs (Huggingface/Bedrock/TogetherAI/etc.) in the OpenAI ChatCompletions & Completions format
* Load balancing - across multiple models and deployments of the same model; the LiteLLM proxy can handle 1k+ requests/second during load tests
* Authentication & Spend Tracking via Virtual Keys (a key-generation sketch follows below)
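For the virtual-keys point, a hedged sketch of generating a per-project key through the proxy's `/key/generate` endpoint; the master key `sk-1234`, the model list, and the duration below are illustrative placeholders, and the exact fields are documented in the virtual-keys docs:

```python
import requests

# ask the proxy to mint a scoped key (assumes it was started with a master key)
resp = requests.post(
    "http://0.0.0.0:8000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder master key
    json={"models": ["gpt-3.5-turbo"], "duration": "30d"},
)

# the response contains the new virtual key to hand to a project/person;
# spend made with that key is tracked against it
print(resp.json())
```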
## Logging Observability ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
LiteLLM exposes pre-defined callbacks to send data to Langfuse, LLMonitor, Helicone, Promptlayer, Traceloop, and Slack:
```python
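# the diff is truncated at this point; what follows is a minimal sketch of the
# callback setup described above, with placeholder tool choices from that list
import litellm

# route success events to the chosen logging tools
litellm.success_callback = ["langfuse", "llmonitor"]
```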