forked from phoenix/litellm-mirror
Update README.md
parent 1262d89ab3
commit 0e08a0082b

1 changed file with 26 additions and 29 deletions
README.md

@@ -30,6 +30,32 @@ LiteLLM manages
- Exception mapping - common exceptions across providers are mapped to the OpenAI exception types.
- Load-balance across multiple deployments (e.g. Azure/OpenAI) - `Router` **1k+ requests/second**
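
Both bullets above can be exercised from Python. Below is a minimal sketch of the `Router` spreading traffic across two deployments of the same model, with a single provider-agnostic `except` clause to illustrate exception mapping. All deployment names, keys, and endpoints are placeholders, and whether litellm's mapped errors are caught exactly this way can vary by version; treat this as a sketch, not the project's canonical example.

```python
from openai import RateLimitError  # litellm maps provider errors to OpenAI exception types
from litellm import Router

model_list = [
    {   # an Azure deployment of gpt-3.5-turbo (placeholder credentials)
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {
            "model": "azure/my-gpt35-deployment",
            "api_key": "azure-key",
            "api_base": "https://my-endpoint.openai.azure.com",
        },
    },
    {   # the same model served directly by OpenAI (placeholder key)
        "model_name": "gpt-3.5-turbo",
        "litellm_params": {"model": "gpt-3.5-turbo", "api_key": "openai-key"},
    },
]

router = Router(model_list=model_list)  # load-balances requests across the deployments

try:
    response = router.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hello"}],
    )
    print(response)
except RateLimitError:
    # the same except clause applies regardless of which underlying provider rate-limited us
    print("rate limited, back off and retry")
```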

# OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))

Track spend across multiple projects/people.
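
Spend tracking on the proxy is typically done by minting a virtual key per project or person and letting the proxy attribute usage to that key. A rough sketch against the proxy's key-generation endpoint follows; the `/key/generate` path, the master key, and the request fields are assumptions based on the proxy docs rather than this README.

```python
import requests

PROXY_URL = "http://0.0.0.0:8000"
MASTER_KEY = "sk-1234"  # placeholder; assumes the proxy was started with a master key configured

# mint a virtual key for one project/person; the proxy tracks spend per key
resp = requests.post(
    f"{PROXY_URL}/key/generate",
    headers={"Authorization": f"Bearer {MASTER_KEY}"},
    json={"models": ["gpt-3.5-turbo"], "duration": "30d"},
)
print(resp.json())  # contains the generated virtual key
```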

### Step 1: Start litellm proxy

```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:8000
```

### Step 2: Replace openai base

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")  # set proxy to base_url

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "user",
            "content": "this is a test request, write a short poem"
        }
    ]
)

print(response)
```
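
Because the proxy speaks the standard OpenAI API, streaming also works through the same client with no proxy-specific code. A minimal sketch, with an arbitrary prompt:

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# stream tokens back through the proxy as they arrive
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```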

# Usage ([**Docs**](https://docs.litellm.ai/docs/))

> [!IMPORTANT]
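
The body of the Usage section falls outside this hunk, but the basic call it documents is litellm's `completion()` entry point. A minimal sketch, with the model name and API key as placeholder assumptions:

```python
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "sk-..."  # placeholder key

# the call shape stays the same for any supported provider; only the model string changes
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```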

@@ -93,35 +119,6 @@ for part in response:
    print(part.choices[0].delta.content or "")
```
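
Those lines are only the tail of the README's streaming example (the hunk starts mid-snippet). For context, a minimal litellm streaming call looks roughly like this; the model choice is arbitrary:

```python
from litellm import completion

# stream=True returns an iterator of chunks shaped like OpenAI's streaming deltas
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "write a short poem"}],
    stream=True,
)
for part in response:
    print(part.choices[0].delta.content or "")
```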

## OpenAI Proxy - ([Docs](https://docs.litellm.ai/docs/simple_proxy))

LiteLLM Proxy manages:

* Calling 100+ LLMs (Huggingface/Bedrock/TogetherAI/etc.) in the OpenAI ChatCompletions & Completions format
* Load balancing - across multiple models and deployments of the same model; the LiteLLM proxy can handle 1k+ requests/second during load tests
* Authentication & spend tracking via virtual keys

## Logging Observability ([Docs](https://docs.litellm.ai/docs/observability/callbacks))

LiteLLM exposes pre-defined callbacks to send data to Langfuse, LLMonitor, Helicone, Promptlayer, Traceloop, Slack
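
The code block that follows this sentence in the README is cut off in this diff view. As a placeholder, here is a minimal sketch of what such a callback configuration looks like, built on litellm's `success_callback` setting; the environment-variable names and model are assumptions, not taken from this diff.

```python
import os
import litellm
from litellm import completion

# placeholder credentials for the logging integrations (assumed env var names)
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-..."
os.environ["LANGFUSE_SECRET_KEY"] = "sk-..."
os.environ["OPENAI_API_KEY"] = "sk-..."

# send a record of every successful call to these integrations
litellm.success_callback = ["langfuse", "llmonitor"]

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi - this call will be logged"}],
)
```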