forked from phoenix/litellm-mirror

fix root of docs page

parent 7e30bcc128 · commit 1e7839377c

1 changed file with 48 additions and 13 deletions
@@ -12,7 +12,41 @@ https://github.com/BerriAI/litellm
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - [Router](https://docs.litellm.ai/docs/routing)
- Track spend & set budgets per project [LiteLLM Proxy Server](https://docs.litellm.ai/docs/simple_proxy)

## How to use LiteLLM

You can use litellm through either:

1. [LiteLLM Proxy Server](#litellm-proxy-server-llm-gateway) - a server (LLM Gateway) to call 100+ LLMs, with load balancing and cost tracking across projects
2. [LiteLLM python SDK](#basic-usage) - a Python client to call 100+ LLMs, with load balancing and cost tracking

### **When to use LiteLLM Proxy Server (LLM Gateway)**

:::tip

Use LiteLLM Proxy Server if you want a **central service (LLM Gateway) to access multiple LLMs**.

Typically used by Gen AI Enablement / ML Platform teams.

:::

- LiteLLM Proxy gives you a unified interface to access multiple LLMs (100+ LLMs)
- Track LLM usage and set up guardrails
- Customize logging, guardrails, and caching per project

### **When to use LiteLLM Python SDK**

:::tip

Use LiteLLM Python SDK if you want to use LiteLLM in your **python code**.

Typically used by developers building LLM projects.

:::

- LiteLLM SDK gives you a unified interface to access multiple LLMs (100+ LLMs)
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - [Router](https://docs.litellm.ai/docs/routing) (see the sketch below)

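A minimal sketch of the Router pattern mentioned in the last bullet, assuming the `litellm.Router` API with placeholder Azure/OpenAI credentials (deployment names and environment variables here are illustrative, not from the original page):

```python
# Router sketch (illustrative): two deployments behind one model alias
import os
from litellm import Router

router = Router(
    model_list=[
        {   # Azure deployment (placeholder values)
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "azure/<your-azure-deployment>",
                "api_key": os.environ["AZURE_API_KEY"],
                "api_base": os.environ["AZURE_API_BASE"],
            },
        },
        {   # OpenAI fallback for the same alias
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "gpt-3.5-turbo",
                "api_key": os.environ["OPENAI_API_KEY"],
            },
        },
    ]
)

# The router load-balances across the deployments and retries/falls back on failure
response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello from the router"}],
)
print(response.choices[0].message.content)
```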
## **LiteLLM Python SDK**

### Basic usage

<a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
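The provider tabs with the full examples sit in the part of the file elided between this hunk and the next; as a minimal sketch of the basic call, assuming `completion` from the SDK and a placeholder OpenAI key:

```python
# Basic usage sketch (illustrative; replace the placeholder key with your own)
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "your-openai-key"  # placeholder

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```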
@@ -146,9 +180,9 @@ response = completion(
</Tabs>

### Streaming

Set `stream=True` in the `completion` args.

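A minimal sketch of a streaming call, assuming OpenAI-compatible chunks exposing `choices[0].delta.content` (the provider tabs below carry the documented variants):

```python
# Streaming sketch (illustrative): stream=True returns an iterator of chunks
from litellm import completion

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a haiku about gateways"}],
    stream=True,
)
for chunk in response:
    # Chunks follow the OpenAI streaming format; content may be None on some chunks
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")
```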
<Tabs>
<TabItem value="openai" label="OpenAI">

@@ -280,7 +314,7 @@ response = completion(
</Tabs>

### Exception handling

LiteLLM maps exceptions across all supported providers to the OpenAI exceptions. All our exceptions inherit from OpenAI's exception types, so any error handling you already have for those should work out of the box with LiteLLM.

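The full try/except example is elided between these hunks; as a minimal sketch along the same lines, assuming `OpenAIError` from the `openai` package and a deliberately invalid key:

```python
# Exception-handling sketch (illustrative): LiteLLM raises OpenAI-compatible exceptions
import os
from openai import OpenAIError
from litellm import completion

os.environ["OPENAI_API_KEY"] = "bad-key"  # placeholder, forces an auth error

try:
    completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}],
    )
except OpenAIError as e:
    # AuthenticationError, RateLimitError, etc. all inherit from OpenAIError
    print(e)
```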
@@ -296,8 +330,7 @@ except OpenAIError as e:
    print(e)
```

### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))

LiteLLM exposes predefined callbacks to send data to Lunary, Langfuse, Helicone, Promptlayer, Traceloop, and Slack.

```python
@@ -312,14 +345,13 @@ os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["OPENAI_API_KEY"] = "your-openai-key"  # placeholder: set your real key

# set callbacks
litellm.success_callback = ["lunary", "langfuse", "helicone"]  # log input/output to lunary, langfuse, helicone

# openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
```

### Track Costs, Usage, Latency for streaming

Use a callback function for this - more info on custom callbacks: https://docs.litellm.ai/docs/observability/custom_callback

```python
@@ -352,7 +384,7 @@ response = completion(
)
```

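The body of that example is elided between the hunks above; a minimal sketch of such a callback, assuming the `(kwargs, completion_response, start_time, end_time)` signature and the `response_cost` field described in the custom-callback docs linked above (start/end times assumed to be datetimes):

```python
# Custom callback sketch (illustrative): track cost, usage, and latency for streaming calls
import litellm
from litellm import completion

def track_cost_callback(kwargs, completion_response, start_time, end_time):
    # kwargs carries call metadata; response_cost is populated by LiteLLM on success
    cost = kwargs.get("response_cost", 0)
    latency = (end_time - start_time).total_seconds()
    print(f"model={kwargs.get('model')} cost={cost} latency={latency}s")

litellm.success_callback = [track_cost_callback]

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋"}],
    stream=True,
)
for chunk in response:
    pass  # consume the stream; the callback fires once it completes
```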
## **LiteLLM Proxy Server (LLM Gateway)**

Track spend across multiple projects/people.

@@ -367,6 +399,8 @@ The proxy provides:
### 📖 Proxy Endpoints - [Swagger Docs](https://litellm-api.up.railway.app/)

Go [**here**](./proxy/docker_quick_start.md) for a complete tutorial with keys + rate limits.

### Quick Start Proxy - CLI

```shell
@@ -378,14 +412,14 @@ pip install 'litellm[proxy]'
```shell
$ litellm --model huggingface/bigcode/starcoder

#INFO: Proxy running on http://0.0.0.0:4000
```

#### Step 2: Make ChatCompletions Request to Proxy

```python
import openai  # openai v1.0.0+
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")  # set proxy to base_url
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=[
    {
@@ -401,4 +435,5 @@ print(response)
- [exception mapping](./exception_mapping.md)
- [retries + model fallbacks for completion()](./completion/reliable_completions.md)
- [proxy virtual keys & spend management](./proxy/virtual_keys.md)
- [E2E Tutorial for LiteLLM Proxy Server](./proxy/docker_quick_start.md)