fix root of docs page

Ishaan Jaff 2024-09-19 14:36:21 -07:00
parent 7e30bcc128
commit 1e7839377c


@@ -12,7 +12,41 @@ https://github.com/BerriAI/litellm
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - [Router](https://docs.litellm.ai/docs/routing)
- Track spend & set budgets per project [LiteLLM Proxy Server](https://docs.litellm.ai/docs/simple_proxy)
## How to use LiteLLM
You can use litellm through either:
1. [LiteLLM Proxy Server](#litellm-proxy-server-llm-gateway) - Server (LLM Gateway) to call 100+ LLMs, load balance, cost tracking across projects
2. [LiteLLM Python SDK](#basic-usage) - Python client to call 100+ LLMs, load balance, cost tracking
### **When to use LiteLLM Proxy Server (LLM Gateway)**
:::tip
Use LiteLLM Proxy Server if you want a **central service (LLM Gateway) to access multiple LLMs**
Typically used by Gen AI Enablement / ML Platform Teams
:::
- LiteLLM Proxy gives you a unified interface to access multiple LLMs (100+ LLMs)
- Track LLM usage and set up guardrails
- Customize Logging, Guardrails, Caching per project
### **When to use LiteLLM Python SDK**
:::tip
Use LiteLLM Python SDK if you want to use LiteLLM in your **Python code**
Typically used by developers building LLM projects
:::
- LiteLLM SDK gives you a unified interface to access multiple LLMs (100+ LLMs)
- Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - [Router](https://docs.litellm.ai/docs/routing)
## **LiteLLM Python SDK**
### Basic usage
<a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb"> <a target="_blank" href="https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Getting_Started.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
@@ -146,9 +180,9 @@ response = completion(
</Tabs>
### Streaming
Set `stream=True` in the `completion` args.
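A minimal streaming sketch (assumes `OPENAI_API_KEY` is set; the model name is illustrative):
```python
from litellm import completion

# stream=True yields response chunks instead of a single response object
response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Write a haiku about the sea"}],
    stream=True,
)
for chunk in response:
    # chunks mirror the OpenAI streaming format; content can be None on the final chunk
    print(chunk.choices[0].delta.content or "", end="")
```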
<Tabs>
<TabItem value="openai" label="OpenAI">
@@ -280,7 +314,7 @@ response = completion(
</Tabs>
### Exception handling
LiteLLM maps exceptions across all supported providers to the OpenAI exceptions. All our exceptions inherit from OpenAI's exception types, so any error handling you have for that should work out of the box with LiteLLM.
@@ -296,8 +330,7 @@ except OpenAIError as e:
print(e)
```
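A self-contained sketch of the pattern above (the deliberately bad key and the model name are illustrative; assumes the OpenAI SDK v1+ is installed):
```python
import os
from openai import OpenAIError  # LiteLLM exceptions inherit from the OpenAI SDK's error types
from litellm import completion

os.environ["ANTHROPIC_API_KEY"] = "bad-key"  # intentionally invalid, for illustration

try:
    completion(
        model="claude-instant-1",
        messages=[{"role": "user", "content": "Hey, how's it going?"}],
    )
except OpenAIError as e:
    # the provider's auth error surfaces as an OpenAI-style exception
    print(e)
```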
### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
LiteLLM exposes pre-defined callbacks to send data to Lunary, Langfuse, Helicone, Promptlayer, Traceloop, and Slack.
```python
@@ -312,14 +345,13 @@ os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["OPENAI_API_KEY"] os.environ["OPENAI_API_KEY"]
# set callbacks # set callbacks
litellm.success_callback = ["langfuse", "lunary"] # log input/output to lunary, langfuse, supabase litellm.success_callback = ["lunary", "langfuse", "helicone"] # log input/output to lunary, langfuse, supabase, helicone
#openai call
response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
```
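A self-contained sketch of the setup above (the key values are placeholders, and the Langfuse/Helicone environment variable names are assumed to follow those integrations' usual conventions):
```python
import os
import litellm
from litellm import completion

# placeholder keys for the logging integrations and the LLM provider
os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
os.environ["LANGFUSE_PUBLIC_KEY"] = "your-langfuse-public-key"   # assumed env var name
os.environ["LANGFUSE_SECRET_KEY"] = "your-langfuse-secret-key"   # assumed env var name
os.environ["HELICONE_API_KEY"] = "your-helicone-key"             # assumed env var name
os.environ["OPENAI_API_KEY"] = "your-openai-key"

# log every successful call's input/output to these integrations
litellm.success_callback = ["lunary", "langfuse", "helicone"]

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
)
```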
### Track Costs, Usage, Latency for streaming
Use a callback function for this - more info on custom callbacks: https://docs.litellm.ai/docs/observability/custom_callback
```python
@@ -352,7 +384,7 @@ response = completion(
)
```
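A rough sketch of such a callback (the function name is illustrative; the callback signature and the `response_cost` field follow the custom-callback docs linked above, so treat both as assumptions):
```python
import litellm
from litellm import completion

# illustrative cost-tracking callback; signature per the custom-callback docs linked above
def track_cost_callback(kwargs, completion_response, start_time, end_time):
    # LiteLLM reports the calculated cost of the call in kwargs (assumed field name)
    response_cost = kwargs.get("response_cost", 0)
    print("streaming response_cost", response_cost)

litellm.success_callback = [track_cost_callback]  # callables are also accepted as callbacks

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}],
    stream=True,
)
for chunk in response:
    pass  # consume the stream so the callback fires once the response completes
```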
## **LiteLLM Proxy Server (LLM Gateway)**
Track spend across multiple projects/people
@@ -367,6 +399,8 @@ The proxy provides:
### 📖 Proxy Endpoints - [Swagger Docs](https://litellm-api.up.railway.app/)
Go [**here**](./proxy/docker_quick_start.md) for a complete tutorial with keys + rate limits.
### Quick Start Proxy - CLI
```shell
@@ -378,14 +412,14 @@ pip install 'litellm[proxy]'
```shell
$ litellm --model huggingface/bigcode/starcoder
#INFO: Proxy running on http://0.0.0.0:4000
```
#### Step 2: Make ChatCompletions Request to Proxy
```python
import openai # openai v1.0.0+
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000") # set proxy to base_url
# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
    {
@@ -401,4 +435,5 @@ print(response)
- [exception mapping](./exception_mapping.md)
- [retries + model fallbacks for completion()](./completion/reliable_completions.md)
- [proxy virtual keys & spend management](./proxy/virtual_keys.md)
- [E2E Tutorial for LiteLLM Proxy Server](./proxy/docker_quick_start.md)