diff --git a/docs/my-website/src/pages/index.md b/docs/my-website/src/pages/index.md
index d074fd376..291c51ab0 100644
--- a/docs/my-website/src/pages/index.md
+++ b/docs/my-website/src/pages/index.md
@@ -12,7 +12,41 @@ https://github.com/BerriAI/litellm
 - Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - [Router](https://docs.litellm.ai/docs/routing)
 - Track spend & set budgets per project [LiteLLM Proxy Server](https://docs.litellm.ai/docs/simple_proxy)
 
-## Basic usage
+## How to use LiteLLM
+You can use LiteLLM through either:
+1. [LiteLLM Proxy Server](#litellm-proxy-server-llm-gateway) - Server (LLM Gateway) to call 100+ LLMs, with load balancing and cost tracking across projects
+2. [LiteLLM Python SDK](#basic-usage) - Python client to call 100+ LLMs, with load balancing and cost tracking
+
+### **When to use LiteLLM Proxy Server (LLM Gateway)**
+
+:::tip
+
+Use LiteLLM Proxy Server if you want a **central service (LLM Gateway) to access multiple LLMs**.
+
+Typically used by Gen AI Enablement / ML Platform teams.
+
+:::
+
+ - LiteLLM Proxy gives you a unified interface to access multiple LLMs (100+ LLMs)
+ - Track LLM usage and set up guardrails
+ - Customize logging, guardrails, and caching per project
+
+### **When to use LiteLLM Python SDK**
+
+:::tip
+
+Use LiteLLM Python SDK if you want to use LiteLLM in your **Python code**.
+
+Typically used by developers building LLM projects.
+
+:::
+
+ - LiteLLM SDK gives you a unified interface to access multiple LLMs (100+ LLMs)
+ - Retry/fallback logic across multiple deployments (e.g. Azure/OpenAI) - [Router](https://docs.litellm.ai/docs/routing)
+
+## **LiteLLM Python SDK**
+
+### Basic usage
 
 Open In Colab
 
@@ -146,9 +180,9 @@ response = completion(
 
 
 
-## Streaming
+### Streaming
+Set `stream=True` in the `completion` args.
 
-Set `stream=True` in the `completion` args.
 
 
 
@@ -280,7 +314,7 @@ response = completion(
 
 
 
-## Exception handling
+### Exception handling
 
 LiteLLM maps exceptions across all supported providers to the OpenAI exceptions. All our exceptions inherit from OpenAI's exception types, so any error-handling you have for that, should work out of the box with LiteLLM.
 
@@ -296,8 +330,7 @@ except OpenAIError as e:
     print(e)
 ```
 
-## Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
-
+### Logging Observability - Log LLM Input/Output ([Docs](https://docs.litellm.ai/docs/observability/callbacks))
 LiteLLM exposes pre defined callbacks to send data to Lunary, Langfuse, Helicone, Promptlayer, Traceloop, Slack
 
 ```python
@@ -312,14 +345,13 @@ os.environ["LUNARY_PUBLIC_KEY"] = "your-lunary-public-key"
 os.environ["OPENAI_API_KEY"]
 
 # set callbacks
-litellm.success_callback = ["langfuse", "lunary"] # log input/output to lunary, langfuse, supabase
+litellm.success_callback = ["lunary", "langfuse", "helicone"] # log input/output to lunary, langfuse, helicone
 
 #openai call
 response = completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hi 👋 - i'm openai"}])
 ```
 
-## Track Costs, Usage, Latency for streaming
-
+### Track Costs, Usage, Latency for streaming
 Use a callback function for this - more info on custom callbacks: https://docs.litellm.ai/docs/observability/custom_callback
 
 ```python
@@ -352,7 +384,7 @@ response = completion(
 )
 ```
 
-## OpenAI Proxy
+## **LiteLLM Proxy Server (LLM Gateway)**
 
 Track spend across multiple projects/people
 
@@ -367,6 +399,8 @@ The proxy provides:
 
 ### 📖 Proxy Endpoints - [Swagger Docs](https://litellm-api.up.railway.app/)
 
+Go [**here**](./proxy/docker_quick_start.md) for a complete tutorial with keys + rate limits.
+
 ### Quick Start Proxy - CLI
 
 ```shell
@@ -378,14 +412,14 @@ pip install 'litellm[proxy]'
 ```shell
 $ litellm --model huggingface/bigcode/starcoder
 
-#INFO: Proxy running on http://0.0.0.0:8000
+#INFO: Proxy running on http://0.0.0.0:4000
 ```
 
 #### Step 2: Make ChatCompletions Request to Proxy
 
 ```python
 import openai # openai v1.0.0+
-client = openai.OpenAI(api_key="anything",base_url="http://0.0.0.0:8000") # set proxy to base_url
+client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:4000") # set proxy to base_url
 # request sent to model set on litellm proxy, `litellm --model`
 response = client.chat.completions.create(model="gpt-3.5-turbo", messages = [
     {
@@ -401,4 +435,5 @@ print(response)
 
 - [exception mapping](./exception_mapping.md)
 - [retries + model fallbacks for completion()](./completion/reliable_completions.md)
-- [proxy virtual keys & spend management](./tutorials/fallbacks.md)
+- [proxy virtual keys & spend management](./proxy/virtual_keys.md)
+- [E2E Tutorial for LiteLLM Proxy Server](./proxy/docker_quick_start.md)
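
Note: the streaming code sample that the reorganized "### Streaming" heading and the line "Set `stream=True` in the `completion` args." point to sits in unchanged context and is elided from this diff. A minimal sketch of that pattern, assuming litellm's OpenAI-compatible streaming chunk shape (`chunk.choices[0].delta.content`) and a placeholder API key and model:

```python
# Minimal streaming sketch (not part of the diff above; assumes litellm returns
# OpenAI-compatible streaming chunks, i.e. chunk.choices[0].delta.content)
import os
from litellm import completion

os.environ["OPENAI_API_KEY"] = "your-openai-key"  # placeholder key

response = completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    stream=True,  # return an iterator of partial-response chunks
)

for chunk in response:
    # the final chunk's delta content can be None, hence the `or ""`
    print(chunk.choices[0].delta.content or "", end="")
```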