forked from phoenix/litellm-mirror
Merge branch 'main' into litellm_fwd_vtx_sdk_headers

commit 010f526226 - 5 changed files with 366 additions and 8 deletions
@@ -399,6 +399,8 @@ The proxy provides:

### 📖 Proxy Endpoints - [Swagger Docs](https://litellm-api.up.railway.app/)

For a complete tutorial with keys + rate limits, go [**here**](./proxy/docker_quick_start.md)

### Quick Start Proxy - CLI

```shell
```
@@ -433,4 +435,5 @@ print(response)

- [exception mapping](./exception_mapping.md)
- [retries + model fallbacks for completion()](./completion/reliable_completions.md)
- [proxy virtual keys & spend management](./tutorials/fallbacks.md)
- [proxy virtual keys & spend management](./proxy/virtual_keys.md)
- [E2E Tutorial for LiteLLM Proxy Server](./proxy/docker_quick_start.md)
355 docs/my-website/docs/proxy/docker_quick_start.md (new file)

@@ -0,0 +1,355 @@
# Getting Started - E2E Tutorial

End-to-End tutorial for LiteLLM Proxy to:

- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set RPM limit on virtual key

## Pre-Requisites

- Install LiteLLM Docker Image

```shell
docker pull ghcr.io/berriai/litellm:main-latest
```

[**See all docker images**](https://github.com/orgs/BerriAI/packages)
## 1. Add a model

Control LiteLLM Proxy with a config.yaml file.

Set up your config.yaml with your Azure model.

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
      api_version: "2024-07-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
```

---
### Model List Specification

- **`model_name`** (`str`) - The model name clients send in requests to the proxy; incoming requests are matched against this field.
- **`litellm_params`** (`dict`) [See All LiteLLM Params](https://github.com/BerriAI/litellm/blob/559a6ad826b5daef41565f54f06c739c8c068b28/litellm/types/router.py#L222)
  - **`model`** (`str`) - Specifies the model name to be sent to `litellm.acompletion` / `litellm.aembedding`, etc. This is the identifier used by LiteLLM to route to the correct model + provider logic on the backend (see the sketch after this list).
  - **`api_key`** (`str`) - The API key required for authentication. It can be retrieved from an environment variable using `os.environ/`.
  - **`api_base`** (`str`) - The API base for your Azure deployment.
  - **`api_version`** (`str`) - The API version to use when calling Azure's OpenAI API. Get the latest Inference API version [here](https://learn.microsoft.com/en-us/azure/ai-services/openai/api-version-deprecation?source=recommendations#latest-preview-api-releases).
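
To make the mapping concrete, here is a rough sketch of what the proxy does with these `litellm_params` when a request for `gpt-3.5-turbo` arrives - illustrative only, calling the LiteLLM Python SDK directly rather than showing the proxy's actual internals:

```python
import os
import litellm

# Sketch: the proxy resolves the incoming model_name "gpt-3.5-turbo" to this
# deployment's litellm_params, then calls the SDK with them (roughly) like so:
response = litellm.completion(
    model="azure/my_azure_deployment",      # litellm_params.model
    api_base=os.environ["AZURE_API_BASE"],  # resolved from os.environ/AZURE_API_BASE
    api_key=os.environ["AZURE_API_KEY"],    # resolved from os.environ/AZURE_API_KEY
    api_version="2024-07-01-preview",
    messages=[{"role": "user", "content": "hi"}],
)
print(response.choices[0].message.content)
```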

### Useful Links

- [**All Supported LLM API Providers (OpenAI/Bedrock/Vertex/etc.)**](../providers/)
- [**Full Config.Yaml Spec**](./configs.md)
- [**Pass provider-specific params**](../completion/provider_specific_params.md#proxy-usage)
## 2. Make a successful /chat/completion call

LiteLLM Proxy is 100% OpenAI-compatible. Test your Azure model via the `/chat/completions` route.

### 2.1 Start Proxy

Save your config.yaml from step 1 as `litellm_config.yaml`.

```bash
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e AZURE_API_KEY=d6*********** \
    -e AZURE_API_BASE=https://openai-***********/ \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug

# RUNNING on http://0.0.0.0:4000
```

Confirm your config.yaml was mounted correctly by checking the startup logs:

```bash
Loaded config YAML (api_key and environment_variables are not shown):
{
    "model_list": [
        {
            "model_name ...
```
### 2.2 Make Call

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful math tutor. Guide the user through the solution step by step."
      },
      {
        "role": "user",
        "content": "how can I solve 8x + 7 = -23"
      }
    ]
}'
```

**Expected Response**

```bash
{
    "id": "chatcmpl-2076f062-3095-4052-a520-7c321c115c68",
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "content": "I am gpt-3.5-turbo",
                "role": "assistant",
                "tool_calls": null,
                "function_call": null
            }
        }
    ],
    "created": 1724962831,
    "model": "gpt-3.5-turbo",
    "object": "chat.completion",
    "system_fingerprint": null,
    "usage": {
        "completion_tokens": 20,
        "prompt_tokens": 10,
        "total_tokens": 30
    }
}
```
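
Since the proxy is OpenAI-compatible, the same call also works through the OpenAI Python SDK by pointing `base_url` at the proxy - a minimal sketch, assuming `pip install openai` and the proxy running on port 4000:

```python
from openai import OpenAI

# Point the OpenAI SDK at the LiteLLM Proxy instead of api.openai.com
client = OpenAI(
    api_key="sk-1234",               # proxy master key (or a virtual key)
    base_url="http://0.0.0.0:4000",  # LiteLLM Proxy address
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # the model_name from config.yaml
    messages=[
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"},
    ],
)
print(response.choices[0].message.content)
```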

### Useful Links

- [All Supported LLM API Providers (OpenAI/Bedrock/Vertex/etc.)](../providers/)
- [Call LiteLLM Proxy via OpenAI SDK, Langchain, etc.](./user_keys.md#request-format)
- [All API Endpoints Swagger](https://litellm-api.up.railway.app/#/chat%2Fcompletions)
- [Other/Non-Chat Completion Endpoints](../embedding/supported_embedding.md)
- [Pass-through for VertexAI, Bedrock, etc.](../pass_through/vertex_ai.md)
## 3. Generate a virtual key

Track spend and control model access via virtual keys for the proxy.

### 3.1 Set up a Database

**Requirements**

- A Postgres database (e.g. [Supabase](https://supabase.com/), [Neon](https://neon.tech/), etc.)

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
      api_version: "2024-07-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default

general_settings:
  master_key: sk-1234
  database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>" # 👈 KEY CHANGE
```

Save this config.yaml as `litellm_config.yaml` (it is used in step 3.2).

---

**What is `general_settings`?**

These are settings for the LiteLLM Proxy Server.

See all general settings [here](./configs.md#all-settings).

1. **`master_key`** (`str`)
    - **Description**:
        - Set a `master key`. This is your Proxy Admin key - you can use it to create other keys (🚨 must start with `sk-`).
    - **Usage**:
        - **Set on config.yaml**: set your master key under `general_settings:master_key`, e.g. `master_key: sk-1234`
        - **Set env variable**: set `LITELLM_MASTER_KEY`

2. **`database_url`** (`str`)
    - **Description**:
        - Set a `database_url`. This is the connection to your Postgres DB, which is used by LiteLLM for generating keys, users, and teams.
    - **Usage**:
        - **Set on config.yaml**: set your database url under `general_settings:database_url`, e.g. `database_url: "postgresql://..."`
        - **Set env variable**: set `DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname>` in your env
### 3.2 Start Proxy

```bash
docker run \
    -v $(pwd)/litellm_config.yaml:/app/config.yaml \
    -e AZURE_API_KEY=d6*********** \
    -e AZURE_API_BASE=https://openai-***********/ \
    -p 4000:4000 \
    ghcr.io/berriai/litellm:main-latest \
    --config /app/config.yaml --detailed_debug
```
### 3.3 Create Key w/ RPM Limit

Create a key with `rpm_limit: 1`. This will only allow 1 request per minute for calls to the proxy with this key.

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "rpm_limit": 1
}'
```

[**See full API Spec**](https://litellm-api.up.railway.app/#/key%20management/generate_key_fn_key_generate_post)

**Expected Response**

```bash
{
    "key": "sk-12..."
}
```
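
If you prefer to script this, here is a sketch of the same `/key/generate` call from Python, assuming the `requests` package is installed:

```python
import requests

# Generate a virtual key with a 1-request-per-minute limit
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # master key from general_settings
    json={"rpm_limit": 1},
)
resp.raise_for_status()
virtual_key = resp.json()["key"]  # e.g. "sk-12..."
print(virtual_key)
```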

### 3.4 Test it!

**Use your virtual key from step 3.3**

1st call - Expect to work!

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-12...' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful math tutor. Guide the user through the solution step by step."
      },
      {
        "role": "user",
        "content": "how can I solve 8x + 7 = -23"
      }
    ]
}'
```

**Expected Response**

```bash
{
    "id": "chatcmpl-2076f062-3095-4052-a520-7c321c115c68",
    "choices": [
        ...
}
```

2nd call - Expect to fail!

**Why did this call fail?**

We set the virtual key's requests per minute (RPM) limit to 1. The second call within the same minute crosses that limit.

```bash
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-12...' \
-d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful math tutor. Guide the user through the solution step by step."
      },
      {
        "role": "user",
        "content": "how can I solve 8x + 7 = -23"
      }
    ]
}'
```

**Expected Response**

```bash
{
    "error": {
        "message": "Max parallel request limit reached. Hit limit for api_key: daa1b272072a4c6841470a488c5dad0f298ff506e1cc935f4a181eed90c182ad. tpm_limit: 100, current_tpm: 29, rpm_limit: 1, current_rpm: 2.",
        "type": "None",
        "param": "None",
        "code": "429"
    }
}
```
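
To reproduce both calls programmatically, here is a sketch using the OpenAI Python SDK - the second request inside the same minute should raise a `RateLimitError` (HTTP 429):

```python
from openai import OpenAI, RateLimitError

# Use the virtual key from step 3.3
client = OpenAI(api_key="sk-12...", base_url="http://0.0.0.0:4000")

for attempt in (1, 2):
    try:
        client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "how can I solve 8x + 7 = -23"}],
        )
        print(f"call {attempt}: ok")
    except RateLimitError as err:
        # rpm_limit=1 -> the 2nd call within the minute is rejected with a 429
        print(f"call {attempt}: rate limited -> {err}")
```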

### Useful Links

- [Creating Virtual Keys](./virtual_keys.md)
- [Key Management API Endpoints Swagger](https://litellm-api.up.railway.app/#/key%20management)
- [Set Budgets / Rate Limits per key/user/teams](./users.md)
- [Dynamic TPM/RPM Limits for keys](./team_budgets.md#dynamic-tpmrpm-allocation)
## Troubleshooting

### Non-root docker image?

If you need to run the docker image as a non-root user, use [this](https://github.com/BerriAI/litellm/pkgs/container/litellm-non_root).

### SSL Verification Issue

If you see

```bash
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1006)
```

you can disable SSL verification with:

```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/my_azure_deployment
      api_base: os.environ/AZURE_API_BASE
      api_key: "os.environ/AZURE_API_KEY"
      api_version: "2024-07-01-preview"

litellm_settings:
  ssl_verify: false # 👈 KEY CHANGE
```

**What is `litellm_settings`?**

LiteLLM Proxy uses the [LiteLLM Python SDK](https://docs.litellm.ai/docs/routing) for handling LLM API calls.

`litellm_settings` are module-level params for the LiteLLM Python SDK (equivalent to doing `litellm.<some_param>` on the SDK). You can see all params [here](https://github.com/BerriAI/litellm/blob/208fe6cb90937f73e0def5c97ccb2359bf8a467b/litellm/__init__.py#L114).
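
For intuition, the yaml above maps to a module-level assignment on the SDK - a sketch, assuming the `litellm` package is installed:

```python
import litellm

# Equivalent of `litellm_settings: ssl_verify: false` in config.yaml:
# a module-level param set directly on the LiteLLM Python SDK
litellm.ssl_verify = False
```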

## Support & Talk with founders

- [Schedule Demo 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

[WhatsApp](https://wa.link/huol9n) [Discord](https://discord.gg/wuPM9dRgDw)
@@ -283,7 +283,6 @@ litellm_settings:

**Covers all errors (429, 500, etc.)**

**Set via config**

```yaml
model_list:
```
@@ -29,6 +29,7 @@ const sidebars = {
      },
      items: [
        "proxy/quick_start",
        "proxy/docker_quick_start",
        "proxy/deploy",
        "proxy/prod",
        {
@@ -1,9 +1,9 @@
model_list:
  - model_name: fake-openai-endpoint
  - model_name: my-fake-openai-endpoint
    litellm_params:
      model: openai/my-fake-model
      api_key: my-fake-key
      api_base: https://exampleopenaiendpoint-production.up.railway.app/
      model: gpt-3.5-turbo
      api_key: "my-fake-key"
      mock_response: "hello-world"

litellm_settings:
  ssl_verify: false