diff --git a/docs/my-website/docs/tutorials/ab_test_llms.md b/docs/my-website/docs/tutorials/ab_test_llms.md
deleted file mode 100644
index b08e91352..000000000
--- a/docs/my-website/docs/tutorials/ab_test_llms.md
+++ /dev/null
@@ -1,98 +0,0 @@
-import Image from '@theme/IdealImage';
-
-# Split traffic betwen GPT-4 and Llama2 in Production!
-In this tutorial, we'll walk through A/B testing between GPT-4 and Llama2 in production. We'll assume you've deployed Llama2 on Huggingface Inference Endpoints (but any of TogetherAI, Baseten, Ollama, Petals, Openrouter should work as well).
-
-
-# Relevant Resources:
-
-* 🚀 [Your production dashboard!](https://admin.litellm.ai/)
-
-
-* [Deploying models on Huggingface](https://huggingface.co/docs/inference-endpoints/guides/create_endpoint)
-* [All supported providers on LiteLLM](https://docs.litellm.ai/docs/providers)
-
-# Code Walkthrough
-
-In production, we don't know if Llama2 is going to provide:
-* good results
-* quickly
-
-### 💡 Route 20% traffic to Llama2
-If Llama2 returns poor answers / is extremely slow, we want to roll-back this change, and use GPT-4 instead.
-
-Instead of routing 100% of our traffic to Llama2, let's **start by routing 20% traffic** to it and see how it does.
-
-```python
-## route 20% of responses to Llama2
-split_per_model = {
-    "gpt-4": 0.8,
-    "huggingface/https://my-unique-endpoint.us-east-1.aws.endpoints.huggingface.cloud": 0.2
-}
-```
-
-## 👨‍💻 Complete Code
-
-### a) For Local
-If we're testing this in a script - this is what our complete code looks like.
-```python
-from litellm import completion_with_split_tests
-import os
-
-## set ENV variables
-os.environ["OPENAI_API_KEY"] = "openai key"
-os.environ["HUGGINGFACE_API_KEY"] = "huggingface key"
-
-## route 20% of responses to Llama2
-split_per_model = {
-    "gpt-4": 0.8,
-    "huggingface/https://my-unique-endpoint.us-east-1.aws.endpoints.huggingface.cloud": 0.2
-}
-
-messages = [{ "content": "Hello, how are you?","role": "user"}]
-
-completion_with_split_tests(
-    models=split_per_model,
-    messages=messages,
-)
-```
-
-### b) For Production
-
-If we're in production, we don't want to keep going to code to change model/test details (prompt, split%, etc.) for our completion function and redeploying changes.
-
-LiteLLM exposes a client dashboard to do this in a UI - and instantly updates our completion function in prod.
-
-#### Relevant Code
-
-```python
-completion_with_split_tests(..., use_client=True, id="my-unique-id")
-```
-
-#### Complete Code
-
-```python
-from litellm import completion_with_split_tests
-import os
-
-## set ENV variables
-os.environ["OPENAI_API_KEY"] = "openai key"
-os.environ["HUGGINGFACE_API_KEY"] = "huggingface key"
-
-## route 20% of responses to Llama2
-split_per_model = {
-    "gpt-4": 0.8,
-    "huggingface/https://my-unique-endpoint.us-east-1.aws.endpoints.huggingface.cloud": 0.2
-}
-
-messages = [{ "content": "Hello, how are you?","role": "user"}]
-
-completion_with_split_tests(
-    models=split_per_model,
-    messages=messages,
-    use_client=True,
-    id="my-unique-id" # Auto-create this @ https://admin.litellm.ai/
-)
-```
-
-
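For reference on the deletion above: the removed tutorial was built around `completion_with_split_tests`. The same 80/20 idea can be sketched with a plain weighted choice over models and `litellm.completion` — a minimal, hypothetical stand-in (it reuses the env vars and the placeholder Hugging Face endpoint URL from the deleted doc), not LiteLLM's built-in routing:

```python
import random

from litellm import completion

# 80/20 traffic split from the deleted tutorial; the Hugging Face endpoint URL
# is the placeholder used there, not a real endpoint. Assumes OPENAI_API_KEY
# and HUGGINGFACE_API_KEY are set, as in the deleted doc.
split_per_model = {
    "gpt-4": 0.8,
    "huggingface/https://my-unique-endpoint.us-east-1.aws.endpoints.huggingface.cloud": 0.2,
}

# pick a model per request, weighted by the split
model = random.choices(
    population=list(split_per_model),
    weights=list(split_per_model.values()),
    k=1,
)[0]

response = completion(
    model=model,
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```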
diff --git a/docs/my-website/docs/tutorials/litellm_proxy_aporia.md b/docs/my-website/docs/tutorials/litellm_proxy_aporia.md
new file mode 100644
index 000000000..cb837209a
--- /dev/null
+++ b/docs/my-website/docs/tutorials/litellm_proxy_aporia.md
@@ -0,0 +1,163 @@
+import Image from '@theme/IdealImage';
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+# Use LiteLLM AI Gateway with Aporia Guardrails
+
+In this tutorial, we will use LiteLLM Proxy with Aporia to detect PII in requests and profanity in responses.
+
+## 1. Setup guardrails on Aporia
+
+### Create Aporia Projects
+
+Create two projects on [Aporia](https://guardrails.aporia.com/):
+
+1. Pre LLM API Call - set all the policies you want to run before the LLM API call
+2. Post LLM API Call - set all the policies you want to run after the LLM API call
+
+<Image img={require('../../img/aporia_projs.png')} />
+
+
+### Pre-Call: Detect PII
+
+Add the `PII - Prompt` policy to your Pre LLM API Call project
+
+<Image img={require('../../img/aporia_pre.png')} />
+
+### Post-Call: Detect Profanity in Responses
+
+Add the `Toxicity - Response` policy to your Post LLM API Call project
+
+<Image img={require('../../img/aporia_post.png')} />
+
+
+## 2. Define Guardrails on your LiteLLM config.yaml
+
+```yaml
+model_list:
+  - model_name: gpt-3.5-turbo
+    litellm_params:
+      model: openai/gpt-3.5-turbo
+      api_key: os.environ/OPENAI_API_KEY
+
+guardrails:
+  pre_call: # guardrail only runs on input before LLM API call
+    guardrail: "aporia" # supported values ["aporia", "bedrock", "lakera"]
+    api_key: os.environ/APORIA_API_KEY_1
+    api_base: os.environ/APORIA_API_BASE_1
+  post_call: # guardrail only runs on output after LLM API call
+    guardrail: "aporia" # supported values ["aporia", "bedrock", "lakera"]
+    api_key: os.environ/APORIA_API_KEY_2
+    api_base: os.environ/APORIA_API_BASE_2
+```
+
+## 3. Start LiteLLM Gateway
+
+
+```shell
+litellm --config config.yaml --detailed_debug
+```
+
+## 4. Test request
+
+<Tabs>
+<TabItem label="Unsuccessful call" value="not-allowed">
+
+Expect this to fail, since `ishaan@berri.ai` in the request is PII
+
+```shell
+curl -i http://localhost:4000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer sk-1234" \
+  -d '{
+    "model": "gpt-3.5-turbo",
+    "messages": [
+      {"role": "user", "content": "hi my email is ishaan@berri.ai"}
+    ]
+  }'
+```
+
+</TabItem>
+
+<TabItem label="Successful call" value="allowed">
+
+```shell
+curl -i http://localhost:4000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer sk-1234" \
+  -d '{
+    "model": "gpt-3.5-turbo",
+    "messages": [
+      {"role": "user", "content": "hi what is the weather?"}
+    ]
+  }'
+```
+
+</TabItem>
+
+</Tabs>
+
+
+## Advanced
+### Control Guardrails per Project (API Key)
+
+Use this to control which guardrails run for a given project (API key). In this tutorial, we only want the following guardrails to run for one project:
+- pre_call: aporia
+- post_call: aporia
+
+**Step 1** Create a key with guardrail settings
+
+<Tabs>
+<TabItem value="/key/generate" label="Create new key">
+
+```shell
+curl -X POST 'http://0.0.0.0:4000/key/generate' \
+  -H 'Authorization: Bearer sk-1234' \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "guardrails": {
+      "pre_call": ["aporia"],
+      "post_call": ["aporia"]
+    }
+  }'
+```
+
+</TabItem>
+
+<TabItem value="/key/update" label="Update existing key">
+
+```shell
+curl --location 'http://0.0.0.0:4000/key/update' \
+  --header 'Authorization: Bearer sk-1234' \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "key": "sk-jNm1Zar7XfNdZXp49Z1kSQ",
+    "guardrails": {
+      "pre_call": ["aporia"],
+      "post_call": ["aporia"]
+    }
+}'
+```
+
+</TabItem>
+
+</Tabs>
+
+**Step 2** Test it with the new key
+
+```shell
+curl --location 'http://0.0.0.0:4000/chat/completions' \
+  --header 'Authorization: Bearer sk-jNm1Zar7XfNdZXp49Z1kSQ' \
+  --header 'Content-Type: application/json' \
+  --data '{
+    "model": "gpt-3.5-turbo",
+    "messages": [
+      {
+        "role": "user",
+        "content": "my email is ishaan@berri.ai"
+      }
+    ]
+}'
+```
+
+
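Since the gateway speaks the OpenAI API, the test requests in step 4 of the new tutorial can also be run from Python. A minimal sketch, assuming the `openai` v1 SDK and the gateway above running on `localhost:4000` with the `sk-1234` key from the examples; a request blocked by the pre-call guardrail should surface as an API error rather than a completion:

```python
from openai import OpenAI, OpenAIError

# point the standard OpenAI client at the LiteLLM gateway
client = OpenAI(api_key="sk-1234", base_url="http://localhost:4000")

try:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hi my email is ishaan@berri.ai"}],
    )
    print(response.choices[0].message.content)
except OpenAIError as e:
    # the Aporia pre-call guardrail flags the email as PII,
    # so the gateway rejects the call with an error response
    print(f"Request blocked: {e}")
```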
diff --git a/docs/my-website/img/aporia_post.png b/docs/my-website/img/aporia_post.png
new file mode 100644
index 000000000..5e4d4a287
Binary files /dev/null and b/docs/my-website/img/aporia_post.png differ
diff --git a/docs/my-website/img/aporia_pre.png b/docs/my-website/img/aporia_pre.png
new file mode 100644
index 000000000..8df1cfdda
Binary files /dev/null and b/docs/my-website/img/aporia_pre.png differ
diff --git a/docs/my-website/img/aporia_projs.png b/docs/my-website/img/aporia_projs.png
new file mode 100644
index 000000000..c518fdf0b
Binary files /dev/null and b/docs/my-website/img/aporia_projs.png differ
diff --git a/docs/my-website/sidebars.js b/docs/my-website/sidebars.js
index a4e59a845..4a18d0430 100644
--- a/docs/my-website/sidebars.js
+++ b/docs/my-website/sidebars.js
@@ -248,6 +248,7 @@ const sidebars = {
       type: "category",
       label: "Tutorials",
       items: [
+        'tutorials/litellm_proxy_aporia',
         'tutorials/azure_openai',
         'tutorials/instructor',
         "tutorials/gradio_integration",
diff --git a/litellm/proxy/proxy_config.yaml b/litellm/proxy/proxy_config.yaml
index 902dab7ad..6027b8b1c 100644
--- a/litellm/proxy/proxy_config.yaml
+++ b/litellm/proxy/proxy_config.yaml
@@ -4,8 +4,15 @@ model_list:
     model: openai/gpt-3.5-turbo
     api_key: os.environ/OPENAI_API_KEY
 
-litellm_settings:
-  guardrails:
-    - prompt_injection:
-        callbacks: [aporio_prompt_injection]
-        default_on: true
+guardrails:
+  - guardrail_name: prompt_injection_detection
+    litellm_params:
+      guardrail_name: openai/gpt-3.5-turbo
+      api_key: os.environ/OPENAI_API_KEY
+      api_base: os.environ/OPENAI_API_BASE
+  - guardrail_name: prompt_injection_detection
+    litellm_params:
+      guardrail_name: openai/gpt-3.5-turbo
+      api_key: os.environ/OPENAI_API_KEY
+      api_base: os.environ/OPENAI_API_BASE
+
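Similarly, the Advanced per-key flow from the tutorial (create a key pinned to the Aporia guardrails, then call the gateway with it) can be scripted end to end. A sketch using `requests`, assuming the same local gateway and `sk-1234` master key, and that `/key/generate` returns the new key under a `key` field:

```python
import requests

BASE_URL = "http://0.0.0.0:4000"

# Step 1: create a key that pins the aporia pre/post guardrails
# (mirrors the /key/generate curl in the Advanced section)
key_resp = requests.post(
    f"{BASE_URL}/key/generate",
    headers={"Authorization": "Bearer sk-1234"},
    json={"guardrails": {"pre_call": ["aporia"], "post_call": ["aporia"]}},
)
key_resp.raise_for_status()
new_key = key_resp.json()["key"]  # assumes the response carries the key under "key"

# Step 2: call the gateway with the new key; the PII prompt should be blocked
chat_resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {new_key}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "my email is ishaan@berri.ai"}],
    },
)
print(chat_resp.status_code, chat_resp.json())
```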