diff --git a/docs/my-website/docs/proxy_server.md b/docs/my-website/docs/proxy_server.md
index 0115881dfe..a962a70f94 100644
--- a/docs/my-website/docs/proxy_server.md
+++ b/docs/my-website/docs/proxy_server.md
@@ -1,25 +1,113 @@
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
 # OpenAI Proxy Server

 Use this to spin up a proxy api to translate openai api calls to any non-openai model (e.g. Huggingface, TogetherAI, Ollama, etc.)

 This works for async + streaming as well.
-**Requirements** Make sure relevant keys are set in the local .env. [Provider List](https://docs.litellm.ai/docs/providers)
-## usage
+
+Works with **ALL MODELS** supported by LiteLLM. To see the supported providers, check out this list - [Provider List](https://docs.litellm.ai/docs/providers).
+
+**Requirements** Make sure relevant keys are set in the local .env.
+## quick start
+Call Huggingface models through your OpenAI proxy.
+
+Run this in your CLI.
 ```shell
-pip install litellm
+$ pip install litellm
 ```
+```shell
+$ export HUGGINGFACE_API_KEY=your-api-key # [OPTIONAL]
+
+$ litellm --model huggingface/stabilityai/stablecode-instruct-alpha-3b
+```
+
+This will host a local proxy api at: **http://0.0.0.0:8000**
+
+Other supported models:
+
+<Tabs>
+<TabItem value="anthropic" label="Anthropic">
+
 ```shell
-litellm --model
+$ export ANTHROPIC_API_KEY=my-api-key
+$ litellm --model claude-instant-1
 ```
-This will host a local proxy api at : **http://localhost:8000**
+
+</TabItem>
+<TabItem value="together_ai" label="TogetherAI">
+
+```shell
+$ export TOGETHERAI_API_KEY=my-api-key
+$ litellm --model together_ai/lmsys/vicuna-13b-v1.5-16k
+```
+
+</TabItem>
+<TabItem value="replicate" label="Replicate">
+
+```shell
+$ export REPLICATE_API_KEY=my-api-key
+$ litellm \
+  --model replicate/meta/llama-2-70b-chat:02e509c789964a7ea8736978a43525956ef40397be9033abf9fd2badfe68c9e3
+```
+
+</TabItem>
+<TabItem value="petals" label="Petals">
+
+```shell
+$ litellm --model petals/meta-llama/Llama-2-70b-chat-hf
+```
+
+</TabItem>
+<TabItem value="palm" label="Palm">
+
+```shell
+$ export PALM_API_KEY=my-palm-key
+$ litellm --model palm/chat-bison
+```
+
+</TabItem>
+<TabItem value="azure" label="Azure OpenAI">
+
+```shell
+$ export AZURE_API_KEY=my-api-key
+$ export AZURE_API_BASE=my-api-base
+$ export AZURE_API_VERSION=my-api-version
+
+$ litellm --model azure/my-deployment-id
+```
+
+</TabItem>
+<TabItem value="ai21" label="AI21">
+
+```shell
+$ export AI21_API_KEY=my-api-key
+$ litellm --model j2-light
+```
+
+</TabItem>
+<TabItem value="cohere" label="Cohere">
+
+```shell
+$ export COHERE_API_KEY=my-api-key
+$ litellm --model command-nightly
+```
+
+</TabItem>
+</Tabs>

 [**Jump to Code**](https://github.com/BerriAI/litellm/blob/fef4146396d5d87006259e00095a62e3900d6bb4/litellm/proxy.py#L36)

-## [Advanced] setting api base
-If your model is running locally or on a custom endpoint
+## setting api base
+For a **local model** or a model on a **custom endpoint**

 Pass in the api_base as well

@@ -27,13 +115,120 @@ Pass in the api_base as well
 litellm --model huggingface/meta-llama/llama2 --api_base https://my-endpoint.huggingface.cloud
 ```

+<Tabs>
+<TabItem value="openai" label="OpenAI">
+
+```python
+from litellm import completion
+import os
+
+## set ENV variables
+os.environ["OPENAI_API_KEY"] = "sk-litellm-7_NPZhMGxY2GoHC59LgbDw" # [OPTIONAL] replace with your openai key
+
+response = completion(
+  model="gpt-3.5-turbo",
+  messages=[{ "content": "Hello, how are you?","role": "user"}]
+)
+```
+
+</TabItem>
+<TabItem value="anthropic" label="Anthropic">
+
+```python
+from litellm import completion
+import os
+
+## set ENV variables
+os.environ["ANTHROPIC_API_KEY"] = "sk-litellm-7_NPZhMGxY2GoHC59LgbDw" # [OPTIONAL] replace with your anthropic key
+
+response = completion(
+  model="claude-2",
+  messages=[{ "content": "Hello, how are you?","role": "user"}]
+)
+```
+
+</TabItem>
+<TabItem value="vertex" label="VertexAI">
+
+```python
+from litellm import completion
+import os
+
+# auth: run 'gcloud auth application-default login'
+os.environ["VERTEX_PROJECT"] = "hardy-device-386718"
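+# VERTEX_LOCATION is the GCP region your Vertex AI resources are deployed in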
+os.environ["VERTEX_LOCATION"] = "us-central1" + +response = completion( + model="chat-bison", + messages=[{ "content": "Hello, how are you?","role": "user"}] +) +``` + + + + + +```python +from litellm import completion +import os + +os.environ["HUGGINGFACE_API_KEY"] = "huggingface_api_key" + +# e.g. Call 'WizardLM/WizardCoder-Python-34B-V1.0' hosted on HF Inference endpoints +response = completion( + model="huggingface/WizardLM/WizardCoder-Python-34B-V1.0", + messages=[{ "content": "Hello, how are you?","role": "user"}], + api_base="https://my-endpoint.huggingface.cloud" +) + +print(response) +``` + + + + + +```python +from litellm import completion +import os + +## set ENV variables +os.environ["AZURE_API_KEY"] = "" +os.environ["AZURE_API_BASE"] = "" +os.environ["AZURE_API_VERSION"] = "" + +# azure call +response = completion( + "azure/", + messages = [{ "content": "Hello, how are you?","role": "user"}] +) +``` + + + + + + +```python +from litellm import completion + +response = completion( + model="ollama/llama2", + messages = [{ "content": "Hello, how are you?","role": "user"}], + api_base="http://localhost:11434" +) +``` + + + + ## test it ```curl curl --location 'http://0.0.0.0:8000/chat/completions' \ --header 'Content-Type: application/json' \ --data '{ - "model": "gpt-3.5-turbo", "messages": [ { "role": "user", @@ -43,3 +238,42 @@ curl --location 'http://0.0.0.0:8000/chat/completions' \ }' ``` +## tutorial - using with aider +[Aider](https://github.com/paul-gauthier/aider) is an AI pair programming in your terminal. + +But it only accepts OpenAI API Calls. + +In this tutorial we'll use Aider with WizardCoder (hosted on HF Inference Endpoints). + +[NOTE]: To learn how to deploy a model on Huggingface + +### Step 1: Install aider and litellm +```shell +$ pip install aider-chat litellm +``` + +### Step 2: Spin up local proxy +Save your huggingface api key in your local environment (can also do this via .env) + +```shell +$ export HUGGINGFACE_API_KEY=my-huggingface-api-key +``` + +Point your local proxy to your model endpoint + +```shell +$ litellm \ + --model huggingface/WizardLM/WizardCoder-Python-34B-V1.0 \ + --api_base https://my-endpoint.huggingface.com +``` +This will host a local proxy api at: **http://0.0.0.0:8000** + +### Step 3: Replace openai api base in Aider +Aider lets you set the openai api base. So lets point it to our proxy instead. + +```shell +$ aider --openai-api-base http://0.0.0.0:8000 +``` + + + diff --git a/litellm/proxy/proxy_cli.py b/litellm/proxy/proxy_cli.py index 82856e4589..75bad1b6fb 100644 --- a/litellm/proxy/proxy_cli.py +++ b/litellm/proxy/proxy_cli.py @@ -7,8 +7,7 @@ load_dotenv() @click.option('--api_base', default=None, help='API base URL.') @click.option('--model', required=True, help='The model name to pass to litellm expects') def run_server(port, api_base, model): - # from .proxy_server import app, initialize - from proxy_server import app, initialize + from .proxy_server import app, initialize initialize(model, api_base) try: import uvicorn diff --git a/pyproject.toml b/pyproject.toml index 9fcf6f344b..ad35986de9 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [tool.poetry] name = "litellm" -version = "0.1.793" +version = "0.1.794" description = "Library to easily interface with LLM API providers" authors = ["BerriAI"] license = "MIT License"