From 25bd80a5aa3246a1d1130ccb30d5ffbfb3244cb6 Mon Sep 17 00:00:00 2001
From: Ishaan Jaff
Date: Fri, 11 Aug 2023 13:25:48 -0700
Subject: [PATCH] Update readme.md

---
 cookbook/proxy-server/readme.md | 140 ++++++++++++++++++--------------
 1 file changed, 81 insertions(+), 59 deletions(-)

diff --git a/cookbook/proxy-server/readme.md b/cookbook/proxy-server/readme.md
index 488c211b8..37087a7a6 100644
--- a/cookbook/proxy-server/readme.md
+++ b/cookbook/proxy-server/readme.md
@@ -1,71 +1,93 @@
-# *🚅 litellm*
+# Proxy Server for Azure, Llama2, OpenAI, Claude, Hugging Face, Replicate Models
 [![PyPI Version](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/)
 [![PyPI Version](https://img.shields.io/badge/stable%20version-v0.1.345-blue?color=green&link=https://pypi.org/project/litellm/0.1.1/)](https://pypi.org/project/litellm/0.1.1/)
-[![CircleCI](https://dl.circleci.com/status-badge/img/gh/BerriAI/litellm/tree/main.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main)
 ![Downloads](https://img.shields.io/pypi/dm/litellm)
 [![litellm](https://img.shields.io/badge/%20%F0%9F%9A%85%20liteLLM-OpenAI%7CAzure%7CAnthropic%7CPalm%7CCohere%7CReplicate%7CHugging%20Face-blue?color=green)](https://github.com/BerriAI/litellm)
 [![](https://dcbadge.vercel.app/api/server/wuPM9dRgDw)](https://discord.gg/wuPM9dRgDw)
-a light package to simplify calling OpenAI, Azure, Cohere, Anthropic, Huggingface API Endpoints.
It manages:
-- translating inputs to the provider's completion and embedding endpoints
-- guarantees [consistent output](https://litellm.readthedocs.io/en/latest/output/), text responses will always be available at `['choices'][0]['message']['content']`
-- exception mapping - common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance)
-# usage
-Demo - https://litellm.ai/ \
-Read the docs - https://litellm.readthedocs.io/en/latest/
+# Proxy Server for Chat API
-## quick start
-```
-pip install litellm
+This repository contains a proxy server that interacts with OpenAI's Chat API and other similar APIs to facilitate chat-based language models. The server allows you to easily integrate chat completion capabilities into your applications. The server is built using Python and the Flask framework.
+
+## Installation
+
+To set up and run the proxy server locally, follow these steps:
+
+1. Clone this repository to your local machine:
+
+
+2. Install the required dependencies using pip:
+
+`pip install -r requirements.txt`
+
+3. Configure the server settings, such as API keys and model endpoints, in the configuration file (`config.py`).
+
+4. Run the server:
+
+`python app.py`
+
+
+## API Endpoints
+
+### `/chat/completions` (POST)
+
+This endpoint is used to generate chat completions. It takes in JSON data with the following parameters:
+
+- `model` (string, required): ID of the model to use for chat completions. Refer to the model endpoint compatibility table for supported models.
+- `messages` (array, required): A list of messages representing the conversation context. Each message should have a `role` (system, user, assistant, or function), `content` (message text), and `name` (for function role).
+- Additional parameters for controlling completions, such as `temperature`, `top_p`, `n`, etc.
+
+Example JSON payload:
+
+```json
+{
+"model": "gpt-3.5-turbo",
+"messages": [
+ {"role": "system", "content": "You are a helpful assistant."},
+ {"role": "user", "content": "Knock knock."},
+ {"role": "assistant", "content": "Who's there?"},
+ {"role": "user", "content": "Orange."}
+],
+"temperature": 0.8
+}
 ```
-```python
-from litellm import completion
-messages = [{ "content": "Hello, how are you?","role": "user"}]
+## Input Parameters
+model: ID of the language model to use.
+messages: An array of messages representing the conversation context.
+role: The role of the message author (system, user, assistant, or function).
+content: The content of the message.
+name: The name of the author (required for function role).
+function_call: The name and arguments of a function to call.
+functions: A list of functions the model may generate JSON inputs for.
+Various other parameters for controlling completion behavior.
+Supported Models
+The proxy server supports the following models:
-# openai call
-response = completion(model="gpt-3.5-turbo", messages=messages)
-
-# cohere call
-response = completion("command-nightly", messages)
-
-# azure openai call
-response = completion("chatgpt-test", messages, azure=True)
-
-# hugging face call
-response = completion(model="stabilityai/stablecode-completion-alpha-3b-4k", messages=messages, hugging_face=True)
-
-# openrouter call
-response = completion("google/palm-2-codechat-bison", messages)
-```
-Code Sample: [Getting Started Notebook](https://colab.research.google.com/drive/1gR3pY-JzDZahzpVdbGBtrNGDBmzUNJaJ?usp=sharing)
-
-Stable version
-```
-pip install litellm==0.1.345
-```
-
-## Streaming Queries
-liteLLM supports streaming the model response back, pass `stream=True` to get a streaming iterator in response.
-Streaming is supported for OpenAI, Azure, Anthropic models
-```python
-response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
-for chunk in response:
-    print(chunk['choices'][0]['delta'])
-
-# claude 2
-result = completion('claude-2', messages, stream=True)
-for chunk in result:
-    print(chunk['choices'][0]['delta'])
-```
-
-# support / talk with founders
-- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
-- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
-- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
-- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai
-
-# why did we build this
-- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, Cohere
+OpenAI Chat Completion Models:
+gpt-4
+gpt-4-0613
+gpt-4-32k
+...
+OpenAI Text Completion Models:
+text-davinci-003
+Cohere Models:
+command-nightly
+command
+...
+Anthropic Models:
+claude-2
+claude-instant-1
+...
+Replicate Models:
+replicate/
+OpenRouter Models:
+google/palm-2-codechat-bison
+google/palm-2-chat-bison
+...
+Vertex Models:
+chat-bison
+chat-bison@001
+Refer to the model endpoint compatibility table for more details.
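
A minimal client sketch for the `/chat/completions` endpoint that the new README describes may help reviewers check the documented request shape. It is not part of the patch. The helper names `build_chat_payload` and `post_chat_completion` are hypothetical, and the base URL is an assumption (Flask's development server commonly listens on `http://localhost:5000`, but the patch does not state the port). Only the Python standard library is used.

```python
import json
import urllib.request

# Roles listed in the README's description of the `messages` parameter.
ALLOWED_ROLES = {"system", "user", "assistant", "function"}


def build_chat_payload(model, messages, **params):
    """Build the JSON body for POST /chat/completions (hypothetical helper)."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    for message in messages:
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        # The README says `name` is required for the function role.
        if role == "function" and "name" not in message:
            raise ValueError("function messages need a name")
    # Optional controls such as temperature, top_p, or n pass through unchanged.
    return {"model": model, "messages": messages, **params}


def post_chat_completion(base_url, payload, timeout=30):
    """Send the payload to the proxy and return the decoded JSON response."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same conversation opening as the README's example payload.
payload = build_chat_payload(
    "gpt-3.5-turbo",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
    ],
    temperature=0.8,
)
```

With the server running, `post_chat_completion("http://localhost:5000", payload)` would send the request. If the proxy keeps the response format described in the removed README text, the reply text would sit at `['choices'][0]['message']['content']`, though the patch itself does not confirm this for the proxy.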