forked from phoenix/litellm-mirror
Update readme.md
This commit is contained in:
parent
4f403d06d1
commit
25bd80a5aa
1 changed file with 81 additions and 59 deletions
# *🚅 litellm*

# Proxy Server for Azure, Llama2, OpenAI, Claude, Hugging Face, Replicate Models

[PyPI](https://pypi.org/project/litellm/)
[PyPI 0.1.1](https://pypi.org/project/litellm/0.1.1/)
[CircleCI](https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main)
[GitHub](https://github.com/BerriAI/litellm)
[Discord](https://discord.gg/wuPM9dRgDw)

A light package to simplify calling the OpenAI, Azure, Cohere, Anthropic, and Hugging Face API endpoints. It manages:

- translating inputs to the provider's completion and embedding endpoints
- guaranteeing [consistent output](https://litellm.readthedocs.io/en/latest/output/): text responses are always available at `['choices'][0]['message']['content']`
- exception mapping: common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance)
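
For example, the same call pattern works across providers, and the reply text always lives at the same path. A minimal sketch using the `completion` call documented later in this README:

```python
from litellm import completion

messages = [{"role": "user", "content": "Hey, how's it going?"}]

# inputs are translated to the provider's own API behind this single call
response = completion(model="gpt-3.5-turbo", messages=messages)

# consistent output: the text answer is always at the same location
print(response['choices'][0]['message']['content'])
```
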
# usage

Demo - https://litellm.ai/ \
Read the docs - https://litellm.readthedocs.io/en/latest/

# Proxy Server for Chat API

## quick start

```
pip install litellm
```

This repository contains a proxy server that interacts with OpenAI's Chat API and other similar APIs to serve chat-based language models. The server lets you easily integrate chat completion capabilities into your applications. It is built with Python and Flask.

## Installation

To set up and run the proxy server locally, follow these steps:

1. Clone this repository to your local machine.

2. Install the required dependencies using pip:

   `pip install -r requirements.txt`

3. Configure the server settings, such as API keys and model endpoints, in the configuration file (`config.py`); a hypothetical sketch follows these steps.

4. Run the server:

   `python app.py`
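
The README does not show the contents of `config.py`. As a purely hypothetical sketch, a minimal configuration might read API keys from the environment and pin a few server defaults; all names below are illustrative, not the project's actual settings:

```python
# config.py -- hypothetical example only; the real file defines whatever
# settings your deployment needs (these names are illustrative).
import os

# API keys are usually pulled from environment variables rather than hard-coded.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")

# Default model and Flask server settings.
DEFAULT_MODEL = "gpt-3.5-turbo"
HOST = "0.0.0.0"
PORT = 5000
```
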
## API Endpoints

### `/chat/completions` (POST)

This endpoint is used to generate chat completions. It takes JSON data with the following parameters:

- `model` (string, required): ID of the model to use for chat completions. Refer to the model endpoint compatibility table for supported models.
- `messages` (array, required): A list of messages representing the conversation context. Each message should have a `role` (system, user, assistant, or function), `content` (the message text), and `name` (for the function role).
- Additional parameters for controlling completions, such as `temperature`, `top_p`, `n`, etc.

Example JSON payload:

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Knock knock."},
    {"role": "assistant", "content": "Who's there?"},
    {"role": "user", "content": "Orange."}
  ],
  "temperature": 0.8
}
```
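
For illustration, here is a minimal client call against this endpoint. The host and port (`localhost:5000`, Flask's default) are assumptions rather than values taken from this README, and the response is assumed to follow the OpenAI format described above:

```python
import requests

# Hypothetical local address -- Flask defaults to port 5000; adjust to your deployment.
url = "http://localhost:5000/chat/completions"

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
    ],
    "temperature": 0.8,
}

response = requests.post(url, json=payload)
response.raise_for_status()

# Assuming an OpenAI-style response body, the reply text lives here:
print(response.json()["choices"][0]["message"]["content"])
```
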
## Input Parameters

- `model`: ID of the language model to use.
- `messages`: An array of messages representing the conversation context. Each message has:
  - `role`: The role of the message author (system, user, assistant, or function).
  - `content`: The content of the message.
  - `name`: The name of the author (required for the function role).
  - `function_call`: The name and arguments of a function to call.
- `functions`: A list of functions the model may generate JSON inputs for (see the payload sketch after the code examples below).
- Various other parameters for controlling completion behavior.

## Supported Models

The proxy server supports the following models:

| Provider | Models |
| --- | --- |
| OpenAI Chat Completion | `gpt-4`, `gpt-4-0613`, `gpt-4-32k`, ... |
| OpenAI Text Completion | `text-davinci-003` |
| Cohere | `command-nightly`, `command`, ... |
| Anthropic | `claude-2`, `claude-instant-1`, ... |
| Replicate | `replicate/` |
| OpenRouter | `google/palm-2-codechat-bison`, `google/palm-2-chat-bison`, ... |
| Vertex | `chat-bison`, `chat-bison@001` |

Refer to the model endpoint compatibility table for more details.

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion("command-nightly", messages)

# azure openai call
response = completion("chatgpt-test", messages, azure=True)

# hugging face call
response = completion(model="stabilityai/stablecode-completion-alpha-3b-4k", messages=messages, hugging_face=True)

# openrouter call
response = completion("google/palm-2-codechat-bison", messages)
```
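
For the `functions` / `function_call` parameters listed under Input Parameters, a request presumably follows OpenAI's function-calling schema, since the proxy mirrors OpenAI's Chat API. The sketch below is an assumption for illustration; the function name and schema are made up:

```python
# Hypothetical payload -- assumes the proxy accepts OpenAI's function-calling format.
payload = {
    "model": "gpt-3.5-turbo-0613",
    "messages": [{"role": "user", "content": "What's the weather in Boston?"}],
    "functions": [
        {
            "name": "get_current_weather",  # illustrative function name
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"],
            },
        }
    ],
    "function_call": "auto",
}
```
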
Code Sample: [Getting Started Notebook](https://colab.research.google.com/drive/1gR3pY-JzDZahzpVdbGBtrNGDBmzUNJaJ?usp=sharing)

Stable version

```
pip install litellm==0.1.345
```

## Streaming Queries

liteLLM supports streaming the model response back; pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for OpenAI, Azure, and Anthropic models.

```python
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
```
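
If you want the assembled reply rather than the individual chunks, the deltas can be concatenated. This sketch reuses the `messages` list from the examples above and assumes each delta is an OpenAI-style dict with an optional `content` key:

```python
# Assumes OpenAI-style streaming deltas: {"content": "..."} (content may be absent on some chunks).
full_reply = ""
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    delta = chunk['choices'][0]['delta']
    full_reply += delta.get('content', '')
print(full_reply)
```
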
# support / talk with founders

- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

# why did we build this

- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, and Cohere.