Update readme.md

# 🚅 litellm: Proxy Server for Azure, Llama2, OpenAI, Claude, Hugging Face, Replicate Models
[![PyPI Version](https://img.shields.io/pypi/v/litellm.svg)](https://pypi.org/project/litellm/)
[![PyPI Version](https://img.shields.io/badge/stable%20version-v0.1.345-blue?color=green&link=https://pypi.org/project/litellm/0.1.345/)](https://pypi.org/project/litellm/0.1.345/)
[![CircleCI](https://dl.circleci.com/status-badge/img/gh/BerriAI/litellm/tree/main.svg?style=svg)](https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main)
![Downloads](https://img.shields.io/pypi/dm/litellm)
[![litellm](https://img.shields.io/badge/%20%F0%9F%9A%85%20liteLLM-OpenAI%7CAzure%7CAnthropic%7CPalm%7CCohere%7CReplicate%7CHugging%20Face-blue?color=green)](https://github.com/BerriAI/litellm)
[![](https://dcbadge.vercel.app/api/server/wuPM9dRgDw)](https://discord.gg/wuPM9dRgDw)
A light package that simplifies calling OpenAI, Azure, Cohere, Anthropic, and Hugging Face API endpoints. It manages:
- translating inputs to each provider's completion and embedding endpoints
- guaranteeing [consistent output](https://litellm.readthedocs.io/en/latest/output/): text responses are always available at `['choices'][0]['message']['content']`
- exception mapping: common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance) (see the sketch below)
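To make the exception-mapping bullet concrete, here is a minimal sketch, assuming the pre-1.0 `openai` package (whose error classes live under `openai.error`): a failure from any provider can be caught with the corresponding OpenAI error type.
```python
import openai
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

try:
    # a bad or missing Cohere key should surface as an OpenAI-style auth error
    response = completion(model="command-nightly", messages=messages)
except openai.error.AuthenticationError as err:
    print(f"authentication failed: {err}")
```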
# usage
Demo - https://litellm.ai/ \
Read the docs - https://litellm.readthedocs.io/en/latest/
## quick start
```
pip install litellm
```
```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion("command-nightly", messages)

# azure openai call
response = completion("chatgpt-test", messages, azure=True)

# hugging face call
response = completion(model="stabilityai/stablecode-completion-alpha-3b-4k", messages=messages, hugging_face=True)

# openrouter call
response = completion("google/palm-2-codechat-bison", messages)
```
Code Sample: [Getting Started Notebook](https://colab.research.google.com/drive/1gR3pY-JzDZahzpVdbGBtrNGDBmzUNJaJ?usp=sharing)

Stable version:
```
pip install litellm==0.1.345
```

## Streaming Queries
liteLLM supports streaming the model response back; pass `stream=True` to get a streaming iterator in the response. Streaming is supported for OpenAI, Azure, and Anthropic models.
```python
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
```
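As an illustration, one way to reassemble the streamed chunks into the full reply, assuming each chunk follows OpenAI's streaming format (where `delta` may or may not carry a `content` key):
```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# accumulate the text pieces as they arrive
full_reply = ""
for chunk in completion(model="gpt-3.5-turbo", messages=messages, stream=True):
    full_reply += chunk['choices'][0]['delta'].get('content', '')
print(full_reply)
```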
# Proxy Server for Chat API
This repository contains a proxy server that interacts with OpenAI's Chat API and other, similar APIs to serve chat-based language models, letting you add chat completion capabilities to your applications. The server is built with Python and the Flask framework.
## Installation
To set up and run the proxy server locally, follow these steps:
1. Clone this repository to your local machine.
2. Install the required dependencies using pip:
`pip install -r requirements.txt`
3. Configure the server settings, such as API keys and model endpoints, in the configuration file (`config.py`).
4. Run the server:
`python app.py`
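Step 3 references `config.py` without showing its contents. The sketch below is purely illustrative of what such a configuration file might hold; every name in it is hypothetical, not the repository's actual settings:
```python
# config.py -- hypothetical sketch; the real file's keys and names may differ
import os

# provider API keys, read from environment variables rather than hard-coded
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
COHERE_API_KEY = os.environ.get("COHERE_API_KEY")

# default model endpoint to use when a request doesn't specify one
DEFAULT_MODEL = "gpt-3.5-turbo"
```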
## API Endpoints
### `/chat/completions` (POST)
This endpoint is used to generate chat completions. It takes in JSON data with the following parameters:
- `model` (string, required): ID of the model to use for chat completions. Refer to the model endpoint compatibility table for supported models.
- `messages` (array, required): A list of messages representing the conversation context. Each message should have a `role` (system, user, assistant, or function), `content` (message text), and `name` (for function role).
- Additional parameters for controlling completions, such as `temperature`, `top_p`, `n`, etc.
Example JSON payload:
```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Knock knock."},
    {"role": "assistant", "content": "Who's there?"},
    {"role": "user", "content": "Orange."}
  ],
  "temperature": 0.8
}
```
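For illustration, here is one way to call this endpoint from Python with the `requests` library. The host and port are assumptions (Flask's development server defaults to `localhost:5000`); the README doesn't pin them down.
```python
import requests

# assumed local address: Flask's development server defaults to port 5000
url = "http://localhost:5000/chat/completions"

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."}
    ],
    "temperature": 0.8
}

response = requests.post(url, json=payload)
# per the consistent-output guarantee above, the reply text lives here
print(response.json()['choices'][0]['message']['content'])
```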
## Input Parameters
- `model`: ID of the language model to use.
- `messages`: An array of messages representing the conversation context.
  - `role`: The role of the message author (system, user, assistant, or function).
  - `content`: The content of the message.
  - `name`: The name of the author (required for the function role).
- `function_call`: The name and arguments of a function to call (see the sketch after this list).
- `functions`: A list of functions the model may generate JSON inputs for.
- Various other parameters for controlling completion behavior.
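To make `functions` and `function_call` concrete, here is a sketch of a payload in OpenAI's function-calling format; `get_current_weather` is a made-up example function, not something this server defines:
```python
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "What's the weather in Boston?"}],
    # JSON-schema description of a hypothetical function the model may choose to call
    "functions": [
        {
            "name": "get_current_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"]
            }
        }
    ],
    # "auto" lets the model decide whether to call the function
    "function_call": "auto"
}
```
Sent with `requests.post(url, json=payload)` as in the earlier sketch, the model either replies normally or returns a `function_call` with JSON arguments for your code to execute.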
## Supported Models
The proxy server supports the following models:

**OpenAI Chat Completion Models:**
- gpt-4
- gpt-4-0613
- gpt-4-32k
- ...

**OpenAI Text Completion Models:**
- text-davinci-003

**Cohere Models:**
- command-nightly
- command
- ...

**Anthropic Models:**
- claude-2
- claude-instant-1
- ...

**Replicate Models:**
- replicate/

**OpenRouter Models:**
- google/palm-2-codechat-bison
- google/palm-2-chat-bison
- ...

**Vertex Models:**
- chat-bison
- chat-bison@001

Refer to the model endpoint compatibility table for more details.

# support / talk with founders
- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

# why did we build this
- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, and Cohere.