forked from phoenix/litellm-mirror
Update readme.md
This commit is contained in:
parent
4f403d06d1
commit
25bd80a5aa
1 changed file with 81 additions and 59 deletions
# *🚅 litellm*

# Proxy Server for Azure, Llama2, OpenAI, Claude, Hugging Face, Replicate Models

[PyPI](https://pypi.org/project/litellm/)
[PyPI 0.1.1](https://pypi.org/project/litellm/0.1.1/)
[CircleCI](https://dl.circleci.com/status-badge/redirect/gh/BerriAI/litellm/tree/main)
[GitHub](https://github.com/BerriAI/litellm)
[Discord](https://discord.gg/wuPM9dRgDw)

A light package to simplify calling the OpenAI, Azure, Cohere, Anthropic, and Hugging Face API endpoints. It manages:

- translating inputs to the provider's completion and embedding endpoints
- guaranteeing [consistent output](https://litellm.readthedocs.io/en/latest/output/): text responses are always available at `['choices'][0]['message']['content']`
- exception mapping: common exceptions across providers are mapped to the [OpenAI exception types](https://help.openai.com/en/articles/6897213-openai-library-error-types-guidance)
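
For example, the same call pattern works across providers, and the reply text always lives at the same path. A minimal sketch using the `completion` call documented later in this README:

```python
from litellm import completion

messages = [{"role": "user", "content": "Hey, how's it going?"}]

# inputs are translated to the provider's own API behind this single call
response = completion(model="gpt-3.5-turbo", messages=messages)

# consistent output: the text answer is always at the same location
print(response['choices'][0]['message']['content'])
```
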
# usage

Demo - https://litellm.ai/ \
Read the docs - https://litellm.readthedocs.io/en/latest/

# Proxy Server for Chat API

## quick start

```
pip install litellm
```

This repository contains a proxy server that interacts with OpenAI's Chat API and other similar APIs to serve chat-based language models. The server lets you easily integrate chat completion capabilities into your applications. It is built with Python and Flask.

## Installation

To set up and run the proxy server locally, follow these steps:

1. Clone this repository to your local machine.

2. Install the required dependencies using pip:

   `pip install -r requirements.txt`

3. Configure the server settings, such as API keys and model endpoints, in the configuration file (`config.py`); a hypothetical sketch follows these steps.

4. Run the server:

   `python app.py`
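
The README does not show the contents of `config.py`. As a purely hypothetical sketch, a minimal configuration might read API keys from the environment and pin a few server defaults; all names below are illustrative, not the project's actual settings:

```python
# config.py -- hypothetical example only; the real file defines whatever
# settings your deployment needs (these names are illustrative).
import os

# API keys are usually pulled from environment variables rather than hard-coded.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")

# Default model and Flask server settings.
DEFAULT_MODEL = "gpt-3.5-turbo"
HOST = "0.0.0.0"
PORT = 5000
```
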
## API Endpoints

### `/chat/completions` (POST)

This endpoint is used to generate chat completions. It takes JSON data with the following parameters:

- `model` (string, required): ID of the model to use for chat completions. Refer to the model endpoint compatibility table for supported models.
- `messages` (array, required): A list of messages representing the conversation context. Each message should have a `role` (system, user, assistant, or function), `content` (the message text), and `name` (for the function role).
- Additional parameters for controlling completions, such as `temperature`, `top_p`, `n`, etc.

Example JSON payload:

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Knock knock."},
    {"role": "assistant", "content": "Who's there?"},
    {"role": "user", "content": "Orange."}
  ],
  "temperature": 0.8
}
```
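
For illustration, here is a minimal client call against this endpoint. The host and port (`localhost:5000`, Flask's default) are assumptions rather than values taken from this README, and the response is assumed to follow the OpenAI format described above:

```python
import requests

# Hypothetical local address -- Flask defaults to port 5000; adjust to your deployment.
url = "http://localhost:5000/chat/completions"

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Knock knock."},
    ],
    "temperature": 0.8,
}

response = requests.post(url, json=payload)
response.raise_for_status()

# Assuming an OpenAI-style response body, the reply text lives here:
print(response.json()["choices"][0]["message"]["content"])
```
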
## Input Parameters

- `model`: ID of the language model to use.
- `messages`: An array of messages representing the conversation context. Each message has:
  - `role`: The role of the message author (system, user, assistant, or function).
  - `content`: The content of the message.
  - `name`: The name of the author (required for the function role).
  - `function_call`: The name and arguments of a function to call.
- `functions`: A list of functions the model may generate JSON inputs for (see the payload sketch after the code examples below).
- Various other parameters for controlling completion behavior.

## Supported Models

The proxy server supports the following models:

| Provider | Models |
| --- | --- |
| OpenAI Chat Completion | `gpt-4`, `gpt-4-0613`, `gpt-4-32k`, ... |
| OpenAI Text Completion | `text-davinci-003` |
| Cohere | `command-nightly`, `command`, ... |
| Anthropic | `claude-2`, `claude-instant-1`, ... |
| Replicate | `replicate/` |
| OpenRouter | `google/palm-2-codechat-bison`, `google/palm-2-chat-bison`, ... |
| Vertex | `chat-bison`, `chat-bison@001` |

Refer to the model endpoint compatibility table for more details.

```python
from litellm import completion

messages = [{"content": "Hello, how are you?", "role": "user"}]

# openai call
response = completion(model="gpt-3.5-turbo", messages=messages)

# cohere call
response = completion("command-nightly", messages)

# azure openai call
response = completion("chatgpt-test", messages, azure=True)

# hugging face call
response = completion(model="stabilityai/stablecode-completion-alpha-3b-4k", messages=messages, hugging_face=True)

# openrouter call
response = completion("google/palm-2-codechat-bison", messages)
```
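
For the `functions` / `function_call` parameters listed under Input Parameters, a request presumably follows OpenAI's function-calling schema, since the proxy mirrors OpenAI's Chat API. The sketch below is an assumption for illustration; the function name and schema are made up:

```python
# Hypothetical payload -- assumes the proxy accepts OpenAI's function-calling format.
payload = {
    "model": "gpt-3.5-turbo-0613",
    "messages": [{"role": "user", "content": "What's the weather in Boston?"}],
    "functions": [
        {
            "name": "get_current_weather",  # illustrative function name
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"}
                },
                "required": ["location"],
            },
        }
    ],
    "function_call": "auto",
}
```
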
Code Sample: [Getting Started Notebook](https://colab.research.google.com/drive/1gR3pY-JzDZahzpVdbGBtrNGDBmzUNJaJ?usp=sharing)

Stable version

```
pip install litellm==0.1.345
```

## Streaming Queries

liteLLM supports streaming the model response back; pass `stream=True` to get a streaming iterator in the response.
Streaming is supported for OpenAI, Azure, and Anthropic models.

```python
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    print(chunk['choices'][0]['delta'])

# claude 2
result = completion('claude-2', messages, stream=True)
for chunk in result:
    print(chunk['choices'][0]['delta'])
```
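
If you want the assembled reply rather than the individual chunks, the deltas can be concatenated. This sketch reuses the `messages` list from the examples above and assumes each delta is an OpenAI-style dict with an optional `content` key:

```python
# Assumes OpenAI-style streaming deltas: {"content": "..."} (content may be absent on some chunks).
full_reply = ""
response = completion(model="gpt-3.5-turbo", messages=messages, stream=True)
for chunk in response:
    delta = chunk['choices'][0]['delta']
    full_reply += delta.get('content', '')
print(full_reply)
```
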
# support / talk with founders

- [Our calendar 👋](https://calendly.com/d/4mp-gd3-k5k/berriai-1-1-onboarding-litellm-hosted-version)
- [Community Discord 💭](https://discord.gg/wuPM9dRgDw)
- Our numbers 📞 +1 (770) 8783-106 / +1 (412) 618-6238
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai

# why did we build this

- **Need for simplicity**: Our code started to get extremely complicated managing & translating calls between Azure, OpenAI, and Cohere.