Mirror of https://github.com/BerriAI/litellm.git, synced 2025-04-24 18:24:20 +00:00
docs(proxy_server): doc cleanup
This commit is contained in:
parent 051b21b61f
commit 51e5e2b8d5
3 changed files with 105 additions and 68 deletions
3 .gitignore vendored
@@ -3,4 +3,5 @@
litellm_uuid.txt
__pycache__/
bun.lockb
**/.DS_Store
**/.DS_Store
.aider*
@@ -3,38 +3,25 @@ import TabItem from '@theme/TabItem';

# OpenAI Proxy Server

CLI tool to create an LLM proxy server that translates OpenAI API calls to any non-OpenAI model (e.g. Huggingface, TogetherAI, Ollama, etc.). 100+ models supported ([Provider List](https://docs.litellm.ai/docs/providers)).
A local, fast, and lightweight OpenAI-compatible server to call 100+ LLM APIs.

## Quick start
Call Ollama models through your OpenAI proxy.

### Start Proxy
## usage
```shell
$ pip install litellm
pip install litellm
```
```shell
$ litellm --model ollama/llama2
$ litellm --model ollama/codellama

#INFO: Uvicorn running on http://0.0.0.0:8000
#INFO: Ollama running on http://0.0.0.0:8000
```

This will host a local proxy API at **http://0.0.0.0:8000**.

### Test
In a new shell, run:
```shell
$ litellm --test
```

Let's see if it works:
```shell
|
||||
$ curl --location 'http://0.0.0.0:8000/chat/completions' \
|
||||
--header 'Content-Type: application/json' \
|
||||
--data '{
|
||||
"messages": [
|
||||
{
|
||||
"role": "user",
|
||||
"content": "what do you know?"
|
||||
}
|
||||
],
|
||||
}'
|
||||
```
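The same request can also be sent from Python by pointing the `openai` client at the proxy, which is what the next section sets up. A minimal sketch, assuming the pre-1.0 `openai` package; `ollama/codellama` is purely an illustrative model name:

```python
import openai

openai.api_base = "http://0.0.0.0:8000"  # send requests to the local proxy instead of api.openai.com
openai.api_key = "anything"              # the proxy ignores the key, but the client requires one

response = openai.ChatCompletion.create(
    model="ollama/codellama",  # illustrative; the proxy routes to whatever model it was started with
    messages=[{"role": "user", "content": "what do you know?"}],
)
print(response)
```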

### Replace OpenAI Base
### replace openai base

```python
import openai
@@ -145,6 +132,81 @@ $ litellm --model command-nightly

[**Jump to Code**](https://github.com/BerriAI/litellm/blob/fef4146396d5d87006259e00095a62e3900d6bb4/litellm/proxy.py#L36)

## [Tutorial]: Use with Aider/AutoGen/Continue-Dev

Here's how to use the proxy to test codellama/mistral/etc. models with different GitHub repos.

```shell
pip install litellm
```

```shell
$ ollama pull codellama # our local CodeLlama

$ litellm --model ollama/codellama --temperature 0.3 --max_tokens 2048
```

Implementation for different repos:
<Tabs>
<TabItem value="aider" label="Aider">

```shell
$ pip install aider-chat

$ aider --openai-api-base http://0.0.0.0:8000 --openai-api-key fake-key
```
</TabItem>
<TabItem value="continue-dev" label="ContinueDev">

Continue-Dev brings ChatGPT to VSCode. See how to [install it here](https://continue.dev/docs/quickstart).

In [config.py](https://continue.dev/docs/reference/Models/openai), set this as your default model.
```python
default=OpenAI(
    api_key="IGNORED",
    model="fake-model-name",
    context_length=2048,
    api_base="http://your_litellm_hostname:8000"
),
```

Credit to [@vividfog](https://github.com/jmorganca/ollama/issues/305#issuecomment-1751848077) for this tutorial.
</TabItem>
<TabItem value="autogen" label="AutoGen">

```shell
pip install pyautogen
```

```python
from autogen import AssistantAgent, UserProxyAgent, oai

config_list = [
    {
        "model": "my-fake-model",
        "api_base": "http://localhost:8000/v1",  # litellm-compatible endpoint
        "api_type": "open_ai",
        "api_key": "NULL",  # just a placeholder
    }
]

response = oai.Completion.create(config_list=config_list, prompt="Hi")
print(response)  # works fine

assistant = AssistantAgent("assistant")
user_proxy = UserProxyAgent("user_proxy")
user_proxy.initiate_chat(
    assistant,
    message="Plot a chart of META and TESLA stock price change YTD.",
    config_list=config_list,
)
# fails with the error: openai.error.AuthenticationError: No API key provided.
```

Credit to [@victordibia](https://github.com/microsoft/autogen/issues/45#issuecomment-1749921972) for this tutorial.
</TabItem>
</Tabs>

:::note
**Contribute**: Using this server with a project? Contribute your tutorial here!
:::

## Configure Model

To save API keys and/or customize the model prompt, run:
@@ -207,44 +269,3 @@ This will host a ChatCompletions API at: https://api.litellm.ai/44508ad4

</TabItem>
</Tabs>

## Tutorial - using HuggingFace LLMs with aider
[Aider](https://github.com/paul-gauthier/aider) is an AI pair programming tool in your terminal.

But it only accepts OpenAI API calls.

In this tutorial we'll use Aider with WizardCoder (hosted on HF Inference Endpoints).

[NOTE]: To learn how to deploy a model on Huggingface

### Step 1: Install aider and litellm
```shell
$ pip install aider-chat litellm
```

### Step 2: Spin up local proxy
Save your Hugging Face API key in your local environment (you can also do this via a .env file).

```shell
$ export HUGGINGFACE_API_KEY=my-huggingface-api-key
```

Point your local proxy to your model endpoint:

```shell
$ litellm \
  --model huggingface/WizardLM/WizardCoder-Python-34B-V1.0 \
  --api_base https://my-endpoint.huggingface.com
```
This will host a local proxy API at **http://0.0.0.0:8000**.
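
Before switching Aider over, you can confirm the proxy responds by reusing the curl call from the Quick start above; the prompt text here is just an example:

```shell
$ curl http://0.0.0.0:8000/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{"messages": [{"role": "user", "content": "write a python function to reverse a string"}]}'
```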

### Step 3: Replace the OpenAI API base in Aider
Aider lets you set the OpenAI API base, so let's point it to our proxy instead.

```shell
$ aider --openai-api-base http://0.0.0.0:8000
```

And that's it!
@@ -1,4 +1,4 @@
import sys, os
import sys, os, platform
sys.path.insert(
    0, os.path.abspath("../..")
)  # Adds the parent directory to the system path
@@ -19,7 +19,7 @@ print()
import litellm
from fastapi import FastAPI, Request
from fastapi.routing import APIRouter
from fastapi.responses import StreamingResponse
from fastapi.responses import StreamingResponse, FileResponse
import json

app = FastAPI()
@@ -203,4 +203,19 @@ async def chat_completion(request: Request):
    print_verbose(f"response: {response}")
    return response


@router.get("/ollama_logs")
async def retrieve_server_log(request: Request):
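    # Serve the local Ollama server log (assumes Ollama's default log location).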
    filepath = os.path.expanduser('~/.ollama/logs/server.log')
    return FileResponse(filepath)

# @router.get("/ollama_logs")
# async def chat_completion(request: Request):
#     if platform.system() == "Darwin":
#         print("This is a MacOS system.")
#     elif platform.system() == "Linux":
#         print("This is a Linux system.")
#     else:
#         print("This is an unknown operating system.")

app.include_router(router)