litellm-proxy
A local, fast, and lightweight OpenAI-compatible server to call 100+ LLM APIs.
usage
$ pip install litellm
$ litellm --model ollama/codellama
#INFO: Ollama running on http://0.0.0.0:8000
replace openai base
import openai
openai.api_base = "http://0.0.0.0:8000"
print(openai.ChatCompletion.create(model="test", messages=[{"role":"user", "content":"Hey!"}]))
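If you're on a newer openai Python SDK (>=1.0), the base URL is set on a client object instead of the module. A minimal sketch, assuming the same proxy address and model name as above and that the local proxy does not validate the API key:
import openai

client = openai.OpenAI(
    base_url="http://0.0.0.0:8000",
    api_key="anything",  # placeholder; assumed to be ignored by the local proxy
)
response = client.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Hey!"}],
)
print(response)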
See how to call Huggingface, Bedrock, TogetherAI, Anthropic, etc.
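For example, you can point the proxy at another provider by passing that provider's model string (the model names below are illustrative, and each provider needs its own credentials configured):
$ litellm --model huggingface/bigcode/starcoder
$ litellm --model claude-instant-1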
configure proxy
To save API keys, change the model prompt, etc., you'll need to create a local instance of the proxy:
$ litellm --create-proxy
This will create a local project called litellm-proxy in your current directory, which contains:
- proxy_cli.py: Runs the proxy
- proxy_server.py: Contains the API calling logic
    - /chat/completions: receives openai.ChatCompletion.create calls
    - /completions: receives openai.Completion.create calls
    - /models: receives openai.Model.list() calls
- secrets.toml: Stores your API keys, model configs, etc.
Run it with:
$ cd litellm-proxy
$ python proxy_cli.py --model ollama/llama # replace with your model name
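Once the proxy is running, you can also exercise the routes listed above directly. A minimal sketch using the requests library, assuming the proxy is on the default http://0.0.0.0:8000 and serves /models and /chat/completions without a /v1 prefix:
import requests

base = "http://0.0.0.0:8000"

# /models: list the models the proxy exposes
print(requests.get(f"{base}/models").json())

# /chat/completions: send an OpenAI-style chat request
payload = {
    "model": "ollama/llama",  # replace with the model you started the proxy with
    "messages": [{"role": "user", "content": "Hey!"}],
}
print(requests.post(f"{base}/chat/completions", json=payload).json())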