litellm-proxy

A local, fast, and lightweight OpenAI-compatible server to call 100+ LLM APIs.

usage

$ pip install litellm
$ litellm --model ollama/codellama

#INFO: Ollama running on http://0.0.0.0:8000
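Before swapping the base URL in your code, you can confirm the server is up with a plain HTTP request. A minimal sanity check, assuming the default port 8000 and the requests library (the model name is just a placeholder, as in the snippet below):

import requests

# Hit the proxy's OpenAI-compatible /chat/completions route directly.
response = requests.post(
    "http://0.0.0.0:8000/chat/completions",
    json={"model": "test", "messages": [{"role": "user", "content": "Hey!"}]},
)
print(response.json())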

replace openai base

import openai

# Point the openai client at the local proxy instead of api.openai.com.
openai.api_base = "http://0.0.0.0:8000"
openai.api_key = "anything"  # the client requires a key to be set; the local proxy doesn't validate it

print(openai.ChatCompletion.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))

See how to call Huggingface, Bedrock, TogetherAI, Anthropic, etc.
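Switching providers is just a different --model argument when starting the proxy, e.g. (model names below are illustrative):

$ litellm --model huggingface/bigcode/starcoder
$ litellm --model claude-instant-1
$ litellm --model together_ai/togethercomputer/llama-2-70b-chat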

configure proxy

To save API keys, change the model prompt, etc., you'll need to create a local instance of the proxy:

$ litellm --create-proxy

This will create a local project called litellm-proxy in your current directory, which contains:

  • proxy_cli.py: Runs the proxy
  • proxy_server.py: Contains the API calling logic (the routes below are exercised in the sketch at the end of this section)
    • /chat/completions: receives openai.ChatCompletion.create calls
    • /completions: receives openai.Completion.create calls
    • /models: receives openai.Model.list() calls
  • secrets.toml: Stores your API keys, model configs, etc.

Run it by doing:

$ cd litellm-proxy
$ python proxy_cli.py --model ollama/llama # replace with your model name
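Once the local proxy is running, the three routes above can be exercised with the same openai client as before. A minimal sketch, assuming the proxy is on the default port 8000 and treating the model name as a placeholder:

import openai

openai.api_base = "http://0.0.0.0:8000"  # point the client at the local proxy
openai.api_key = "anything"              # placeholder; the local proxy doesn't validate it

# /chat/completions
print(openai.ChatCompletion.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))

# /completions
print(openai.Completion.create(model="test", prompt="Hey!"))

# /models
print(openai.Model.list())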