# litellm-proxy

A local, fast, and lightweight OpenAI-compatible server to call 100+ LLM APIs.

## usage

```shell
$ pip install litellm

$ litellm --model ollama/codellama

#INFO: Ollama running on http://0.0.0.0:8000
```
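
The server speaks the OpenAI chat-completions wire format, so any HTTP client can call it directly. Below is a minimal sketch using `requests`; the `/chat/completions` route and the placeholder model name `test` are assumptions based on the proxy's OpenAI-compatible interface (mirroring the client example in the next section).

```python
import requests

# POST an OpenAI-format chat request straight to the local proxy
resp = requests.post(
    "http://0.0.0.0:8000/chat/completions",
    json={
        "model": "test",  # placeholder; the proxy serves the model passed to --model
        "messages": [{"role": "user", "content": "Hey!"}],
    },
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```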

## replace openai base

```python
import openai

# openai >= 1.0: point a client at the local proxy instead of api.openai.com;
# any placeholder api_key works when the proxy has no auth configured
client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

print(client.chat.completions.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))
```
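
Streaming goes through the same OpenAI client unchanged; here is a minimal sketch, reusing the placeholder `api_key` and model name from above.

```python
import openai

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# stream=True yields OpenAI-format chunks as the proxy relays tokens
stream = client.chat.completions.create(
    model="test",
    messages=[{"role": "user", "content": "Hey!"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```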

See the docs for how to call Huggingface, Bedrock, TogetherAI, Anthropic, and more.