# litellm-proxy

A local, fast, and lightweight OpenAI-compatible server to call 100+ LLM APIs.
## usage

```shell
$ pip install litellm
$ litellm --model ollama/codellama

#INFO: Ollama running on http://0.0.0.0:8000
```
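Once the proxy reports that it is running, a quick liveness check confirms it is reachable. A minimal sketch, assuming the default host/port shown above and that the liveliness route needs no auth in your config:

```python
import requests  # any HTTP client works; requests is used here for illustration

# the /health/liveliness route is served by the proxy's health endpoints
resp = requests.get("http://0.0.0.0:8000/health/liveliness")
print(resp.status_code, resp.text)  # expect HTTP 200 once the proxy is up
```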
## replace openai base

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")  # set proxy to base_url

# request sent to model set on litellm proxy, `litellm --model`
response = client.chat.completions.create(model="gpt-3.5-turbo", messages=[
    {
        "role": "user",
        "content": "this is a test request, write a short poem"
    }
])

print(response)
```
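The same client can also stream the response. A hedged sketch, reusing the illustrative `base_url`, api key, and model name from the example above:

```python
import openai  # openai v1.0.0+

client = openai.OpenAI(api_key="anything", base_url="http://0.0.0.0:8000")

# stream=True asks the proxy for chunked responses, as with the OpenAI API
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "this is a test request, write a short poem"}],
    stream=True,
)
for chunk in stream:
    # each chunk carries a content delta; it can be None on the final chunk
    print(chunk.choices[0].delta.content or "", end="")
print()
```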
See how to call Huggingface, Bedrock, TogetherAI, Anthropic, etc.
## Folder Structure

**Routes**

- `proxy_server.py` - all openai-compatible routes - `/v1/chat/completion`, `/v1/embedding` + model info routes - `/v1/models`, `/v1/model/info`, `/v1/model_group_info` routes.
- `health_endpoints/` - `/health`, `/health/liveliness`, `/health/readiness`
- `management_endpoints/key_management_endpoints.py` - all `/key/*` routes
- `management_endpoints/team_endpoints.py` - all `/team/*` routes
- `management_endpoints/internal_user_endpoints.py` - all `/user/*` routes
- `management_endpoints/ui_sso.py` - all `/sso/*` routes
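A sketch of calling a few of the routes listed above, assuming the proxy is running locally and was started with an illustrative master key of `sk-1234` (substitute your own key and auth setup):

```python
import requests

BASE_URL = "http://0.0.0.0:8000"
HEADERS = {"Authorization": "Bearer sk-1234"}  # illustrative master key

# model info routes (proxy_server.py)
print(requests.get(f"{BASE_URL}/v1/models", headers=HEADERS).json())

# health routes (health_endpoints/)
print(requests.get(f"{BASE_URL}/health/readiness", headers=HEADERS).json())

# key management routes (management_endpoints/key_management_endpoints.py)
new_key = requests.post(
    f"{BASE_URL}/key/generate",
    headers=HEADERS,
    json={"duration": "20m"},  # illustrative payload; see the key management docs for full options
)
print(new_key.json())
```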