Commit graph

170 commits

Author SHA1 Message Date
Krrish Dholakia
665fdfc788 feat(prisma_client.py): initial commit add prisma migration support to proxy 2025-03-19 14:26:59 -07:00
Ishaan Jaff
d963568970
(Bug fix) - running litellm proxy on windows (#8735)
* fix running litellm on windows

* fix importing litellm

* _init_hypercorn_server

* linting fix

* TestProxyInitializationHelpers

* ci/cd run again

* ci/cd run again
2025-02-25 15:19:19 -08:00
Ishaan Jaff
55b938dd6e
(Infra/DB) - Allow running older litellm version when out of sync with current state of DB (#8695)
* fix check migration

* clean up should_update_prisma_schema

* update test

* db_migration_disable_update_check

* Check container logs for expected message

* db_migration_disable_update_check

* test_check_migration_out_of_sync

* test_should_update_prisma_schema

* db_migration_disable_update_check

* pip install aiohttp
2025-02-20 18:30:23 -08:00
Ishaan Jaff
9ac18caf24
uvicorn allow setting num workers (#7681) 2025-01-10 19:03:14 -08:00
Ishaan Jaff
2507c275f6
(proxy perf improvement) - use uvloop for higher RPS (10%-20% higher RPS) (#7662)
* uvicorn use uvloop

* fix uvloop==0.21.0

* add uvloop to pyproject

* test_completion_response_ratelimit_headers
2025-01-09 18:11:20 -08:00
Ishaan Jaff
cf60444916
(Feat) Add support for reading secrets from Hashicorp vault (#7497)
* HashicorpSecretManager

* test_hashicorp_secret_managerv

* use 1 helper initialize_secret_manager

* add HASHICORP_VAULT

* working config

* hcorp read_secret

* HashicorpSecretManager

* add secret_manager_testing

* use 1 folder for secret manager testing

* test_hashicorp_secret_manager_get_secret

* HashicorpSecretManager

* docs HCP secrets

* update folder name

* docs hcorp secret manager

* remove unused imports

* add conftest.py

* fix tests

* docs document env vars
2025-01-01 18:35:05 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
019ddc32d6
Litellm dev 12 17 2024 p2 (#7277)
* fix(openai/transcription/handler.py): call 'log_pre_api_call' on async calls

* fix(openai/transcriptions/handler.py): call 'logging.pre_call' on sync whisper calls as well

* fix(proxy_cli.py): remove default proxy_cli timeout param

gets passed in as a dynamic request timeout and overrides config values

* fix(langfuse.py): pass litellm httpx client - contains ssl certs (#7052)

Fixes https://github.com/BerriAI/litellm/issues/7046
2024-12-17 14:05:14 -08:00
Krish Dholakia
b82add11ba
LITELLM: Remove requests library usage (#7235)
* fix(generic_api_callback.py): remove requests lib usage

* fix(budget_manager.py): remove requests lib usage

* fix(main.py): cleanup requests lib usage

* fix(utils.py): remove requests lib usage

* fix(argilla.py): fix argilla test

* fix(athina.py): replace 'requests' lib usage with litellm module

* fix(greenscale.py): replace 'requests' lib usage with httpx

* fix: remove unused 'requests' lib import + replace usage in some places

* fix(prompt_layer.py): remove 'requests' lib usage from prompt layer

* fix(ollama_chat.py): remove 'requests' lib usage

* fix(baseten.py): replace 'requests' lib usage

* fix(codestral/): replace 'requests' lib usage

* fix(predibase/): replace 'requests' lib usage

* refactor: cleanup unused 'requests' lib imports

* fix(oobabooga.py): cleanup 'requests' lib usage

* fix(invoke_handler.py): remove unused 'requests' lib usage

* refactor: cleanup unused 'requests' lib import

* fix: fix linting errors

* refactor(ollama/): move ollama to using base llm http handler

removes 'requests' lib dep for ollama integration

* fix(ollama_chat.py): fix linting errors

* fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama
2024-12-17 12:50:04 -08:00
Ishaan Jaff
f8e700064e
(Feat) Add support for storing virtual keys in AWS SecretManager (#6728)
* add SecretManager to httpxSpecialProvider

* fix importing AWSSecretsManagerV2

* add unit testing for writing keys to AWS secret manager

* use KeyManagementEventHooks for key/generated events

* use event hooks for key management endpoints

* working AWSSecretsManagerV2

* fix write secret to AWS secret manager on /key/generate

* fix KeyManagementSettings

* use tasks for key management hooks

* add async_delete_secret

* add test for async_delete_secret

* use _delete_virtual_keys_from_secret_manager

* fix test secret manager

* test_key_generate_with_secret_manager_call

* fix check for key_management_settings

* sync_read_secret

* test_aws_secret_manager

* fix sync_read_secret

* use helper to check when _should_read_secret_from_secret_manager

* test_get_secret_with_access_mode

* test - handle eol model claude-2, use claude-2.1 instead

* docs AWS secret manager

* fix test_read_nonexistent_secret

* fix test_supports_response_schema

* ci/cd run again
2024-11-14 09:25:07 -08:00
Krish Dholakia
0c204d33bc
LiteLLM Minor Fixes & Improvements (11/06/2024) (#6624)
* refactor(proxy_server.py): add debug logging around license check event (refactor position in startup_event logic)

* fix(proxy/_types.py): allow admin_allowed_routes to be any str

* fix(router.py): raise 400-status code error for no 'model_name' error on router

Fixes issue with status code when unknown model name passed with pattern matching enabled

* fix(converse_handler.py): add claude 3-5 haiku to bedrock converse models

* test: update testing to replace claude-instant-1.2

* fix(router.py): fix router.moderation calls

* test: update test to remove claude-instant-1

* fix(router.py): support model_list values in router.moderation

* test: fix test

* test: fix test
2024-11-07 04:37:32 +05:30
Ishaan Jaff
807e9dcea8
(docs + testing) Correctly document that the timeout value used by litellm proxy is 6000 seconds + add to best practices for prod (#6339)
* fix docs use documented timeout

* document request timeout

* add test for litellm.request_timeout

* add test for checking value of timeout
2024-10-23 14:09:35 +05:30
Krish Dholakia
2b9db05e08
feat(proxy_cli.py): add new 'log_config' cli param (#6352)
* feat(proxy_cli.py): add new 'log_config' cli param

Allows passing logging.conf to uvicorn on startup

* docs(cli.md): add logging conf to uvicorn cli docs

* fix(get_llm_provider_logic.py): fix default api base for litellm_proxy

Fixes https://github.com/BerriAI/litellm/issues/6332

* feat(openai_like/embedding): Add support for jina ai embeddings

Closes https://github.com/BerriAI/litellm/issues/6337

* docs(deploy.md): update entrypoint.sh filepath post-refactor

Fixes outdated docs

* feat(prometheus.py): emit time_to_first_token metric on prometheus

Closes https://github.com/BerriAI/litellm/issues/6334

* fix(prometheus.py): only emit time to first token metric if stream is True

enables more accurate ttft usage

* test: handle vertex api instability

* fix(get_llm_provider_logic.py): fix import

* fix(openai.py): fix deepinfra default api base

* fix(anthropic/transformation.py): remove anthropic beta header (#6361)
2024-10-21 21:25:58 -07:00
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements (#6309)
* ruff add PLR0915

* add noqa for PLR0915

* fix noqa

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* add # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915

* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
Krish Dholakia
04e5963b65
Litellm expose disable schema update flag (#6085)
* fix: enable new 'disable_prisma_schema_update' flag

* build(config.yml): remove setup remote docker step

* ci(config.yml): give container time to start up

* ci(config.yml): update test

* build(config.yml): actually start docker

* build(config.yml): simplify grep check

* fix(prisma_client.py): support reading disable_schema_update via env vars

* ci(config.yml): add test to check if all general settings are documented

* build(test_General_settings.py): check available dir

* ci: check ../ repo path

* build: check ./

* build: fix test
2024-10-05 21:26:51 -04:00
Krish Dholakia
14165d3648
LiteLLM Minor Fixes & Improvements (10/02/2024) (#6023)
* feat(together_ai/completion): handle together ai completion calls

* fix: handle list of int / list of list of int for text completion calls

* fix(utils.py): check if base model in bedrock converse model list

Fixes https://github.com/BerriAI/litellm/issues/6003

* test(test_optional_params.py): add unit tests for bedrock optional param mapping

Fixes https://github.com/BerriAI/litellm/issues/6003

* feat(utils.py): enable passing dummy tool call for anthropic/bedrock calls if tool_use blocks exist

Fixes https://github.com/BerriAI/litellm/issues/5388

* fixed an issue with tool use of claude models with anthropic and bedrock (#6013)

* fix(utils.py): handle empty schema for anthropic/bedrock

Fixes https://github.com/BerriAI/litellm/issues/6012

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix(proxy_cli.py): fix import route for app + health checks path (#6026)

* (testing): Enable testing us.anthropic.claude-3-haiku-20240307-v1:0. (#6018)

* fix(proxy_cli.py): fix import route for app + health checks gettsburg.wav

Fixes https://github.com/BerriAI/litellm/issues/5999

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>

---------

Co-authored-by: Ved Patwardhan <54766411+vedpatwardhan@users.noreply.github.com>
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-10-02 22:00:28 -04:00
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Ishaan Jaff
0d0f46a826
[Feat Proxy] Allow using hypercorn for http v2 (#5950)
* use run_hypercorn

* add docs on using hypercorn
2024-09-28 15:03:50 -07:00
Krish Dholakia
a1d9e96b31
LiteLLM Minor Fixes & Improvements (09/25/2024) (#5893)
* fix(langfuse.py): support new langfuse prompt_chat class init params

* fix(langfuse.py): handle new init values on prompt chat + prompt text templates

fixes error caused during langfuse logging

* docs(openai_compatible.md): clarify `openai/` handles correct routing for `/v1/completions` route

Fixes https://github.com/BerriAI/litellm/issues/5876

* fix(utils.py): handle unmapped gemini model optional param translation

Fixes https://github.com/BerriAI/litellm/issues/5888

* fix(o1_transformation.py): fix o-1 validation, to not raise error if temperature=1

Fixes https://github.com/BerriAI/litellm/issues/5884

* fix(prisma_client.py): refresh iam token

Fixes https://github.com/BerriAI/litellm/issues/5896

* fix: pass drop params where required

* fix(utils.py): pass drop_params correctly

* fix(types/vertex_ai.py): fix generation config

* test(test_max_completion_tokens.py): fix test

* fix(vertex_and_google_ai_studio_gemini.py): fix map openai params
2024-09-26 16:41:44 -07:00
steffen-sbt
de9a39e7c6
Add the option to specify a schema in the postgres DB, also modify docs (#5640) 2024-09-11 14:53:52 -07:00
Ishaan Jaff
3c898e23ea refactor secret managers 2024-09-03 10:58:02 -07:00
Ishaan Jaff
b0178a85cf refactor get_secret 2024-09-03 10:42:12 -07:00
Krrish Dholakia
8625663458 feat(proxy_server.py): support azure batch api endpoints 2024-08-22 15:21:43 -07:00
Krrish Dholakia
8ce8680a9a fix(proxy_cli.py): support database_host, database_username, database_password, database_name 2024-08-19 16:17:45 -07:00
Krish Dholakia
036a6821d5
Merge pull request #5057 from BerriAI/litellm_rds_iam_auth
feat(proxy_cli.py): support iam-based auth to rds
2024-08-06 10:44:33 -07:00
Krrish Dholakia
1cc7c7fc59 feat(proxy_cli.py): support iam-based auth to rds
Initial pr for iam-based auth support for rds
2024-08-05 17:35:48 -07:00
Krrish Dholakia
936640948d fix: bump default allowed_fails + reduce default db pool limit
Fixes issues with running proxy server in production
2024-08-05 15:07:46 -07:00
Krrish Dholakia
fe62e4e1c4 fix(proxy_cli.py): bump default azure api version 2024-07-08 16:28:22 -07:00
Krrish Dholakia
b84d335624 fix(proxy_cli.py): run aws kms decrypt before starting proxy server 2024-06-28 16:03:56 -07:00
Ishaan Jaff
aa3c14fa46 make sure linting runs proxy_cli.py 2024-06-20 20:20:08 -07:00
Chris Van Pelt
306c2b425d
Update proxy_cli.py
Fixed indentation so we don't get an `UnboundLocalError`. Fixes #4324
2024-06-20 17:48:16 -07:00
Krrish Dholakia
248ee488f0 fix(proxy_cli.py): fix double counting json logs 2024-06-20 15:15:23 -07:00
Krrish Dholakia
e4dbb9b2db fix(proxy_cli.py): support passing the database url as an encrypted kms key 2024-06-10 15:48:27 -07:00
Krrish Dholakia
0d3e52373c fix(proxy/_logging.py): fix default logging level 2024-06-05 17:42:49 -07:00
Krrish Dholakia
3167bee25a fix(proxy_cli.py): enable json logging via litellm_settings param on config
allows user to enable json logs without needing to figure out env variables
2024-05-29 21:41:20 -07:00
Krrish Dholakia
058bfb101d feat(proxy_cli.py): support json logs on proxy
allow user to enable 'json logs' for proxy server
2024-05-20 09:18:12 -07:00
Krrish Dholakia
9eee2f3889 docs(prod.md): add 'disable load_dotenv' tutorial to docs 2024-05-14 19:13:22 -07:00
Krrish Dholakia
1ab4974773 fix: disable 'load_dotenv' for prod environments 2024-05-14 19:09:36 -07:00
Krrish Dholakia
6575143460 feat(proxy_server.py): return litellm version in response headers 2024-05-08 16:00:08 -07:00
Krrish Dholakia
b2741933dc fix(proxy_cli.py): don't double load the router config
was causing callbacks to be instantiated twice - double counting usage in cache
2024-04-10 13:23:56 -07:00
Krrish Dholakia
6d32323e3d fix(proxy_cli.py): revert db timeout change - user-controllable param
db timeout is a user controllable param, not necessary to change defaults
2024-04-03 09:37:57 -07:00
Krrish Dholakia
f07500c5ea fix(proxy_server.py): bump default db timeouts 2024-04-03 09:35:08 -07:00
Krrish Dholakia
1e856443e1 feat(proxy/utils.py): enable updating db in a separate server 2024-03-27 16:02:36 -07:00
Krrish Dholakia
b204f0c01c fix(proxy_cli.py): fix circular import issue 2024-03-26 21:16:41 -07:00
Krrish Dholakia
6d418a2920 fix(llm_guard.py): working llm-guard 'key-specific' mode 2024-03-26 17:47:20 -07:00
Krish Dholakia
9d7aceb06e
Merge pull request #2697 from antoniomdk/fix-database-credentials-leakage
(fix) Remove print statements from append_query_params
2024-03-26 16:07:33 -07:00
Ishaan Jaff
6b4b05b58f (fix) remove litellm.telemetry 2024-03-26 11:21:09 -07:00
Antonio Molner Domenech
c713648db1 Update print statements to use verbose logger and DEBUG level 2024-03-26 22:41:28 +07:00
Ishaan Jaff
3ad6e5ffc1 (feat) start proxy with default num_workers=1 2024-03-20 10:46:32 -07:00
ishaan-jaff
ea6f42216c (docs) use port 4000 2024-03-08 21:59:00 -08:00