Krish Dholakia
8ee32291e0
Squashed commit of the following: ( #9709 )
commit b12a9892b7
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Wed Apr 2 08:09:56 2025 -0700
fix(utils.py): don't modify openai_token_counter
commit 294de31803
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 21:22:40 2025 -0700
fix: fix linting error
commit cb6e9fbe40
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 19:52:45 2025 -0700
refactor: complete migration
commit bfc159172d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 19:09:59 2025 -0700
refactor: refactor more constants
commit 43ffb6a558
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:45:24 2025 -0700
fix: test
commit 04dbe4310c
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:28:58 2025 -0700
refactor: move more constants into constants.py
commit 3c26284aff
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:14:46 2025 -0700
refactor: migrate hardcoded constants out of __init__.py
commit c11e0de69d
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:11:21 2025 -0700
build: migrate all constants into constants.py
commit 7882bdc787
Author: Krrish Dholakia <krrishdholakia@gmail.com>
Date: Mon Mar 24 18:07:37 2025 -0700
build: initial test banning hardcoded numbers in repo
2025-04-02 21:24:54 -07:00
Krish Dholakia
9b7ebb6a7d
build(pyproject.toml): add new dev dependencies - for type checking ( #9631 )
* build(pyproject.toml): add new dev dependencies - for type checking
* build: reformat files to fit black
* ci: reformat to fit black
* ci(test-litellm.yml): make tests run clear
* build(pyproject.toml): add ruff
* fix: fix ruff checks
* build(mypy/): fix mypy linting errors
* fix(hashicorp_secret_manager.py): fix passing cert for tls auth
* build(mypy/): resolve all mypy errors
* test: update test
* fix: fix black formatting
* build(pre-commit-config.yaml): use poetry run black
* fix(proxy_server.py): fix linting error
* fix: fix ruff safe representation error
2025-03-29 11:02:13 -07:00
Ishaan Jaff
081826a5d6
(Feat) soft budget alerts on keys ( #7623 )
* class WebhookEvent(CallInfo):
Add
* handle soft budget alerts
* handle soft budget
* fix budget alerts
* fix CallInfo
* fix _get_user_info_str
* test_soft_budget_alerts
* test_soft_budget_alert
2025-01-07 21:36:34 -08:00
Ishaan Jaff
d1b101b9d7
(Fix) - Slack Alerting , don't send duplicate spend report when used on multi instance settings ( #7546 )
* fix send_weekly_spend_report
* test_spend_report_cache
2025-01-04 10:54:35 -08:00
Ishaan Jaff
1bb4941036
[Feature]: - allow print alert log to console ( #7534 )
* update send_to_webhook
* test_print_alerting_payload_warning
* add alerting_args spec
* test_alerting.py
2025-01-03 17:48:13 -08:00
Ishaan Jaff
03b1db5a7d
(Feat) - Add PagerDuty Alerting Integration ( #7478 )
* define basic types
* fix verbose_logger.exception statement
* fix basic alerting
* test pager duty alerting
* test_pagerduty_alerting_high_failure_rate
* PagerDutyAlerting
* async_log_failure_event
* use pre_call_hook
* add _request_is_completed helper util
* update AlertingConfig
* rename PagerDutyInternalEvent
* _send_alert_if_thresholds_crossed
* use pagerduty as _custom_logger_compatible_callbacks_literal
* fix slack alerting imports
* fix imports in slack alerting
* PagerDutyAlerting
* fix _load_alerting_settings
* test_pagerduty_hanging_request_alerting
* working pager duty alerting
* fix linting
* doc pager duty alerting
* update hanging_response_handler
* fix import location
* update failure_threshold
* update async_pre_call_hook
* docs pagerduty
* test - callback_class_str_to_classType
* fix linting errors
* fix linting + testing error
* PagerDutyAlerting
* test_pagerduty_hanging_request_alerting
* fix unused imports
* docs pager duty
* @pytest.mark.flaky(retries=6, delay=2)
* test_model_info_bedrock_converse_enforcement
2025-01-01 07:12:51 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports ( #7313 )
* remove unused imports
* fix AmazonConverseConfig
* fix test
* fix import
* ruff check fixes
* test fixes
* fix testing
* fix imports
2024-12-19 12:33:42 -08:00
Krish Dholakia
516c2a6a70
Litellm remove circular imports ( #7232 )
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py
* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py
* refactor: fix litellm.ModelResponse import on pass through endpoints
* refactor(litellm_logging.py): fix circular import for custom callbacks literal
* fix(factory.py): fix circular imports inside prompt factory
* fix(cost_calculator.py): fix circular import for 'litellm.Usage'
* fix(proxy_server.py): fix potential circular import with `litellm.Router`
* fix(proxy/utils.py): fix potential circular import in `litellm.Router`
* fix: remove circular imports in 'auth_checks' and 'guardrails/'
* fix(prompt_injection_detection.py): fix router import
* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through
* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports
* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import
* fix(base.py): fix potential circular import
* fix(handler.py): fix potential circular ref in codestral + cohere handlers
* fix(azure.py): fix potential circular imports
* fix(gpt_transformation.py): fix modelresponse import
* fix(litellm_logging.py): add logging base class - simplify typing
makes it easy for other files to type check the logging obj without introducing circular imports
* fix(azure_ai/embed): fix potential circular import on handler.py
* fix(databricks/): fix potential circular imports in databricks/
* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings
* fix(vertex_ai/image_gen): fix import
* fix(watsonx-+-bedrock): cleanup imports
* refactor(anthropic-pass-through-+-petals): cleanup imports
* refactor(huggingface/): cleanup imports
* fix(ollama-+-clarifai): cleanup circular imports
* fix(openai_like/): fix import
* fix(openai_like/): fix embedding handler
cleanup imports
* refactor(openai.py): cleanup imports
* fix(sagemaker/transformation.py): fix import
* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
Krish Dholakia
9160d80fa5
LiteLLM Minor Fixes & Improvements (11/12/2024) ( #6705 )
* fix(caching): convert arg to equivalent kwargs in llm caching handler
prevent unexpected errors
* fix(caching_handler.py): don't pass args to caching
* fix(caching): remove all *args from caching.py
* fix(caching): consistent function signatures + abc method
* test(caching_unit_tests.py): add unit tests for llm caching
ensures coverage for common caching scenarios across different implementations
* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one
* fix(router.py): drop redis password requirement
* fix(proxy_server.py): fix faulty slack alerting check
* fix(langfuse.py): avoid copying functions/thread lock objects in metadata
fixes metadata copy error when parent otel span in metadata
* test: update test
2024-11-12 22:50:51 +05:30
Ishaan Jaff
9545b0e5cd
(fix) slack alerting - don't spam the failed cost tracking alert for the same model ( #6543 )
* fix use failing_model as cache key for failed_tracking_alert
* fix use standard logging payload for getting response cost
* fix kwargs.get("response_cost")
* fix getting response cost
2024-11-01 18:36:17 +05:30
Krish Dholakia
4f8a3fd4cf
redis otel tracing + async support for latency routing ( #6452 )
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
2024-10-28 21:52:12 -07:00
Ishaan Jaff
610974b4fc
(code quality) add ruff check PLR0915 for too-many-statements ( #6309 )
* ruff add PLR0915
* add noqa for PLR0915
* fix noqa
* add # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* add # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
* # noqa: PLR0915
2024-10-18 15:36:49 +05:30
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache ( #6208 )
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors ( #6082 )
* fix: fix type-checking errors
* fix: fix additional type-checking errors
* fix: additional type-checking error fixes
* fix: fix additional type-checking errors
* fix: additional type-check fixes
* fix: fix all type-checking errors + add pyright to ci/cd
* fix: fix incorrect import
* ci(config.yml): use mypy on ci/cd
* fix: fix type-checking errors in utils.py
* fix: fix all type-checking errors on main.py
* fix: fix mypy linting errors
* fix(anthropic/cost_calculator.py): fix linting errors
* fix: fix mypy linting errors
* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff
1ab886f80d
(contributor PRs) oct 3rd, 2024 ( #6034 )
* Do not skip important tests for OIDC. (#6017 )
* [Bug] Skip monthly slack alert if there was no spend (#6015 )
* Fix: skip slack alert if there was no spend
* Skip monthly report when there was no spend
---------
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
---------
Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
Co-authored-by: Paz <paz@tryolabs.com>
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-03 17:12:34 +05:30
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement ( #5992 )
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Ishaan Jaff
045ecf3ffb
(feat proxy slack alerting) - allow opting in to getting key / internal user alerts ( #5990 )
* define all slack alert types
* use correct type hints for alert type
* use correct defaults on slack alerting
* add readme for slack alerting
* fix linting error
* update readme
* docs all alert types
* update slack alerting docs
* fix slack alerting docs
* handle new testing dir structure
* fix config for testing
* fix testing folder related imports
* fix /tests import errors
* fix import stream_chunk_testdata
* docs alert types
* fix test test_langfuse_trace_id
* fix type checks for slack alerting
* fix outage alerting test slack
2024-10-01 10:49:22 -07:00
Paz
8225880af0
Fix: skip slack alert if there was no spend ( #5998 )
Co-authored-by: María Paz Cuturi <paz@MacBook-Pro-de-Paz.local>
2024-10-01 08:02:16 -07:00
Krish Dholakia
d37c8b5c6b
LiteLLM Minor Fixes & Improvements (09/23/2024) ( #5842 ) ( #5858 )
* LiteLLM Minor Fixes & Improvements (09/23/2024) (#5842 )
* feat(auth_utils.py): enable admin to allow client-side credentials to be passed
Makes it easier for devs to experiment with finetuned fireworks ai models
* feat(router.py): allow setting configurable_clientside_auth_params for a model
Closes https://github.com/BerriAI/litellm/issues/5843
* build(model_prices_and_context_window.json): fix anthropic claude-3-5-sonnet max output token limit
Fixes https://github.com/BerriAI/litellm/issues/5850
* fix(azure_ai/): support content list for azure ai
Fixes https://github.com/BerriAI/litellm/issues/4237
* fix(litellm_logging.py): always set saved_cache_cost
Set to 0 by default
* fix(fireworks_ai/cost_calculator.py): add fireworks ai default pricing
handles calling 405b+ size models
* fix(slack_alerting.py): fix error alerting for failed spend tracking
Fixes regression with slack alerting error monitoring
* fix(vertex_and_google_ai_studio_gemini.py): handle gemini no candidates in streaming chunk error
* docs(bedrock.md): add llama3-1 models
* test: fix tests
* fix(azure_ai/chat): fix transformation for azure ai calls
2024-09-24 15:01:31 -07:00
Krish Dholakia
9c8fdee068
Additional Fixes (09/17/2024) ( #5759 )
* fix(auth_checks.py): check if key has all model access via wildcard routing
Fixes issue where key with `openai/*` couldn't call gpt models
* fix(slack_alerting.py): expose flag for disabling failed spend tracking alerts
2024-09-17 23:02:12 -07:00
Krish Dholakia
234185ec13
LiteLLM Minor Fixes & Improvements (09/16/2024) ( #5723 ) ( #5731 )
* LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723 )
* coverage (#5713 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* Move (#5714 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix(litellm_logging.py): fix logging client re-init (#5710 )
Fixes https://github.com/BerriAI/litellm/issues/5695
* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config
Fixes https://github.com/BerriAI/litellm/issues/5682
* feat(o1_handler.py): fake streaming for openai o1 models
Fixes https://github.com/BerriAI/litellm/issues/5694
* docs: deprecated traceloop integration in favor of native otel (#5249 )
* fix: fix linting errors
* fix: fix linting errors
* fix(main.py): fix o1 import
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730 )
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view
Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it
* fix(custom_logger.py): reset calltype
* fix: fix linting errors
* fix: fix linting error
* fix: fix import
* test(test_databricks.py): fix databricks tests
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
Ishaan Jaff
b6ae2204a8
[Feat-Proxy] Slack Alerting - allow using os.environ/ vars for alert to webhook url ( #5726 )
* allow using os.environ for slack urls
* use env vars for webhook urls
* fix types for get_secret
* fix linting
* fix linting
* fix linting
* linting fixes
* linting fix
* docs alerting slack
* fix get data
2024-09-16 18:03:37 -07:00
Krish Dholakia
60709a0753
LiteLLM Minor Fixes and Improvements (09/13/2024) ( #5689 )
* refactor: cleanup unused variables + fix pyright errors
* feat(health_check.py): Closes https://github.com/BerriAI/litellm/issues/5686
* fix(o1_reasoning.py): add stricter check for o-1 reasoning model
* refactor(mistral/): make it easier to see mistral transformation logic
* fix(openai.py): fix openai o-1 model param mapping
Fixes https://github.com/BerriAI/litellm/issues/5685
* feat(main.py): infer finetuned gemini model from base model
Fixes https://github.com/BerriAI/litellm/issues/5678
* docs(vertex.md): update docs to call finetuned gemini models
* feat(proxy_server.py): allow admin to hide proxy model aliases
Closes https://github.com/BerriAI/litellm/issues/5692
* docs(load_balancing.md): add docs on hiding alias models from proxy config
* fix(base.py): don't raise notimplemented error
* fix(user_api_key_auth.py): fix model max budget check
* fix(router.py): fix elif
* fix(user_api_key_auth.py): don't set team_id to empty str
* fix(team_endpoints.py): fix response type
* test(test_completion.py): handle predibase error
* test(test_proxy_server.py): fix test
* fix(o1_transformation.py): fix max_completion_token mapping
* test(test_image_generation.py): mark flaky test
2024-09-14 10:02:55 -07:00
Ishaan Jaff
e7c9716841
[Feat-Perf] Use Batching + Squashing ( #5645 )
* use folder for slack alerting
* clean up slack alerting
* fix test alerting
2024-09-12 18:37:53 -07:00