Krish Dholakia
2f08341a08
Litellm dev readd prompt caching ( #7299 )
...
* fix(router.py): re-add saving model id on prompt caching valid successful deployment
* fix(router.py): introduce optional pre_call_checks
isolate prompt caching logic in a separate file
* fix(prompt_caching_deployment_check.py): fix import
* fix(router.py): new 'async_filter_deployments' event hook
allows custom logger to filter deployments returned to routing strategy
* feat(prompt_caching_deployment_check.py): initial working commit of prompt caching based routing
* fix(cooldown_callbacks.py): fix linting error
* fix(budget_limiter.py): move budget logger to async_filter_deployment hook
* test: add unit test
* test(test_router_helper_utils.py): add unit testing
* fix(budget_limiter.py): fix linting errors
* docs(config_settings.md): add 'optional_pre_call_checks' to router_settings param docs
2024-12-18 15:13:49 -08:00
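The `async_filter_deployments` hook named in the commits above can be sketched roughly as follows. Only the hook name comes from the commit messages; the class name, signature, and `model_info.id` field are illustrative assumptions, not litellm's actual implementation:

```python
from typing import Any, Dict, List


class PromptCachingDeploymentCheck:
    """Hypothetical pre-call check: prefer deployments with a warm prompt cache.

    Only the hook name `async_filter_deployments` comes from the commits above;
    the signature and deployment fields are assumptions for illustration.
    """

    def __init__(self, cached_model_ids: set) -> None:
        self.cached_model_ids = cached_model_ids

    async def async_filter_deployments(
        self, healthy_deployments: List[Dict[str, Any]]
    ) -> List[Dict[str, Any]]:
        # Keep deployments whose model id previously served a cached prompt;
        # fall back to the full list so routing never ends up with zero options.
        preferred = [
            d
            for d in healthy_deployments
            if d.get("model_info", {}).get("id") in self.cached_model_ids
        ]
        return preferred or healthy_deployments
```

This mirrors the "filter deployments returned to routing strategy" idea: the routing strategy still makes the final pick, but only from the filtered list.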
Krish Dholakia
4f8a3fd4cf
redis otel tracing + async support for latency routing ( #6452 )
...
* docs(exception_mapping.md): add missing exception types
Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183
* fix(main.py): register custom model pricing with specific key
Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls
ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router
allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router
catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
2024-10-28 21:52:12 -07:00
Krish Dholakia
905ebeb924
feat(custom_logger.py): expose new async_dataset_hook for modifying… ( #6331 )
...
* feat(custom_logger.py): expose new `async_dataset_hook` for modifying/rejecting argilla items before logging
Allows user more control on what gets logged to argilla for annotations
* feat(google_ai_studio_endpoints.py): add new `/azure/*` pass through route
enables pass-through for azure provider
* feat(utils.py): support checking ollama `/api/show` endpoint for retrieving ollama model info
Fixes https://github.com/BerriAI/litellm/issues/6322
* fix(user_api_key_auth.py): add `/key/delete` to allowed_ui_routes
Fixes https://github.com/BerriAI/litellm/issues/6236
* fix(user_api_key_auth.py): remove type ignore
* fix(user_api_key_auth.py): route ui vs. api token checks differently
Fixes https://github.com/BerriAI/litellm/issues/6238
* feat(internal_user_endpoints.py): support setting models as a default internal user param
Closes https://github.com/BerriAI/litellm/issues/6239
* fix(user_api_key_auth.py): fix exception string
* fix(user_api_key_auth.py): fix error string
* fix: fix test
2024-10-20 09:00:04 -07:00
Ishaan Jaff
4d1b4beb3d
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache ( #6208 )
...
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Krish Dholakia
fac3b2ee42
Add pyright to ci/cd + Fix remaining type-checking errors ( #6082 )
...
* fix: fix type-checking errors
* fix: fix additional type-checking errors
* fix: additional type-checking error fixes
* fix: fix additional type-checking errors
* fix: additional type-check fixes
* fix: fix all type-checking errors + add pyright to ci/cd
* fix: fix incorrect import
* ci(config.yml): use mypy on ci/cd
* fix: fix type-checking errors in utils.py
* fix: fix all type-checking errors on main.py
* fix: fix mypy linting errors
* fix(anthropic/cost_calculator.py): fix linting errors
* fix: fix mypy linting errors
* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement ( #5992 )
...
* ci(config.yml): add a 'check_code_quality' step
Addresses https://github.com/BerriAI/litellm/issues/5991
* ci(config.yml): check why circle ci doesn't pick up this test
* ci(config.yml): fix to run 'check_code_quality' tests
* fix(__init__.py): fix unprotected import
* fix(__init__.py): don't remove unused imports
* build(ruff.toml): update ruff.toml to ignore unused imports
* fix: ruff + pyright - fix linting + type-checking errors
* fix: fix linting errors
* fix(lago.py): fix module init error
* fix: fix linting errors
* ci(config.yml): cd into correct dir for checks
* fix(proxy_server.py): fix linting error
* fix(utils.py): fix bare except
causes ruff linting errors
* fix: ruff - fix remaining linting errors
* fix(clickhouse.py): use standard logging object
* fix(__init__.py): fix unprotected import
* fix: ruff - fix linting errors
* fix: fix linting errors
* ci(config.yml): cleanup code qa step (formatting handled in local_testing)
* fix(_health_endpoints.py): fix ruff linting errors
* ci(config.yml): just use ruff in check_code_quality pipeline for now
* build(custom_guardrail.py): include missing file
* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Ishaan Jaff
49ec40b1cb
(feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus ( #5968 )
...
* track api key and team in prom latency metric
* add test for latency metric
* test prometheus success metrics for latency
* track team and key labels for deployment failures
* add test for litellm_deployment_failure_responses_total
* fix checks for premium user on prometheus
* log_success_fallback_event and log_failure_fallback_event
* log original_exception in log_success_fallback_event
* track key, team and exception status and class on fallback metrics
* use get_standard_logging_metadata
* fix import error
* track litellm_deployment_successful_fallbacks
* add test test_proxy_fallback_metrics
* add log log_success_fallback_event
* fix test prometheus
2024-09-28 19:00:21 -07:00
Ishaan Jaff
91e58d9049
[Feat] Add proxy level prometheus metrics ( #5789 )
...
* add Proxy Level Tracking Metrics doc
* update service logger
* prometheus - track litellm_proxy_failed_requests_metric
* use REQUESTED_MODEL
* fix prom request_data
2024-09-19 17:13:07 -07:00
Ishaan Jaff
911230c434
[Feat-Proxy-DataDog] Log Redis, Postgres Failure events on DataDog ( #5750 )
...
* dd - start tracking redis status on dd
* add async_service_success_hook / failure hook in custom logger
* add async_service_failure_hook
* log service failures on dd
* fix import error
* add test for redis errors / warning
2024-09-17 20:24:06 -07:00
Ishaan Jaff
b6ae2204a8
[Feat-Proxy] Slack Alerting - allow using os.environ/ vars for alert to webhook url ( #5726 )
...
* allow using os.environ for slack urls
* use env vars for webhook urls
* fix types for get_secret
* fix linting
* fix linting
* fix linting
* linting fixes
* linting fix
* docs alerting slack
* fix get data
2024-09-16 18:03:37 -07:00
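The `os.environ/` convention mentioned above works roughly like this: a config value prefixed with `os.environ/` is resolved to the named environment variable at runtime. `resolve_secret` is a hypothetical helper name for illustration, not litellm's actual function:

```python
import os


def resolve_secret(value: str) -> str:
    """Resolve 'os.environ/VAR_NAME' references to the variable's value.

    Hypothetical helper illustrating the `os.environ/` convention from the
    commit above; plain strings pass through unchanged.
    """
    prefix = "os.environ/"
    if value.startswith(prefix):
        return os.environ[value[len(prefix):]]
    return value
```

This keeps secrets such as webhook URLs out of the config file itself while still letting the alerting code receive a concrete URL.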
Ishaan Jaff
715387c3c0
add message_logging on Custom Logger
2024-09-09 15:59:42 -07:00
Krish Dholakia
e0d81434ed
LiteLLM minor fixes + improvements (31/08/2024) ( #5464 )
...
* fix(vertex_endpoints.py): fix vertex ai pass through endpoints
* test(test_streaming.py): skip model due to end of life
* feat(custom_logger.py): add special callback for model hitting tpm/rpm limits
Closes https://github.com/BerriAI/litellm/issues/4096
2024-09-01 13:31:42 -07:00
Ishaan Jaff
fb5be57bb8
v0 add rerank on litellm proxy
2024-08-27 17:28:39 -07:00
Ishaan Jaff
4685b9909a
feat - allow accessing data post success call
2024-08-19 11:35:33 -07:00
Ishaan Jaff
dc0559226a
v0 add helper for logging success/fail fallback events
2024-08-10 13:26:39 -07:00
Krrish Dholakia
ac6c39c283
feat(anthropic_adapter.py): support streaming requests for /v1/messages endpoint
...
Fixes https://github.com/BerriAI/litellm/issues/5011
2024-08-03 20:16:19 -07:00
Krrish Dholakia
0cc273d77b
feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
...
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krish Dholakia
d72bcdbce3
Merge pull request #4669 from BerriAI/litellm_logging_only_masking
...
Flag for PII masking on Logging only
2024-07-11 22:03:37 -07:00
Krrish Dholakia
9d918d2ac7
fix(presidio_pii_masking.py): support logging_only pii masking
2024-07-11 18:04:12 -07:00
Krrish Dholakia
9deb9b4e3f
feat(guardrails): Flag for PII Masking on Logging
...
Fixes https://github.com/BerriAI/litellm/issues/4580
2024-07-11 16:09:34 -07:00
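The logging-only PII masking flag above implies masking a copy of the request for log sinks while the provider still receives the original content. A minimal sketch of that idea; the function name and email regex are illustrative assumptions, not the presidio-based implementation:

```python
import re
from typing import Dict, List

# Simple illustrative pattern; real PII masking (presidio) covers far more types.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def mask_messages_for_logging(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return a masked copy for logging; the originals are left untouched,
    so the actual LLM call still sees the raw content."""
    return [
        {**m, "content": EMAIL_RE.sub("<EMAIL>", m.get("content", ""))}
        for m in messages
    ]
```

The key design point is that masking happens on a copy consumed only by the logging path, which is what distinguishes "logging only" from masking the request itself.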
Krrish Dholakia
2f8dbbeb97
feat(proxy_server.py): working /v1/messages endpoint
...
Works with claude engineer
2024-07-10 18:15:38 -07:00
Krrish Dholakia
5d6e172d5c
feat(anthropic_adapter.py): support for translating anthropic params to openai format
2024-07-10 00:32:28 -07:00
Krrish Dholakia
d98e00d1e0
fix(router.py): set cooldown_time per model
2024-06-25 16:51:55 -07:00
Nejc Habjan
2ecd614a73
fix: add more type hints to init methods
2024-06-18 12:09:39 +02:00
Krrish Dholakia
6cca5612d2
refactor: replace 'traceback.print_exc()' with logging library
...
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
f11f207ae6
feat(proxy_server.py): refactor returning rejected message, to work with error logging
...
log the rejected request as a failed call to langfuse/slack alerting
2024-05-20 11:14:36 -07:00
Krrish Dholakia
372323c38a
feat(proxy_server.py): allow admin to return rejected response as string to user
...
Closes https://github.com/BerriAI/litellm/issues/3671
2024-05-20 10:30:23 -07:00
Krrish Dholakia
4a3b084961
feat(bedrock_httpx.py): moves to using httpx client for bedrock cohere calls
2024-05-11 13:43:08 -07:00
Krrish Dholakia
6575143460
feat(proxy_server.py): return litellm version in response headers
2024-05-08 16:00:08 -07:00
Krrish Dholakia
81573b2dd9
fix(test_lowest_tpm_rpm_routing_v2.py): unit testing for usage-based-routing-v2
2024-04-18 21:38:00 -07:00
Krrish Dholakia
e10eb8f6fe
feat(llm_guard.py): enable key-specific llm guard check
2024-03-26 17:21:51 -07:00
Krrish Dholakia
d91f9a9f50
feat(proxy_server.py): enable llm api based prompt injection checks
...
run user calls through an llm api to check for prompt injection attacks. This happens in parallel to the actual llm call using `async_moderation_hook`
2024-03-20 22:43:42 -07:00
Krrish Dholakia
78d87a4fbd
fix: clean up print verbose statements
2024-03-05 15:01:03 -08:00
Krrish Dholakia
49847347d0
fix(llm_guard.py): add streaming hook for moderation calls
2024-02-20 20:31:32 -08:00
Krrish Dholakia
2a4a6995ac
feat(llama_guard.py): add llama guard support for content moderation + new async_moderation_hook endpoint
2024-02-16 18:45:25 -08:00
Krrish Dholakia
59981a5a03
fix: fix merge issues
2024-02-13 23:04:12 -08:00
Krish Dholakia
f5c989cb83
Merge branch 'main' into litellm_fix_pii_output_parsing
2024-02-13 22:36:17 -08:00
Krrish Dholakia
f68b656040
feat(presidio_pii_masking.py): enable output parsing for pii masking
2024-02-13 21:36:57 -08:00
Krrish Dholakia
7600c8f41d
feat(utils.py): enable post call rules for streaming
2024-02-12 22:08:04 -08:00
Krrish Dholakia
4905929de3
refactor: add black formatting
2023-12-25 14:11:20 +05:30
Krrish Dholakia
0f14fb3797
docs(custom_callback.md): add async failure + streaming logging events to docs
...
https://github.com/BerriAI/litellm/issues/1125
2023-12-14 10:46:53 -08:00
Krrish Dholakia
effdddc1c8
fix(custom_logger.py): enable pre_call hooks to modify incoming data to proxy
2023-12-13 16:20:37 -08:00
Krrish Dholakia
dc148c37b0
refactor(custom_logger.py): add async log stream event function
2023-12-12 00:16:48 -08:00
Krrish Dholakia
4bf875d3ed
fix(router.py): fix least-busy routing
2023-12-08 20:29:49 -08:00
ishaan-jaff
b482b9002c
(feat) Custom_logger add async success & async failure
2023-12-06 17:16:24 -08:00
ishaan-jaff
b3f039627e
(feat) litellm - add _async_failure_callback
2023-12-06 14:43:47 -08:00
Krrish Dholakia
e0ccb281d8
feat(utils.py): add async success callbacks for custom functions
2023-12-04 16:42:40 -08:00
Krrish Dholakia
3e76d4b422
feat(router.py): add server cooldown logic
2023-11-22 15:59:48 -08:00
Krrish Dholakia
c3916a7754
feat(utils.py): adding additional states for custom logging
2023-11-04 17:07:20 -07:00
ishaan-jaff
16f598aa3f
(fix) allow using more than 1 custom callback
2023-10-19 09:11:58 -07:00