litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-26 03:04:13 +00:00

Author	SHA1	Message	Date
Krrish Dholakia	fde434be66	feat(proxy_server.py): return 'retry-after' param for rate limited requests Closes https://github.com/BerriAI/litellm/issues/4695	2024-07-13 17:15:20 -07:00
Krrish Dholakia	7e769f3b89	fix: fix linting errors	2024-07-13 14:39:42 -07:00
Krrish Dholakia	0cc273d77b	feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints Closes https://github.com/BerriAI/litellm/issues/4698	2024-07-13 13:29:44 -07:00
Krrish Dholakia	9d918d2ac7	fix(presidio_pii_masking.py): support logging_only pii masking	2024-07-11 18:04:12 -07:00
Krrish Dholakia	1193ee8803	fix(presidio_pii_masking.py): fix presidio unset url check + add same check for langfuse	2024-07-06 17:50:55 -07:00
Krrish Dholakia	d57d3df1d6	fix(presidio_pii_masking.py): add support for setting 'http://' if unset by render env for presidio base url	2024-07-06 17:42:10 -07:00
Krrish Dholakia	196b94455e	fix(dynamic_rate_limiter.py): add rpm allocation, priority + quota reservation to docs	2024-07-01 23:35:42 -07:00
Krrish Dholakia	6b529d4e0e	fix(dynamic_rate_limiter.py): support setting priority + reserving tpm/rpm	2024-07-01 23:08:54 -07:00
Krrish Dholakia	0781014706	test(test_dynamic_rate_limit_handler.py): refactor tests for rpm support	2024-07-01 20:16:10 -07:00
Krrish Dholakia	f23b17091d	fix(dynamic_rate_limiter.py): support dynamic rate limiting on rpm	2024-07-01 17:45:10 -07:00
Krrish Dholakia	bae7377128	docs(team_budgets.md): fix script /	2024-06-22 15:42:05 -07:00
Krrish Dholakia	a31a05d45d	feat(dynamic_rate_limiter.py): working e2e	2024-06-22 14:41:22 -07:00
Krrish Dholakia	532f24bfb7	refactor: instrument 'dynamic_rate_limiting' callback on proxy	2024-06-22 00:32:29 -07:00
Krrish Dholakia	068e8dff5b	feat(dynamic_rate_limiter.py): passing base case	2024-06-21 22:46:46 -07:00
Krrish Dholakia	a028600932	feat(dynamic_rate_limiter.py): update cache with active project	2024-06-21 20:25:40 -07:00
Krrish Dholakia	2545da777b	feat(dynamic_rate_limiter.py): initial commit for dynamic rate limiting Closes https://github.com/BerriAI/litellm/issues/4124	2024-06-21 18:41:31 -07:00
Krish Dholakia	e61cd2e1e2	Merge branch 'main' into litellm_redis_cache_usage	2024-06-13 22:07:21 -07:00
Krrish Dholakia	3b913443fe	feat(vertex_httpx.py): Moving to call vertex ai via httpx (instead of their sdk). Allows us to support all their api updates.	2024-06-12 16:47:00 -07:00
Krrish Dholakia	76c9b715f2	fix(parallel_request_limiter.py): use redis cache, if available for rate limiting across instances Fixes https://github.com/BerriAI/litellm/issues/4148	2024-06-12 10:35:48 -07:00
Krrish Dholakia	af1ae80277	fix(litellm_pre_call_utils.py): add support for key level caching params	2024-06-07 22:09:14 -07:00
Krrish Dholakia	6cca5612d2	refactor: replace 'traceback.print_exc()' with logging library allows error logs to be in json format for otel logging	2024-06-06 13:47:43 -07:00
Krrish Dholakia	4408b717f0	fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check Closes https://github.com/BerriAI/litellm/issues/3788	2024-05-27 08:48:23 -07:00
Ishaan Jaff	106910cecf	feat - add end user rate limiting	2024-05-22 14:01:57 -07:00
Krrish Dholakia	b41f30ca60	fix(proxy_server.py): fixes for making rejected responses work with streaming	2024-05-20 12:32:19 -07:00
Krrish Dholakia	f11f207ae6	feat(proxy_server.py): refactor returning rejected message, to work with error logging log the rejected request as a failed call to langfuse/slack alerting	2024-05-20 11:14:36 -07:00
Krrish Dholakia	594ca947c8	fix(parallel_request_limiter.py): fix max parallel request limiter on retries	2024-05-15 20:16:11 -07:00
Ishaan Jaff	91a6a0eef4	(Fix) - linting errors	2024-05-11 15:57:06 -07:00
Lunik	1639a51f24	🔊 fix: Correctly use verbose logging Signed-off-by: Lunik <lunik@tiwabbit.fr>	2024-05-04 11:04:23 +02:00
Lunik	8783fd4895	✨ feat: Use 8 severity levels for azure content safety Signed-off-by: Lunik <lunik@tiwabbit.fr>	2024-05-04 10:45:39 +02:00
Lunik	cb178723ca	📝 doc: Azure content safety Proxy usage Signed-off-by: Lunik <lunik@tiwabbit.fr>	2024-05-04 10:39:43 +02:00
Lunik	9ba9b3891f	⚡️ perf: Remove test violation on each stream chunk Signed-off-by: Lunik <lunik@tiwabbit.fr>	2024-05-03 20:51:40 +02:00
Lunik	e7405f105c	✅ ci: Add tests Signed-off-by: Lunik <lunik@tiwabbit.fr>	2024-05-03 20:50:37 +02:00
Lunik	6cec252b07	✨ feat: Add Azure Content-Safety Proxy hooks Signed-off-by: Lunik <lunik@tiwabbit.fr>	2024-05-02 23:21:08 +02:00
Krrish Dholakia	180cf9bd5c	feat(lowest_tpm_rpm_v2.py): move to using redis.incr and redis.mget for getting model usage from redis makes routing work across multiple instances	2024-04-10 14:56:23 -07:00
Krrish Dholakia	6467dd4e11	fix(tpm_rpm_limiter.py): fix cache init logic	2024-04-01 18:01:38 -07:00
Krrish Dholakia	383f12bbd3	test(test_max_tpm_rpm_limiter.py): unit tests for key + team based tpm rpm limits on proxy	2024-04-01 08:00:01 -07:00
Krrish Dholakia	f58fefd589	fix(tpm_rpm_limiter.py): enable redis caching for tpm/rpm checks on keys/user/teams allows tpm/rpm checks to work across instances https://github.com/BerriAI/litellm/issues/2730	2024-03-30 20:01:36 -07:00
Krrish Dholakia	5a117490ec	fix(proxy_server.py): fix tpm/rpm limiting for jwt auth fixes tpm/rpm limiting for jwt auth and implements unit tests for jwt auth	2024-03-28 21:19:34 -07:00
Krrish Dholakia	e10eb8f6fe	feat(llm_guard.py): enable key-specific llm guard check	2024-03-26 17:21:51 -07:00
Ishaan Jaff	5d121a9f3c	(fix) stop using f strings with logger	2024-03-25 10:47:18 -07:00
Krrish Dholakia	0521e8a1d9	fix(prompt_injection_detection.py): fix type check	2024-03-21 08:56:13 -07:00
Krrish Dholakia	8e8c4e214e	fix: fix linting issue	2024-03-21 08:19:09 -07:00
Krrish Dholakia	d91f9a9f50	feat(proxy_server.py): enable llm api based prompt injection checks run user calls through an llm api to check for prompt injection attacks. This happens in parallel to the actual llm call using `async_moderation_hook`	2024-03-20 22:43:42 -07:00
Krrish Dholakia	f24d3ffdb6	fix(proxy_server.py): fix import	2024-03-20 19:15:06 -07:00
Krrish Dholakia	3bb0e24cb7	fix(prompt_injection_detection.py): ensure combinations are actual phrases, not just 1-2 words reduces misflagging https://github.com/BerriAI/litellm/issues/2601	2024-03-20 19:09:38 -07:00
Krrish Dholakia	8a20ea795b	feat(batch_redis_get.py): batch redis GET requests for a given key + call type reduces number of redis requests. 85ms latency improvement over 3 minutes of load (19k requests).	2024-03-15 14:54:16 -07:00
Krrish Dholakia	226953e1d8	feat(batch_redis_get.py): batch redis GET requests for a given key + call type reduces the number of GET requests we're making in high-throughput scenarios	2024-03-15 14:40:11 -07:00
Krrish Dholakia	7876aa2d75	fix(parallel_request_limiter.py): handle metadata being none	2024-03-14 10:02:41 -07:00
Krrish Dholakia	ad55f4dbb5	feat(proxy_server.py): retry if virtual key is rate limited currently for chat completions	2024-03-05 19:00:03 -08:00
Krrish Dholakia	b3574f2b37	fix(parallel_request_limiter.py): handle none scenario	2024-02-26 20:09:06 -08:00

83 commits