Ishaan Jaff
cad944d031
[Fix proxy perf] Use correct cache key when reading from redis cache ( #5928 )
* fix parallel request limiter use correct user id
* fix async def get_user_object
* use safe get_internal_user_object
* fix store internal users in redis correctly
2024-09-26 18:13:35 -07:00
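The cache-key fix above is easiest to see as a keying-consistency problem: the read path and write path must derive the same Redis key for the same user and time window. A minimal sketch, where the `::request_count` suffix and minute-granularity window are assumptions modeled on the limiter, not litellm's exact code:

```python
from datetime import datetime

def make_rate_limit_cache_key(user_id: str, now: datetime) -> str:
    # Minute-granularity window: the read path and the write path must
    # build this key identically, otherwise Redis lookups silently miss
    # and per-user limits are never enforced across instances.
    precise_minute = f"{now.year}-{now.month}-{now.day}-{now.hour}-{now.minute}"
    return f"{user_id}::{precise_minute}::request_count"
```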
Ishaan Jaff
4d253e473a
[Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL ( #5881 )
* fix use previous internal usage caching logic
* fix test_dual_cache_uses_redis
* redis track event_metadata in service logging
* show otel error on _get_parent_otel_span_from_kwargs
* track parent otel span on internal usage cache
* update_request_status
* fix internal usage cache
* fix linting
* fix test internal usage cache
* fix linting error
* show event metadata in redis set
* fix test_get_team_redis
* fix test_get_team_redis
* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
Krish Dholakia
82b542df8f
LiteLLM Minor Fixes & Improvements (09/16/2024) ( #5723 ) ( #5731 )
* LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723 )
* coverage (#5713 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* Move (#5714 )
Signed-off-by: dbczumar <corey.zumar@databricks.com>
* fix(litellm_logging.py): fix logging client re-init (#5710 )
Fixes https://github.com/BerriAI/litellm/issues/5695
* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config
Fixes https://github.com/BerriAI/litellm/issues/5682
* feat(o1_handler.py): fake streaming for openai o1 models
Fixes https://github.com/BerriAI/litellm/issues/5694
* docs: deprecated traceloop integration in favor of native otel (#5249 )
* fix: fix linting errors
* fix: fix linting errors
* fix(main.py): fix o1 import
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating materialized view (#5730 )
* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating materialized view
Supports having `MonthlyGlobalSpend` view be a materialized view, and exposes an endpoint to refresh it
* fix(custom_logger.py): reset calltype
* fix: fix linting errors
* fix: fix linting error
* fix: fix import
* test(test_databricks.py): fix databricks tests
---------
Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
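The fake-streaming change for o1 in the entry above comes down to slicing an already-finished completion into stream-shaped pieces, since those models did not support streaming at the time. The function name and chunking strategy here are illustrative, not litellm's actual handler code:

```python
def fake_stream(full_text: str, chunk_size: int = 5):
    # The model call completed before the first chunk is sent; a client
    # consuming this generator still sees an incremental stream.
    for i in range(0, len(full_text), chunk_size):
        yield full_text[i:i + chunk_size]
```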
Ishaan Jaff
0d027b22fd
[Feat-Proxy] Slack Alerting - allow using os.environ/ vars for alert to webhook url ( #5726 )
* allow using os.environ for slack urls
* use env vars for webhook urls
* fix types for get_secret
* fix linting
* fix linting
* fix linting
* linting fixes
* linting fix
* docs alerting slack
* fix get data
2024-09-16 18:03:37 -07:00
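The `os.environ/` convention in the entry above lets config values reference environment variables instead of inlining secrets. A simplified stand-in for litellm's `get_secret` helper (the real one also supports secret managers); the demo variable name is hypothetical:

```python
import os

def resolve_secret(value: str) -> str:
    # "os.environ/NAME" in proxy config means "read NAME from the
    # environment"; anything else is returned unchanged.
    prefix = "os.environ/"
    if value.startswith(prefix):
        return os.environ.get(value[len(prefix):], "")
    return value

# demo value (hypothetical variable name)
os.environ["DEMO_SLACK_WEBHOOK"] = "https://hooks.slack.com/services/T0/B0/X"
```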
Krish Dholakia
7107375ec3
LiteLLM Minor Fixes and Improvements (09/14/2024) ( #5697 )
* fix(health_check.py): hide sensitive keys from health check debug information
* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue
* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
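Hiding sensitive keys from health check output, as in the entry above, amounts to redacting credential-like fields before serializing debug info. The field list below is illustrative, not litellm's exact set:

```python
def redact_sensitive(details: dict) -> dict:
    # Mask anything that looks like a credential before it reaches
    # health check / debug output; empty values pass through untouched.
    sensitive = {"api_key", "aws_secret_access_key", "credentials"}
    return {
        k: ("*" * 8 if k in sensitive and v else v)
        for k, v in details.items()
    }
```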
Ishaan Jaff
359a003ac8
v0 add rerank on litellm proxy
2024-08-27 17:28:39 -07:00
Ishaan Jaff
bfb5136489
refactor vertex endpoints to pass through all routes
2024-08-21 17:08:42 -07:00
Ishaan Jaff
b4bca8db82
feat - allow accessing data post success call
2024-08-19 11:35:33 -07:00
Ishaan Jaff
94e74b9ede
only write model tpm/rpm tracking when user sets it
2024-08-18 09:58:09 -07:00
Ishaan Jaff
8578301116
fix async_pre_call_hook in parallel request limiter
2024-08-17 12:42:28 -07:00
Ishaan Jaff
db8f789318
Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}`: None in response headers
2024-08-17 12:41:16 -07:00
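The remaining-limit headers from the PR above can be sketched as a small helper: an unset limit (None) is passed through rather than computed, matching the `None` shown in the PR description. Function name and signature are assumptions:

```python
def remaining_limit_headers(model: str, tpm_limit, current_tpm: int,
                            rpm_limit, current_rpm: int) -> dict:
    # Header names follow the PR description; a limit of None (unset)
    # yields None instead of a computed remainder.
    return {
        f"x-litellm-key-remaining-requests-{model}":
            None if rpm_limit is None else rpm_limit - current_rpm,
        f"x-litellm-key-remaining-tokens-{model}":
            None if tpm_limit is None else tpm_limit - current_tpm,
    }
```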
Ishaan Jaff
9f6630912d
feat return remaining tokens for model for api key
2024-08-17 12:35:10 -07:00
Ishaan Jaff
a62277a6aa
feat - use common helper for getting model group
2024-08-17 10:46:04 -07:00
Ishaan Jaff
03196742d2
add litellm-key-remaining-tokens on prometheus
2024-08-17 10:02:20 -07:00
Ishaan Jaff
8ae626b31f
feat add settings for rpm/tpm limits for a model
2024-08-17 09:16:01 -07:00
Ishaan Jaff
824ea32452
track rpm/tpm usage per key+model
2024-08-16 18:28:58 -07:00
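Tracking rpm/tpm per key+model, as in the commit above, hinges on the compound cache key. An in-memory sketch of the keying scheme only; the real proxy scopes counters to a time window and stores them in its cache layer:

```python
from collections import defaultdict

class UsageTracker:
    # Hypothetical stand-in: one tpm and one rpm counter per
    # (api_key, model) pair, keyed the way a cache entry might be.
    def __init__(self):
        self.counters = defaultdict(int)

    def record(self, api_key: str, model: str, tokens: int) -> None:
        self.counters[f"{api_key}::{model}::tpm"] += tokens
        self.counters[f"{api_key}::{model}::rpm"] += 1
```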
Krrish Dholakia
2874b94fb1
refactor: replace .error() with .exception() logging for better debugging on sentry
2024-08-16 09:22:47 -07:00
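The `.error()` to `.exception()` refactor above matters because `Logger.exception()` attaches the active traceback, which `.error()` drops unless `exc_info=True` is passed, so Sentry and log aggregators get the stack. A self-contained demonstration:

```python
import io
import logging

logger = logging.getLogger("demo")
logger.propagate = False
handler = logging.StreamHandler(io.StringIO())
logger.addHandler(handler)
logger.setLevel(logging.ERROR)

def risky():
    raise ValueError("boom")

try:
    risky()
except ValueError:
    # .exception() logs at ERROR level *and* appends the traceback
    # of the exception currently being handled.
    logger.exception("call failed")

log_output = handler.stream.getvalue()
```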
Krrish Dholakia
e6bc7e938a
fix(utils.py): handle scenario where model="azure/*" and custom_llm_provider="azure"
Fixes https://github.com/BerriAI/litellm/issues/4912
2024-08-02 17:48:53 -07:00
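The `model="azure/*"` scenario above is a wildcard-matching problem: a deployment pattern in the router config should match any concrete model name under that provider. A sketch using shell-style wildcards via `fnmatch`; litellm's real matching logic may differ in details:

```python
import fnmatch

def matches_wildcard_model(deployment_pattern: str, requested: str) -> bool:
    # fnmatch's "*" matches any characters (including "/"), so
    # "azure/*" covers every azure deployment name.
    return fnmatch.fnmatch(requested, deployment_pattern)
```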
Ishaan Jaff
bda2ac1af5
fix: raise better error when crossing tpm / rpm limits
2024-07-26 17:35:08 -07:00
Krrish Dholakia
af2055c2b7
feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests
Closes https://github.com/BerriAI/litellm/issues/2950
2024-07-17 16:38:47 -07:00
Krrish Dholakia
17635450cd
feat(proxy_server.py): return 'retry-after' param for rate limited requests
Closes https://github.com/BerriAI/litellm/issues/4695
2024-07-13 17:15:20 -07:00
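The `retry-after` value in the entry above is just the time until the current rate-limit window resets, clamped to non-negative. A minimal sketch (names are assumptions):

```python
def retry_after_seconds(window_reset_epoch: float, now: float) -> int:
    # Value for the `retry-after` header on a 429 response: seconds
    # until the current rate-limit window resets, never negative.
    return max(0, int(window_reset_epoch - now))
```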
Krrish Dholakia
4ca677638f
fix: fix linting errors
2024-07-13 14:39:42 -07:00
Krrish Dholakia
1d6643df22
feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krrish Dholakia
1a57e49e46
fix(presidio_pii_masking.py): support logging_only pii masking
2024-07-11 18:04:12 -07:00
Krrish Dholakia
bcd7358daf
fix(presidio_pii_masking.py): fix presidio unset url check + add same check for langfuse
2024-07-06 17:50:55 -07:00
Krrish Dholakia
e424fea721
fix(presidio_pii_masking.py): add support for setting 'http://' if unset by render env for presidio base url
2024-07-06 17:42:10 -07:00
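The fix above handles environments (e.g. Render) that inject a bare host:port without a scheme, which HTTP clients reject. A sketch of the normalization:

```python
def normalize_base_url(url: str) -> str:
    # Default to http:// when no scheme is present, so an injected
    # "host:port" value still forms a valid base URL.
    if not url.startswith(("http://", "https://")):
        return f"http://{url}"
    return url
```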
Krrish Dholakia
c1a1529582
fix(dynamic_rate_limiter.py): add rpm allocation, priority + quota reservation to docs
2024-07-01 23:35:42 -07:00
Krrish Dholakia
0bc08063e1
fix(dynamic_rate_limiter.py): support setting priority + reserving tpm/rpm
2024-07-01 23:08:54 -07:00
Krrish Dholakia
f74490c69b
test(test_dynamic_rate_limit_handler.py): refactor tests for rpm support
2024-07-01 20:16:10 -07:00
Krrish Dholakia
d528e263c2
fix(dynamic_rate_limiter.py): support dynamic rate limiting on rpm
2024-07-01 17:45:10 -07:00
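The dynamic rate limiting commits in this stretch share one core idea: divide a model's tpm/rpm budget across currently-active projects, optionally holding back a reserved share for priority tiers. A hedged sketch of that policy only; function name, signature, and the fractional-reservation scheme are assumptions:

```python
def available_quota(total_limit, active_projects, reserved=None):
    # `reserved` maps priority tiers to a fraction of the budget held
    # back for them; the remainder is split evenly across active projects.
    reserved_total = sum((reserved or {}).values())
    shareable = int(total_limit * (1 - reserved_total))
    return shareable // max(active_projects, 1)
```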
Krrish Dholakia
1e4f8744e6
docs(team_budgets.md): fix script
2024-06-22 15:42:05 -07:00
Krrish Dholakia
8843b0dc77
feat(dynamic_rate_limiter.py): working e2e
2024-06-22 14:41:22 -07:00
Krrish Dholakia
8f95381276
refactor: instrument 'dynamic_rate_limiting' callback on proxy
2024-06-22 00:32:29 -07:00
Krrish Dholakia
6a7982fa40
feat(dynamic_rate_limiter.py): passing base case
2024-06-21 22:46:46 -07:00
Krrish Dholakia
0430807178
feat(dynamic_rate_limiter.py): update cache with active project
2024-06-21 20:25:40 -07:00
Krrish Dholakia
89dba82be9
feat(dynamic_rate_limiter.py): initial commit for dynamic rate limiting
Closes https://github.com/BerriAI/litellm/issues/4124
2024-06-21 18:41:31 -07:00
Krish Dholakia
c373f104cc
Merge branch 'main' into litellm_redis_cache_usage
2024-06-13 22:07:21 -07:00
Krrish Dholakia
29169b3039
feat(vertex_httpx.py): Moving to call vertex ai via httpx (instead of their sdk). Allows us to support all their api updates.
2024-06-12 16:47:00 -07:00
Krrish Dholakia
77328e4a28
fix(parallel_request_limiter.py): use redis cache, if available for rate limiting across instances
Fixes https://github.com/BerriAI/litellm/issues/4148
2024-06-12 10:35:48 -07:00
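Using Redis "if available" for the limiter, as in the fix above, is the dual-cache pattern: check a fast local cache first, fall back to a shared store so limits hold across proxy instances. A minimal sketch where `remote` is any dict-like stand-in, not a real Redis client:

```python
class DualCache:
    def __init__(self, remote=None):
        self.local = {}
        self.remote = remote  # e.g. a Redis client when configured

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if self.remote is not None:
            value = self.remote.get(key)
            if value is not None:
                self.local[key] = value  # warm the local cache
            return value
        return None

    def set(self, key, value):
        self.local[key] = value
        if self.remote is not None:
            self.remote[key] = value  # visible to other instances
```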
Krrish Dholakia
22b51c5af4
fix(litellm_pre_call_utils.py): add support for key level caching params
2024-06-07 22:09:14 -07:00
Krrish Dholakia
e391e30285
refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
56fd0c60d1
fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check
Closes https://github.com/BerriAI/litellm/issues/3788
2024-05-27 08:48:23 -07:00
Ishaan Jaff
eb58440ebf
feat - add end user rate limiting
2024-05-22 14:01:57 -07:00
Krrish Dholakia
d4d4550bb6
fix(proxy_server.py): fixes for making rejected responses work with streaming
2024-05-20 12:32:19 -07:00
Krrish Dholakia
8fb8d068fb
feat(proxy_server.py): refactor returning rejected message, to work with error logging
log the rejected request as a failed call to langfuse/slack alerting
2024-05-20 11:14:36 -07:00
Krrish Dholakia
3f339cb694
fix(parallel_request_limiter.py): fix max parallel request limiter on retries
2024-05-15 20:16:11 -07:00
Ishaan Jaff
9cc30e32b3
(Fix) - linting errors
2024-05-11 15:57:06 -07:00
Lunik
5f43a7b511
🔊 fix: Correctly use verbose logging
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-05-04 11:04:23 +02:00
Lunik
38d4cbc511
✨ feat: Use 8 severity levels for azure content safety
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-05-04 10:45:39 +02:00
Lunik
d69a1eeb4f
📝 doc: Azure content safety Proxy usage
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-05-04 10:39:43 +02:00