Commit graph

54 commits

Author SHA1 Message Date
Krish Dholakia
94a05ca5d0 Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
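A minimal sketch (not LiteLLM's code) of two of the ruff patterns fixed in this PR, the "bare except" and the "unprotected import"; `premium_extra` is a hypothetical optional dependency:

```python
def safe_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):  # was a bare `except:` -- ruff rule E722
        return None

# "unprotected import": guard an optional dependency so that importing the
# package still works when the extra isn't installed
try:
    import premium_extra  # hypothetical optional dependency
except ImportError:
    premium_extra = None  # callers check for None before using the feature
```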
Krrish Dholakia
bdf33df20e fix(parallel_request_limiter.py): only update hidden params, don't set new ones (can lead to errors for responses where the attribute can't be set) 2024-09-28 21:08:15 -07:00
Krrish Dholakia
6cb0842144 fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing 2024-09-28 21:08:15 -07:00
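The two `parallel_request_limiter.py` fixes above share one defensive pattern; a hedged sketch, with `merge_hidden_params` as an illustrative helper name:

```python
def merge_hidden_params(response, extra: dict) -> None:
    hidden = getattr(response, "_hidden_params", None)
    if not isinstance(hidden, dict):  # make sure it's a dict before dereferencing
        return
    # only update the existing dict; assigning a new attribute can raise
    # on response objects that reject setattr
    hidden.update(extra)
```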
Krrish Dholakia
441d875536 fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
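A sketch of what "openai-compatible" likely means here, assuming the standard `x-ratelimit-*` header names OpenAI returns:

```python
def rate_limit_headers(remaining_rpm: int, remaining_tpm: int) -> dict:
    # rpm maps to remaining requests, tpm to remaining tokens
    return {
        "x-ratelimit-remaining-requests": str(remaining_rpm),
        "x-ratelimit-remaining-tokens": str(remaining_tpm),
    }
```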
Ishaan Jaff
963b548ece fix: use one async_batch_set_cache (#5956) 2024-09-28 09:59:38 -07:00
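A toy illustration of the change: collapse several sequential cache writes into a single pipelined call. `async_set_cache_pipeline` matches the method named in these commits; the cache class is a stand-in, not LiteLLM's implementation:

```python
import asyncio

class StubCache:
    def __init__(self):
        self._store = {}

    async def async_set_cache_pipeline(self, items):
        # one awaited round trip instead of one per key
        for key, value in items:
            self._store[key] = value

async def update_usage(cache, key_id, tpm, rpm):
    await cache.async_set_cache_pipeline(
        [(f"{key_id}:tpm", tpm), (f"{key_id}:rpm", rpm)]
    )

asyncio.run(update_usage(StubCache(), "key-123", tpm=900, rpm=42))
```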
Krish Dholakia
02565cd58d LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938)
* fix(langfuse.py): prevent double logging requester metadata

Fixes https://github.com/BerriAI/litellm/issues/5935

* build(model_prices_and_context_window.json): add mistral pixtral cost tracking

Closes https://github.com/BerriAI/litellm/issues/5837

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839

* feat(anthropic/chat.py): return 'retry-after' headers from anthropic

Fixes https://github.com/BerriAI/litellm/issues/4387

* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock

Closes https://github.com/BerriAI/litellm/issues/5747

* [Feature]#5940, add max_workers parameter for the batch_completion (#5947)

* handle streaming for azure ai studio error

* bump: version 1.48.2 → 1.48.3

* docs(data_security.md): add legal/compliance faq's

Make it easier for companies to use litellm

* docs: resolve imports

* [Feature]#5940, add max_workers parameter for the batch_completion method

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>

* fix(converse_transformation.py): fix default message value

* fix(utils.py): fix get_model_info to handle finetuned models

Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models

* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks

* fix: fix linting errors

* fix(anthropic/chat/handler.py): fix cache creation input tokens

* fix(exception_mapping_utils.py): fix missing imports

* fix(anthropic/chat/handler.py): fix usage block translation

* test: fix test

* test: fix tests

* style(types/utils.py): trigger new build

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
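For the "raise validation error if message has tool calls without passing `tools` param" item in this PR, a hedged sketch of the check (function name illustrative):

```python
def validate_tool_use(messages: list, tools) -> None:
    has_tool_calls = any(m.get("tool_calls") for m in messages)
    if has_tool_calls and not tools:
        raise ValueError(
            "messages contain tool_calls but no `tools` param was passed; "
            "anthropic/bedrock need the tool definitions re-sent"
        )
```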
Ishaan Jaff
6b48c29a16 [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
Ishaan Jaff
cad944d031 [Fix proxy perf] Use correct cache key when reading from redis cache (#5928)
* fix parallel request limiter use correct user id

* fix async def get_user_object

* use safe get_internal_user_object

* fix store internal users in redis correctly
2024-09-26 18:13:35 -07:00
Ishaan Jaff
4d253e473a [Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis

* redis track event_metadata in service logging

* show otel error on _get_parent_otel_span_from_kwargs

* track parent otel span on internal usage cache

* update_request_status

* fix internal usage cache

* fix linting

* fix test internal usage cache

* fix linting error

* show event metadata in redis set

* fix test_get_team_redis

* fix test_get_team_redis

* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
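A rough sketch of threading the parent OTEL span through cache reads so each redis GET lands under the request's trace; `_get_parent_otel_span_from_kwargs` matches the name in the commits, the rest is illustrative:

```python
def _get_parent_otel_span_from_kwargs(kwargs: dict):
    metadata = kwargs.get("metadata") or {}
    return metadata.get("litellm_parent_otel_span")

async def tracked_get(cache: dict, key: str, parent_otel_span=None):
    value = cache.get(key)
    # service logging would record this read as a child of parent_otel_span,
    # making the redis call visible on the request's trace
    return value
```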
Krish Dholakia
7107375ec3 LiteLLM Minor Fixes and Improvements (09/14/2024) (#5697)
* fix(health_check.py): hide sensitive keys from health check debug information

* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue

* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
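A sketch of the health-check scrubbing described in the first fix above; the set of sensitive key names is an assumption:

```python
SENSITIVE_KEYS = {"api_key", "aws_secret_access_key", "azure_ad_token"}  # assumed list

def scrub(details: dict) -> dict:
    return {k: ("*" * 8 if k in SENSITIVE_KEYS else v) for k, v in details.items()}

print(scrub({"model": "gpt-4", "api_key": "sk-abc123"}))
# -> {'model': 'gpt-4', 'api_key': '********'}
```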
Ishaan Jaff
bfb5136489 refactor vertex endpoints to pass through all routes 2024-08-21 17:08:42 -07:00
Ishaan Jaff
94e74b9ede only write model tpm/rpm tracking when user sets it 2024-08-18 09:58:09 -07:00
Ishaan Jaff
8578301116 fix async_pre_call_hook in parallel request limiter 2024-08-17 12:42:28 -07:00
Ishaan Jaff
db8f789318 Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}`: None in response headers
2024-08-17 12:41:16 -07:00
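The header names come straight from the PR title; a minimal sketch of building them, where a limit that was never configured surfaces as None rather than 0:

```python
def model_limit_headers(model, remaining_requests, remaining_tokens) -> dict:
    return {
        f"x-litellm-key-remaining-requests-{model}": str(remaining_requests),
        f"x-litellm-key-remaining-tokens-{model}": str(remaining_tokens),
    }

print(model_limit_headers("gpt-4", 1, None))
# {'x-litellm-key-remaining-requests-gpt-4': '1',
#  'x-litellm-key-remaining-tokens-gpt-4': 'None'}
```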
Ishaan Jaff
9f6630912d feat: return remaining tokens for model for api key 2024-08-17 12:35:10 -07:00
Ishaan Jaff
a62277a6aa feat - use common helper for getting model group 2024-08-17 10:46:04 -07:00
Ishaan Jaff
03196742d2 add litellm-key-remaining-tokens on prometheus 2024-08-17 10:02:20 -07:00
Ishaan Jaff
8ae626b31f feat add settings for rpm/tpm limits for a model 2024-08-17 09:16:01 -07:00
Ishaan Jaff
824ea32452 track rpm/tpm usage per key+model 2024-08-16 18:28:58 -07:00
Krrish Dholakia
2874b94fb1 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
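The whole refactor in one line: inside an `except` block, `logger.exception()` records the traceback that `logger.error()` drops, which is what Sentry needs to group and debug the event:

```python
import logging

logger = logging.getLogger(__name__)

try:
    1 / 0
except Exception:
    logger.exception("request failed")  # was: logger.error("request failed")
```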
Krrish Dholakia
e6bc7e938a fix(utils.py): handle scenario where model="azure/*" and custom_llm_provider="azure"
Fixes https://github.com/BerriAI/litellm/issues/4912
2024-08-02 17:48:53 -07:00
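A sketch of the wildcard scenario this fixes, using `fnmatch` as an illustrative stand-in for the router's matching logic:

```python
import fnmatch

def matches_wildcard(configured_model: str, requested_model: str) -> bool:
    # a config entry like model="azure/*" should match any azure deployment
    return fnmatch.fnmatch(requested_model, configured_model)

assert matches_wildcard("azure/*", "azure/my-gpt-4o-deployment")
```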
Ishaan Jaff
bda2ac1af5 fix raise better error when crossing tpm / rpm limits 2024-07-26 17:35:08 -07:00
Krrish Dholakia
af2055c2b7 feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests
Closes https://github.com/BerriAI/litellm/issues/2950
2024-07-17 16:38:47 -07:00
Krrish Dholakia
17635450cd feat(proxy_server.py): return 'retry-after' param for rate limited requests
Closes https://github.com/BerriAI/litellm/issues/4695
2024-07-13 17:15:20 -07:00
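One plausible way to derive the value, assuming per-minute (tpm/rpm) windows: tell the client to retry once the current window rolls over. The commit's actual computation may differ:

```python
import time

def retry_after_seconds(window_seconds: int = 60) -> int:
    # seconds until the current rate-limit window resets
    return window_seconds - int(time.time()) % window_seconds
```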
Krrish Dholakia
1d6643df22 feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krrish Dholakia
77328e4a28 fix(parallel_request_limiter.py): use redis cache, if available for rate limiting across instances
Fixes https://github.com/BerriAI/litellm/issues/4148
2024-06-12 10:35:48 -07:00
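The idea in sketch form: prefer redis (shared across proxy instances) for the counters, fall back to the local in-memory store when redis isn't configured. Stand-in class, not LiteLLM's implementation:

```python
class CounterCache:
    def __init__(self, redis_store=None):
        self.in_memory = {}
        self.redis = redis_store  # None => single-instance, in-memory only

    def increment(self, key, amount=1):
        # shared redis wins when available, so limits hold across instances
        store = self.redis if self.redis is not None else self.in_memory
        store[key] = store.get(key, 0) + amount
        return store[key]
```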
Krrish Dholakia
56fd0c60d1 fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check
Closes https://github.com/BerriAI/litellm/issues/3788
2024-05-27 08:48:23 -07:00
Ishaan Jaff
eb58440ebf feat - add end user rate limiting 2024-05-22 14:01:57 -07:00
Krrish Dholakia
3f339cb694 fix(parallel_request_limiter.py): fix max parallel request limiter on retries 2024-05-15 20:16:11 -07:00
Krrish Dholakia
737bb3e444 fix(proxy_server.py): fix tpm/rpm limiting for jwt auth
fixes tpm/rpm limiting for jwt auth and implements unit tests for jwt auth
2024-03-28 21:19:34 -07:00
Krrish Dholakia
d2f47ee45b fix(parallel_request_limiter.py): handle metadata being none 2024-03-14 10:02:41 -07:00
Krrish Dholakia
c963e2761b feat(proxy_server.py): retry if virtual key is rate limited
currently for chat completions
2024-03-05 19:00:03 -08:00
Krrish Dholakia
f72b84f6e0 fix(parallel_request_limiter.py): handle none scenario 2024-02-26 20:09:06 -08:00
Krrish Dholakia
7fff5119de fix(parallel_request_limiter.py): fix team rate limit enforcement 2024-02-26 18:06:13 -08:00
Krrish Dholakia
5213fd2e1e feat(parallel_request_limiter.py): enforce team based tpm / rpm limits 2024-02-26 16:20:41 -08:00
ishaan-jaff
5ec69a0ca5 (fix) failing parallel_request_limiter test 2024-02-22 19:16:22 -08:00
ishaan-jaff
b728ded300 (fix) don't double check curr date and time 2024-02-22 18:50:02 -08:00
ishaan-jaff
74d66d5ac5 (feat) tpm/rpm limit by User 2024-02-22 18:44:03 -08:00
Krrish Dholakia
07aa05bf17 fix(test_parallel_request_limiter.py): use mock responses for streaming 2024-02-08 21:45:38 -08:00
ishaan-jaff
c8b2f0fd5d (fix) parallel_request_limiter debug 2024-02-06 12:43:28 -08:00
Krrish Dholakia
dbf2b0b2c8 fix(utils.py): override default success callbacks with dynamic callbacks if set 2024-02-02 06:21:43 -08:00
Krrish Dholakia
c91ab81fde fix(test_parallel_request_limiter): increase time limit for waiting for success logging event to happen 2024-01-30 13:26:17 -08:00
Krrish Dholakia
e957f41ab7 fix(utils.py): add metadata to logging obj on setup, if exists 2024-01-19 17:29:47 -08:00
Krrish Dholakia
f73a4ae7c2 fix(parallel_request_limiter.py): handle tpm/rpm limits being null 2024-01-19 10:22:27 -08:00
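The null-handling in sketch form: a limit that was never set means "unlimited", not "zero":

```python
def over_limit(current: int, limit) -> bool:
    if limit is None:   # tpm/rpm not configured -> never rate limit
        return False
    return current >= limit
```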
Krrish Dholakia
34c3b33b37 test(test_parallel_request_limiter.py): unit testing for tpm/rpm rate limits 2024-01-18 15:28:28 -08:00
Krrish Dholakia
13b013b28d feat(parallel_request_limiter.py): add support for tpm/rpm limits 2024-01-18 13:52:15 -08:00
Krrish Dholakia
44553bcc3a fix(parallel_request_limiter.py): decrement count for failed llm calls
https://github.com/BerriAI/litellm/issues/1477
2024-01-18 12:42:14 -08:00
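A sketch of the failure-path bookkeeping: a call that errored must release its parallel-request slot, or failed calls permanently eat into the limit. Helper names are illustrative:

```python
def on_request_start(counters: dict, key_id: str) -> None:
    counters[key_id] = counters.get(key_id, 0) + 1

def on_request_failure(counters: dict, key_id: str) -> None:
    counters[key_id] = max(counters.get(key_id, 0) - 1, 0)  # never below zero
```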
Krrish Dholakia
79978c44ba refactor: add black formatting 2023-12-25 14:11:20 +05:30
Krrish Dholakia
018405b956 fix(proxy/utils.py): return different exceptions if key is invalid vs. expired
https://github.com/BerriAI/litellm/issues/1230
2023-12-25 10:29:44 +05:30
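A sketch of the distinction, with illustrative exception names (the commit's actual classes may differ):

```python
import datetime

class InvalidKeyError(Exception):
    pass

class ExpiredKeyError(Exception):
    pass

def check_key(key_row):
    if key_row is None:
        raise InvalidKeyError("key does not exist")
    expires = key_row.get("expires")  # assumed: tz-aware datetime or None
    if expires is not None and expires < datetime.datetime.now(datetime.timezone.utc):
        raise ExpiredKeyError("key has expired")
```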
Krrish Dholakia
72e8c84914 build(test_streaming.py): fix linting issues 2023-12-25 07:34:54 +05:30