Commit graph

54 commits

Author SHA1 Message Date
Krish Dholakia
94a05ca5d0 Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
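A minimal sketch (not LiteLLM's code) of two of the ruff patterns fixed in this PR, the "bare except" and the "unprotected import"; `premium_extra` is a hypothetical optional dependency:

```python
def safe_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):  # was a bare `except:` -- ruff rule E722
        return None

# "unprotected import": guard an optional dependency so that importing the
# package still works when the extra isn't installed
try:
    import premium_extra  # hypothetical optional dependency
except ImportError:
    premium_extra = None  # callers check for None before using the feature
```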
Krrish Dholakia
bdf33df20e fix(parallel_request_limiter.py): only update hidden params, don't set new ones (can lead to errors for responses where the attribute can't be set) 2024-09-28 21:08:15 -07:00
Krrish Dholakia
6cb0842144 fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing 2024-09-28 21:08:15 -07:00
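The two `parallel_request_limiter.py` fixes above share one defensive pattern; a hedged sketch, with `merge_hidden_params` as an illustrative helper name:

```python
def merge_hidden_params(response, extra: dict) -> None:
    hidden = getattr(response, "_hidden_params", None)
    if not isinstance(hidden, dict):  # make sure it's a dict before dereferencing
        return
    # only update the existing dict; assigning a new attribute can raise
    # on response objects that reject setattr
    hidden.update(extra)
```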
Krrish Dholakia
441d875536 fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
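A sketch of what "openai-compatible" likely means here, assuming the standard `x-ratelimit-*` header names OpenAI returns:

```python
def rate_limit_headers(remaining_rpm: int, remaining_tpm: int) -> dict:
    # rpm maps to remaining requests, tpm to remaining tokens
    return {
        "x-ratelimit-remaining-requests": str(remaining_rpm),
        "x-ratelimit-remaining-tokens": str(remaining_tpm),
    }
```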
Ishaan Jaff
963b548ece fix: use one async_batch_set_cache (#5956) 2024-09-28 09:59:38 -07:00
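A toy illustration of the change: collapse several sequential cache writes into a single pipelined call. `async_set_cache_pipeline` matches the method named in these commits; the cache class is a stand-in, not LiteLLM's implementation:

```python
import asyncio

class StubCache:
    def __init__(self):
        self._store = {}

    async def async_set_cache_pipeline(self, items):
        # one awaited round trip instead of one per key
        for key, value in items:
            self._store[key] = value

async def update_usage(cache, key_id, tpm, rpm):
    await cache.async_set_cache_pipeline(
        [(f"{key_id}:tpm", tpm), (f"{key_id}:rpm", rpm)]
    )

asyncio.run(update_usage(StubCache(), "key-123", tpm=900, rpm=42))
```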
Krish Dholakia
02565cd58d LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938)
* fix(langfuse.py): prevent double logging requester metadata

Fixes https://github.com/BerriAI/litellm/issues/5935

* build(model_prices_and_context_window.json): add mistral pixtral cost tracking

Closes https://github.com/BerriAI/litellm/issues/5837

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839

* feat(anthropic/chat.py): return 'retry-after' headers from anthropic

Fixes https://github.com/BerriAI/litellm/issues/4387

* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock

Closes https://github.com/BerriAI/litellm/issues/5747

* [Feature]#5940, add max_workers parameter for the batch_completion (#5947)

* handle streaming for azure ai studio error

* bump: version 1.48.2 → 1.48.3

* docs(data_security.md): add legal/compliance faq's

Make it easier for companies to use litellm

* docs: resolve imports

* [Feature]#5940, add max_workers parameter for the batch_completion method

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>

* fix(converse_transformation.py): fix default message value

* fix(utils.py): fix get_model_info to handle finetuned models

Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models

* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks

* fix: fix linting errors

* fix(anthropic/chat/handler.py): fix cache creation input tokens

* fix(exception_mapping_utils.py): fix missing imports

* fix(anthropic/chat/handler.py): fix usage block translation

* test: fix test

* test: fix tests

* style(types/utils.py): trigger new build

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
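For the "raise validation error if message has tool calls without passing `tools` param" item in this PR, a hedged sketch of the check (function name illustrative):

```python
def validate_tool_use(messages: list, tools) -> None:
    has_tool_calls = any(m.get("tool_calls") for m in messages)
    if has_tool_calls and not tools:
        raise ValueError(
            "messages contain tool_calls but no `tools` param was passed; "
            "anthropic/bedrock need the tool definitions re-sent"
        )
```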
Ishaan Jaff
6b48c29a16 [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
Ishaan Jaff
cad944d031 [Fix proxy perf] Use correct cache key when reading from redis cache (#5928)
* fix parallel request limiter use correct user id

* fix async def get_user_object

* use safe get_internal_user_object

* fix store internal users in redis correctly
2024-09-26 18:13:35 -07:00
Ishaan Jaff
4d253e473a [Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis

* redis track event_metadata in service logging

* show otel error on _get_parent_otel_span_from_kwargs

* track parent otel span on internal usage cache

* update_request_status

* fix internal usage cache

* fix linting

* fix test internal usage cache

* fix linting error

* show event metadata in redis set

* fix test_get_team_redis

* fix test_get_team_redis

* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
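A rough sketch of threading the parent OTEL span through cache reads so each redis GET lands under the request's trace; `_get_parent_otel_span_from_kwargs` matches the name in the commits, the rest is illustrative:

```python
def _get_parent_otel_span_from_kwargs(kwargs: dict):
    metadata = kwargs.get("metadata") or {}
    return metadata.get("litellm_parent_otel_span")

async def tracked_get(cache: dict, key: str, parent_otel_span=None):
    value = cache.get(key)
    # service logging would record this read as a child of parent_otel_span,
    # making the redis call visible on the request's trace
    return value
```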
Krish Dholakia
7107375ec3 LiteLLM Minor Fixes and Improvements (09/14/2024) (#5697)
* fix(health_check.py): hide sensitive keys from health check debug information

* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue

* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
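A sketch of the health-check scrubbing described in the first fix above; the set of sensitive key names is an assumption:

```python
SENSITIVE_KEYS = {"api_key", "aws_secret_access_key", "azure_ad_token"}  # assumed list

def scrub(details: dict) -> dict:
    return {k: ("*" * 8 if k in SENSITIVE_KEYS else v) for k, v in details.items()}

print(scrub({"model": "gpt-4", "api_key": "sk-abc123"}))
# -> {'model': 'gpt-4', 'api_key': '********'}
```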
Ishaan Jaff
bfb5136489 refactor vertex endpoints to pass through all routes 2024-08-21 17:08:42 -07:00
Ishaan Jaff
94e74b9ede only write model tpm/rpm tracking when user sets it 2024-08-18 09:58:09 -07:00
Ishaan Jaff
8578301116 fix async_pre_call_hook in parallel request limiter 2024-08-17 12:42:28 -07:00
Ishaan Jaff
db8f789318 Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}`: None in response headers
2024-08-17 12:41:16 -07:00
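The header names come straight from the PR title; a minimal sketch of building them, where a limit that was never configured surfaces as None rather than 0:

```python
def model_limit_headers(model, remaining_requests, remaining_tokens) -> dict:
    return {
        f"x-litellm-key-remaining-requests-{model}": str(remaining_requests),
        f"x-litellm-key-remaining-tokens-{model}": str(remaining_tokens),
    }

print(model_limit_headers("gpt-4", 1, None))
# {'x-litellm-key-remaining-requests-gpt-4': '1',
#  'x-litellm-key-remaining-tokens-gpt-4': 'None'}
```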
Ishaan Jaff
9f6630912d feat: return remaining tokens for model for api key 2024-08-17 12:35:10 -07:00
Ishaan Jaff
a62277a6aa feat - use common helper for getting model group 2024-08-17 10:46:04 -07:00
Ishaan Jaff
03196742d2 add litellm-key-remaining-tokens on prometheus 2024-08-17 10:02:20 -07:00
Ishaan Jaff
8ae626b31f feat add settings for rpm/tpm limits for a model 2024-08-17 09:16:01 -07:00
Ishaan Jaff
824ea32452 track rpm/tpm usage per key+model 2024-08-16 18:28:58 -07:00
Krrish Dholakia
2874b94fb1 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
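The whole refactor in one line: inside an `except` block, `logger.exception()` records the traceback that `logger.error()` drops, which is what Sentry needs to group and debug the event:

```python
import logging

logger = logging.getLogger(__name__)

try:
    1 / 0
except Exception:
    logger.exception("request failed")  # was: logger.error("request failed")
```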
Krrish Dholakia
e6bc7e938a fix(utils.py): handle scenario where model="azure/*" and custom_llm_provider="azure"
Fixes https://github.com/BerriAI/litellm/issues/4912
2024-08-02 17:48:53 -07:00
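A sketch of the wildcard scenario this fixes, using `fnmatch` as an illustrative stand-in for the router's matching logic:

```python
import fnmatch

def matches_wildcard(configured_model: str, requested_model: str) -> bool:
    # a config entry like model="azure/*" should match any azure deployment
    return fnmatch.fnmatch(requested_model, configured_model)

assert matches_wildcard("azure/*", "azure/my-gpt-4o-deployment")
```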
Ishaan Jaff
bda2ac1af5 fix raise better error when crossing tpm / rpm limits 2024-07-26 17:35:08 -07:00
Krrish Dholakia
af2055c2b7 feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests
Closes https://github.com/BerriAI/litellm/issues/2950
2024-07-17 16:38:47 -07:00
Krrish Dholakia
17635450cd feat(proxy_server.py): return 'retry-after' param for rate limited requests
Closes https://github.com/BerriAI/litellm/issues/4695
2024-07-13 17:15:20 -07:00
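One plausible way to derive the value, assuming per-minute (tpm/rpm) windows: tell the client to retry once the current window rolls over. The commit's actual computation may differ:

```python
import time

def retry_after_seconds(window_seconds: int = 60) -> int:
    # seconds until the current rate-limit window resets
    return window_seconds - int(time.time()) % window_seconds
```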
Krrish Dholakia
1d6643df22 feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krrish Dholakia
77328e4a28 fix(parallel_request_limiter.py): use redis cache, if available for rate limiting across instances
Fixes https://github.com/BerriAI/litellm/issues/4148
2024-06-12 10:35:48 -07:00
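The idea in sketch form: prefer redis (shared across proxy instances) for the counters, fall back to the local in-memory store when redis isn't configured. Stand-in class, not LiteLLM's implementation:

```python
class CounterCache:
    def __init__(self, redis_store=None):
        self.in_memory = {}
        self.redis = redis_store  # None => single-instance, in-memory only

    def increment(self, key, amount=1):
        # shared redis wins when available, so limits hold across instances
        store = self.redis if self.redis is not None else self.in_memory
        store[key] = store.get(key, 0) + amount
        return store[key]
```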
Krrish Dholakia
56fd0c60d1 fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check
Closes https://github.com/BerriAI/litellm/issues/3788
2024-05-27 08:48:23 -07:00
Ishaan Jaff
eb58440ebf feat - add end user rate limiting 2024-05-22 14:01:57 -07:00
Krrish Dholakia
3f339cb694 fix(parallel_request_limiter.py): fix max parallel request limiter on retries 2024-05-15 20:16:11 -07:00
Krrish Dholakia
737bb3e444 fix(proxy_server.py): fix tpm/rpm limiting for jwt auth
fixes tpm/rpm limiting for jwt auth and implements unit tests for jwt auth
2024-03-28 21:19:34 -07:00
Krrish Dholakia
d2f47ee45b fix(parallel_request_limiter.py): handle metadata being none 2024-03-14 10:02:41 -07:00
Krrish Dholakia
c963e2761b feat(proxy_server.py): retry if virtual key is rate limited
currently for chat completions
2024-03-05 19:00:03 -08:00
Krrish Dholakia
f72b84f6e0 fix(parallel_request_limiter.py): handle none scenario 2024-02-26 20:09:06 -08:00
Krrish Dholakia
7fff5119de fix(parallel_request_limiter.py): fix team rate limit enforcement 2024-02-26 18:06:13 -08:00
Krrish Dholakia
5213fd2e1e feat(parallel_request_limiter.py): enforce team based tpm / rpm limits 2024-02-26 16:20:41 -08:00
ishaan-jaff
5ec69a0ca5 (fix) failing parallel_request_limiter test 2024-02-22 19:16:22 -08:00
ishaan-jaff
b728ded300 (fix) don't double check curr date and time 2024-02-22 18:50:02 -08:00
ishaan-jaff
74d66d5ac5 (feat) tpm/rpm limit by User 2024-02-22 18:44:03 -08:00
Krrish Dholakia
07aa05bf17 fix(test_parallel_request_limiter.py): use mock responses for streaming 2024-02-08 21:45:38 -08:00
ishaan-jaff
c8b2f0fd5d (fix) parallel_request_limiter debug 2024-02-06 12:43:28 -08:00
Krrish Dholakia
dbf2b0b2c8 fix(utils.py): override default success callbacks with dynamic callbacks if set 2024-02-02 06:21:43 -08:00
Krrish Dholakia
c91ab81fde fix(test_parallel_request_limiter): increase time limit for waiting for success logging event to happen 2024-01-30 13:26:17 -08:00
Krrish Dholakia
e957f41ab7 fix(utils.py): add metadata to logging obj on setup, if exists 2024-01-19 17:29:47 -08:00
Krrish Dholakia
f73a4ae7c2 fix(parallel_request_limiter.py): handle tpm/rpm limits being null 2024-01-19 10:22:27 -08:00
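The null-handling in sketch form: a limit that was never set means "unlimited", not "zero":

```python
def over_limit(current: int, limit) -> bool:
    if limit is None:   # tpm/rpm not configured -> never rate limit
        return False
    return current >= limit
```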
Krrish Dholakia
34c3b33b37 test(test_parallel_request_limiter.py): unit testing for tpm/rpm rate limits 2024-01-18 15:28:28 -08:00
Krrish Dholakia
13b013b28d feat(parallel_request_limiter.py): add support for tpm/rpm limits 2024-01-18 13:52:15 -08:00
Krrish Dholakia
44553bcc3a fix(parallel_request_limiter.py): decrement count for failed llm calls
https://github.com/BerriAI/litellm/issues/1477
2024-01-18 12:42:14 -08:00
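A sketch of the failure-path bookkeeping: a call that errored must release its parallel-request slot, or failed calls permanently eat into the limit. Helper names are illustrative:

```python
def on_request_start(counters: dict, key_id: str) -> None:
    counters[key_id] = counters.get(key_id, 0) + 1

def on_request_failure(counters: dict, key_id: str) -> None:
    counters[key_id] = max(counters.get(key_id, 0) - 1, 0)  # never below zero
```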
Krrish Dholakia
79978c44ba refactor: add black formatting 2023-12-25 14:11:20 +05:30
Krrish Dholakia
018405b956 fix(proxy/utils.py): return different exceptions if key is invalid vs. expired
https://github.com/BerriAI/litellm/issues/1230
2023-12-25 10:29:44 +05:30
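A sketch of the distinction, with illustrative exception names (the commit's actual classes may differ):

```python
import datetime

class InvalidKeyError(Exception):
    pass

class ExpiredKeyError(Exception):
    pass

def check_key(key_row):
    if key_row is None:
        raise InvalidKeyError("key does not exist")
    expires = key_row.get("expires")  # assumed: tz-aware datetime or None
    if expires is not None and expires < datetime.datetime.now(datetime.timezone.utc):
        raise ExpiredKeyError("key has expired")
```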
Krrish Dholakia
72e8c84914 build(test_streaming.py): fix linting issues 2023-12-25 07:34:54 +05:30