Commit graph

111 commits

Author SHA1 Message Date
Krish Dholakia
d57be47b0f
Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Krrish Dholakia
6c7d1d5c96 fix(parallel_request_limiter.py): only update hidden params, don't set new (can lead to errors for responses where attribute can't be set) 2024-09-28 21:08:15 -07:00
Krrish Dholakia
3f8a5b3ef6 fix(parallel_request_limiter.py): make sure hidden params is dict before dereferencing 2024-09-28 21:08:15 -07:00
Krrish Dholakia
5222fc8e1b fix(parallel_request_limiter.py): return remaining tpm/rpm in openai-compatible way
Fixes https://github.com/BerriAI/litellm/issues/5957
2024-09-28 21:08:15 -07:00
Krrish Dholakia
efc06d4a03 fix(batch_redis_get.py): handle custom namespace
Fix https://github.com/BerriAI/litellm/issues/5917
2024-09-28 21:08:14 -07:00
Ishaan Jaff
088d906276
fix use one async async_batch_set_cache (#5956) 2024-09-28 09:59:38 -07:00
Krish Dholakia
0b30e212da
LiteLLM Minor Fixes & Improvements (09/27/2024) (#5938)
* fix(langfuse.py): prevent double logging requester metadata

Fixes https://github.com/BerriAI/litellm/issues/5935

* build(model_prices_and_context_window.json): add mistral pixtral cost tracking

Closes https://github.com/BerriAI/litellm/issues/5837

* handle streaming for azure ai studio error

* [Perf Proxy] parallel request limiter - use one cache update call (#5932)

* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf

* fix(groq/chat/transformation.py): Fixes https://github.com/BerriAI/litellm/issues/5839

* feat(anthropic/chat.py): return 'retry-after' headers from anthropic

Fixes https://github.com/BerriAI/litellm/issues/4387

* feat: raise validation error if message has tool calls without passing `tools` param for anthropic/bedrock

Closes https://github.com/BerriAI/litellm/issues/5747

* [Feature]#5940, add max_workers parameter for the batch_completion (#5947)

* handle streaming for azure ai studio error

* bump: version 1.48.2 → 1.48.3

* docs(data_security.md): add legal/compliance faq's

Make it easier for companies to use litellm

* docs: resolve imports

* [Feature]#5940, add max_workers parameter for the batch_completion method

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>

* fix(converse_transformation.py): fix default message value

* fix(utils.py): fix get_model_info to handle finetuned models

Fixes issue for standard logging payloads, where model_map_value was null for finetuned openai models

* fix(litellm_pre_call_utils.py): add debug statement for data sent after updating with team/key callbacks

* fix: fix linting errors

* fix(anthropic/chat/handler.py): fix cache creation input tokens

* fix(exception_mapping_utils.py): fix missing imports

* fix(anthropic/chat/handler.py): fix usage block translation

* test: fix test

* test: fix tests

* style(types/utils.py): trigger new build

* test: fix test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Jose Alberto Arango Sanchez <jose.arangos@udea.edu.co>
Co-authored-by: josearangos <josearangos@Joses-MacBook-Pro.local>
2024-09-27 22:52:57 -07:00
Ishaan Jaff
f4613a100d [Perf Proxy] parallel request limiter - use one cache update call (#5932)
* fix parallel request limiter - use one cache update call

* ci/cd run again

* run ci/cd again

* use docker username password

* fix config.yml

* fix config

* fix config

* fix config.yml

* ci/cd run again

* use correct typing for batch set cache

* fix async_set_cache_pipeline

* fix only check user id tpm / rpm limits when limits set

* fix test_openai_azure_embedding_with_oidc_and_cf
2024-09-27 17:24:46 -07:00
Ishaan Jaff
58171f35ef
[Fix proxy perf] Use correct cache key when reading from redis cache (#5928)
* fix parallel request limiter use correct user id

* async def get_user_object(
fix

* use safe get_internal_user_object

* fix store internal users in redis correctly
2024-09-26 18:13:35 -07:00
Ishaan Jaff
7cbcf538c6
[Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis

* redis track event_metadata in service logging

* show otel error on _get_parent_otel_span_from_kwargs

* track parent otel span on internal usage cache

* update_request_status

* fix internal usage cache

* fix linting

* fix test internal usage cache

* fix linting error

* show event metadata in redis set

* fix test_get_team_redis

* fix test_get_team_redis

* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
Krish Dholakia
234185ec13
LiteLLM Minor Fixes & Improvements (09/16/2024) (#5723) (#5731)
* LiteLLM Minor Fixes & Improvements (09/16/2024)  (#5723)

* coverage (#5713)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* Move (#5714)

Signed-off-by: dbczumar <corey.zumar@databricks.com>

* fix(litellm_logging.py): fix logging client re-init (#5710)

Fixes https://github.com/BerriAI/litellm/issues/5695

* fix(presidio.py): Fix logging_hook response and add support for additional presidio variables in guardrails config

Fixes https://github.com/BerriAI/litellm/issues/5682

* feat(o1_handler.py): fake streaming for openai o1 models

Fixes https://github.com/BerriAI/litellm/issues/5694

* docs: deprecated traceloop integration in favor of native otel (#5249)

* fix: fix linting errors

* fix: fix linting errors

* fix(main.py): fix o1 import

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view (#5730)

* feat(spend_management_endpoints.py): expose `/global/spend/refresh` endpoint for updating material view

Supports having `MonthlyGlobalSpend` view be a material view, and exposes an endpoint to refresh it

* fix(custom_logger.py): reset calltype

* fix: fix linting errors

* fix: fix linting error

* fix: fix import

* test(test_databricks.py): fix databricks tests

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Nir Gazit <nirga@users.noreply.github.com>
2024-09-17 08:05:52 -07:00
Ishaan Jaff
b6ae2204a8
[Feat-Proxy] Slack Alerting - allow using os.environ/ vars for alert to webhook url (#5726)
* allow using os.environ for slack urls

* use env vars for webhook urls

* fix types for get_secret

* fix linting

* fix linting

* fix linting

* linting fixes

* linting fix

* docs alerting slack

* fix get data
2024-09-16 18:03:37 -07:00
Krish Dholakia
dad1ad2077
LiteLLM Minor Fixes and Improvements (09/14/2024) (#5697)
* fix(health_check.py): hide sensitive keys from health check debug information k

* fix(route_llm_request.py): fix proxy model not found error message to indicate how to resolve issue

* fix(vertex_llm_base.py): fix exception message to not log credentials
2024-09-14 10:32:39 -07:00
Ishaan Jaff
fb5be57bb8 v0 add rerank on litellm proxy 2024-08-27 17:28:39 -07:00
Ishaan Jaff
0e1d3804ff refactor vertex endpoints to pass through all routes 2024-08-21 17:08:42 -07:00
Ishaan Jaff
4685b9909a feat - allow accessing data post success call 2024-08-19 11:35:33 -07:00
Ishaan Jaff
398295116f inly write model tpm/rpm tracking when user set it 2024-08-18 09:58:09 -07:00
Ishaan Jaff
fa96610bbc fix async_pre_call_hook in parallel request limiter 2024-08-17 12:42:28 -07:00
Ishaan Jaff
feb8c3c5b4
Merge pull request #5259 from BerriAI/litellm_return_remaining_tokens_in_header
[Feat] return `x-litellm-key-remaining-requests-{model}`: 1, `x-litellm-key-remaining-tokens-{model}: None` in response headers
2024-08-17 12:41:16 -07:00
Ishaan Jaff
ee0f772b5c feat return rmng tokens for model for api key 2024-08-17 12:35:10 -07:00
Ishaan Jaff
5985c7e933 feat - use commong helper for getting model group 2024-08-17 10:46:04 -07:00
Ishaan Jaff
412d30d362 add litellm-key-remaining-tokens on prometheus 2024-08-17 10:02:20 -07:00
Ishaan Jaff
785482f023 feat add settings for rpm/tpm limits for a model 2024-08-17 09:16:01 -07:00
Ishaan Jaff
1ee33478c9 track rpm/tpm usage per key+model 2024-08-16 18:28:58 -07:00
Krrish Dholakia
61f4b71ef7 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
Krrish Dholakia
5d96ff6694 fix(utils.py): handle scenario where model="azure/*" and custom_llm_provider="azure"
Fixes https://github.com/BerriAI/litellm/issues/4912
2024-08-02 17:48:53 -07:00
Ishaan Jaff
c4e4b4675c fix raise better error when crossing tpm / rpm limits 2024-07-26 17:35:08 -07:00
Krrish Dholakia
07d90f6739 feat(aporio_ai.py): support aporio ai prompt injection for chat completion requests
Closes https://github.com/BerriAI/litellm/issues/2950
2024-07-17 16:38:47 -07:00
Krrish Dholakia
fde434be66 feat(proxy_server.py): return 'retry-after' param for rate limited requests
Closes https://github.com/BerriAI/litellm/issues/4695
2024-07-13 17:15:20 -07:00
Krrish Dholakia
7e769f3b89 fix: fix linting errors 2024-07-13 14:39:42 -07:00
Krrish Dholakia
0cc273d77b feat(pass_through_endpoint.py): support enforcing key rpm limits on pass through endpoints
Closes https://github.com/BerriAI/litellm/issues/4698
2024-07-13 13:29:44 -07:00
Krrish Dholakia
9d918d2ac7 fix(presidio_pii_masking.py): support logging_only pii masking 2024-07-11 18:04:12 -07:00
Krrish Dholakia
1193ee8803 fix(presidio_pii_masking.py): fix presidio unset url check + add same check for langfuse 2024-07-06 17:50:55 -07:00
Krrish Dholakia
d57d3df1d6 fix(presidio_pii_masking.py): add support for setting 'http://' if unset by render env for presidio base url 2024-07-06 17:42:10 -07:00
Krrish Dholakia
196b94455e fix(dynamic_rate_limiter.py): add rpm allocation, priority + quota reservation to docs 2024-07-01 23:35:42 -07:00
Krrish Dholakia
6b529d4e0e fix(dynamic_rate_limiter.py): support setting priority + reserving tpm/rpm 2024-07-01 23:08:54 -07:00
Krrish Dholakia
0781014706 test(test_dynamic_rate_limit_handler.py): refactor tests for rpm suppprt 2024-07-01 20:16:10 -07:00
Krrish Dholakia
f23b17091d fix(dynamic_rate_limiter.py): support dynamic rate limiting on rpm 2024-07-01 17:45:10 -07:00
Krrish Dholakia
bae7377128 docs(team_budgets.md): fix script
/
2024-06-22 15:42:05 -07:00
Krrish Dholakia
a31a05d45d feat(dynamic_rate_limiter.py): working e2e 2024-06-22 14:41:22 -07:00
Krrish Dholakia
532f24bfb7 refactor: instrument 'dynamic_rate_limiting' callback on proxy 2024-06-22 00:32:29 -07:00
Krrish Dholakia
068e8dff5b feat(dynamic_rate_limiter.py): passing base case 2024-06-21 22:46:46 -07:00
Krrish Dholakia
a028600932 feat(dynamic_rate_limiter.py): update cache with active project 2024-06-21 20:25:40 -07:00
Krrish Dholakia
2545da777b feat(dynamic_rate_limiter.py): initial commit for dynamic rate limiting
Closes https://github.com/BerriAI/litellm/issues/4124
2024-06-21 18:41:31 -07:00
Krish Dholakia
e61cd2e1e2
Merge branch 'main' into litellm_redis_cache_usage 2024-06-13 22:07:21 -07:00
Krrish Dholakia
3b913443fe feat(vertex_httpx.py): Moving to call vertex ai via httpx (instead of their sdk). Allows us to support all their api updates. 2024-06-12 16:47:00 -07:00
Krrish Dholakia
76c9b715f2 fix(parallel_request_limiter.py): use redis cache, if available for rate limiting across instances
Fixes https://github.com/BerriAI/litellm/issues/4148
2024-06-12 10:35:48 -07:00
Krrish Dholakia
af1ae80277 fix(litellm_pre_call_utils.py): add support for key level caching params 2024-06-07 22:09:14 -07:00
Krrish Dholakia
6cca5612d2 refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
2024-06-06 13:47:43 -07:00
Krrish Dholakia
4408b717f0 fix(parallel_request_limiter.py): fix user+team tpm/rpm limit check
Closes https://github.com/BerriAI/litellm/issues/3788
2024-05-27 08:48:23 -07:00