Commit graph

196 commits

Author SHA1 Message Date
Ishaan Jaff
4d253e473a [Feat] Improve OTEL Tracking - Require all Redis Cache reads to be logged on OTEL (#5881)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis

* redis track event_metadata in service logging

* show otel error on _get_parent_otel_span_from_kwargs

* track parent otel span on internal usage cache

* update_request_status

* fix internal usage cache

* fix linting

* fix test internal usage cache

* fix linting error

* show event metadata in redis set

* fix test_get_team_redis

* fix test_get_team_redis

* test_proxy_logging_setup
2024-09-25 10:57:08 -07:00
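
A hedged sketch of the pattern behind this change: open an OTEL span for each Redis cache read and parent it to the request's span when one is available, so cache hits and misses show up on the request trace. The tracer name, attribute keys, and `parent_span` argument are illustrative assumptions, not LiteLLM's internal API.

```python
# Illustrative sketch (not LiteLLM's actual internals): emit an OTEL span
# for every Redis cache read, parented to the request span when available.
import redis
from opentelemetry import trace
from opentelemetry.trace import set_span_in_context

tracer = trace.get_tracer("cache.redis")  # tracer name is an assumption
r = redis.Redis(host="localhost", port=6379)

def traced_cache_get(key: str, parent_span=None):
    # Parent the cache span under the request span if one was passed in.
    ctx = set_span_in_context(parent_span) if parent_span is not None else None
    with tracer.start_as_current_span("redis.cache_get", context=ctx) as span:
        value = r.get(key)
        span.set_attribute("cache.key", key)
        span.set_attribute("cache.hit", value is not None)
        return value
```
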
Ishaan Jaff
f581dceb4e [Perf Fix] Don't always read from Redis by Default (#5877)
* fix use previous internal usage caching logic

* fix test_dual_cache_uses_redis
2024-09-24 21:34:18 -07:00
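
The perf fix above concerns a dual (in-memory plus Redis) cache that should not hit Redis on every read. A minimal sketch of that idea, assuming hypothetical `memory_cache` and `redis_cache` objects that expose `get`/`set`; the flag name `always_read_redis` mirrors the commit wording, but the class is not LiteLLM's implementation.

```python
class DualCache:
    """Check the cheap in-memory layer first; only fall through to Redis on a
    miss, or when always_read_redis is explicitly enabled."""

    def __init__(self, memory_cache, redis_cache, always_read_redis: bool = False):
        self.memory_cache = memory_cache
        self.redis_cache = redis_cache
        self.always_read_redis = always_read_redis

    def get(self, key):
        value = self.memory_cache.get(key)
        if value is not None and not self.always_read_redis:
            return value  # hot path: no network round trip
        redis_value = self.redis_cache.get(key)
        if redis_value is not None:
            self.memory_cache.set(key, redis_value)  # backfill the local layer
            return redis_value
        return value
```
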
Krish Dholakia
f3fa2160a0 LiteLLM Minor Fixes & Improvements (09/21/2024) (#5819)
* fix(router.py): fix error message

* Litellm disable keys (#5814)

* build(schema.prisma): allow blocking/unblocking keys

Fixes https://github.com/BerriAI/litellm/issues/5328

* fix(key_management_endpoints.py): fix pop

* feat(auth_checks.py): allow admin to enable/disable virtual keys

Closes https://github.com/BerriAI/litellm/issues/5328

* docs(vertex.md): add auth section for vertex ai

Addresses - https://github.com/BerriAI/litellm/issues/5768#issuecomment-2365284223

* build(model_prices_and_context_window.json): show which models support prompt_caching

Closes https://github.com/BerriAI/litellm/issues/5776

* fix(router.py): allow setting default priority for requests

* fix(router.py): add 'retry-after' header for concurrent request limit errors

Fixes https://github.com/BerriAI/litellm/issues/5783

* fix(router.py): correctly raise and use retry-after header from azure+openai

Fixes https://github.com/BerriAI/litellm/issues/5783

* fix(user_api_key_auth.py): fix valid token being none

* fix(auth_checks.py): fix model dump for cache management object

* fix(user_api_key_auth.py): pass prisma_client to obj

* test(test_otel.py): update test for new key check

* test: fix test
2024-09-21 18:51:53 -07:00
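
One of the fixes above is raising and honoring the `retry-after` header on concurrent-request and rate-limit errors. A generic client-side sketch of that pattern, not the router's code; the URL and default delay are placeholders.

```python
import time
import httpx

def post_with_retry(url: str, payload: dict, max_retries: int = 3):
    # Generic pattern: on 429, wait for the server-advertised retry-after
    # value (in seconds) before retrying, falling back to a small default.
    for attempt in range(max_retries):
        resp = httpx.post(url, json=payload)
        if resp.status_code != 429:
            return resp
        header = resp.headers.get("retry-after", "1")
        try:
            delay = float(header)
        except ValueError:  # retry-after may also be an HTTP date; keep it simple here
            delay = 1.0
        time.sleep(delay)
    return resp
```
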
Krish Dholakia
dec53961f7 LiteLLM Minor Fixes and Improvements (11/09/2024) (#5634)
* fix(caching.py): set ttl for async_increment cache

fixes issue where ttl for redis client was not being set on increment_cache

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(caching.py): fix increment cache w/ ttl for sync increment cache on redis

Fixes https://github.com/BerriAI/litellm/issues/5609

* fix(router.py): support adding retry policy + allowed fails policy via config.yaml

* fix(router.py): don't cooldown single deployments

No point, as there's no other deployment to loadbalance with.

* fix(user_api_key_auth.py): support setting allowed email domains on jwt tokens

Closes https://github.com/BerriAI/litellm/issues/5605

* docs(token_auth.md): add user upsert + allowed email domain to jwt auth docs

* fix(litellm_pre_call_utils.py): fix dynamic key logging when team id is set

Fixes issue where key logging would not be set if team metadata was not none

* fix(secret_managers/main.py): load environment variables correctly

Fixes issue where os.environ/ was not being loaded correctly

* test(test_router.py): fix test

* feat(spend_tracking_utils.py): support logging additional usage params - e.g. prompt caching values for deepseek

* test: fix tests

* test: fix test

* test: fix test

* test: fix test

* test: fix test
2024-09-11 22:36:06 -07:00
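
The first two fixes above ensure a TTL is actually applied when incrementing Redis counters. A rough sketch of that pattern with redis-py's asyncio client; the key and TTL values are illustrative.

```python
import redis.asyncio as redis

r = redis.Redis(host="localhost", port=6379)

async def increment_with_ttl(key: str, amount: float, ttl: int) -> float:
    # A bare INCRBYFLOAT never sets an expiry, so the key would otherwise live forever.
    new_value = await r.incrbyfloat(key, amount)
    if await r.ttl(key) == -1:  # -1 means the key exists but has no expiry set
        await r.expire(key, ttl)
    return float(new_value)
```
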
Ishaan Jaff
0d6081e370 pass llm provider when creating async httpx clients 2024-09-10 11:51:42 -07:00
Ishaan Jaff
93c1db4a79 rename get_async_httpx_client 2024-09-10 10:38:01 -07:00
Ishaan Jaff
7370a994f5 use correct type hints for audio transcriptions 2024-09-05 09:12:27 -07:00
Krish Dholakia
6fdee99632 LiteLLM Minor fixes + improvements (08/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

User might already have handling for this. But alerting systems in prod will raise this as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
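
The IAM/OIDC items above (temporary credentials reused across regions, plus an inline session policy to narrow an overly permissive role) roughly correspond to an STS call like the following. The role ARN, token path, and policy statement are placeholders, not values from the PR.

```python
import json
import boto3

sts = boto3.client("sts")  # the temporary credentials returned are valid in every region

# Inline session policy: scope the temporary credentials down further than the role allows.
session_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "bedrock:InvokeModel", "Resource": "*"}],
}

creds = sts.assume_role_with_web_identity(
    RoleArn="arn:aws:iam::123456789012:role/example-role",  # placeholder
    RoleSessionName="litellm-session",
    WebIdentityToken=open("/path/to/oidc/token").read(),    # placeholder
    Policy=json.dumps(session_policy),
)["Credentials"]
```
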
Ishaan Jaff
67510d2c59 dual cache use always read redis as True by default 2024-09-04 08:01:55 -07:00
Ishaan Jaff
b6009233ac fix always read redis 2024-09-02 21:08:32 -07:00
Ishaan Jaff
6b642ef0f0 fix allow qdrant api key to be optional 2024-08-30 11:13:23 -07:00
Krrish Dholakia
b277086cf7 feat(vertex_ai_context_caching.py): check gemini cache, if key already exists 2024-08-26 22:19:01 -07:00
Ishaan Jaff
cad77c5969 fix should_use_cache 2024-08-24 09:37:41 -07:00
Ishaan Jaff
e37fe1f9e0 feat - allow setting cache mode 2024-08-24 09:03:59 -07:00
Krrish Dholakia
33c9c16388 feat(caching.py): redis cluster support
Closes https://github.com/BerriAI/litellm/issues/4358
2024-08-21 15:01:52 -07:00
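
For the Redis Cluster support above, the underlying client setup with redis-py looks roughly like this; hostnames and the key are placeholders, and how LiteLLM surfaces this in its own config is not shown.

```python
from redis.cluster import RedisCluster, ClusterNode

# Point the client at a few seed nodes; the rest of the topology is discovered from them.
startup_nodes = [
    ClusterNode("redis-node-1", 6379),
    ClusterNode("redis-node-2", 6379),
]
rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)

rc.set("litellm:cache:example", "value", ex=60)  # ex= gives the entry a 60s TTL
print(rc.get("litellm:cache:example"))
```
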
Ishaan Jaff
b196f41d64 fix qdrant litellm on proxy 2024-08-21 12:52:29 -07:00
Ishaan Jaff
8c83fb3f34 fixes for using qdrant with litellm proxy 2024-08-21 12:36:41 -07:00
Ishaan Jaff
0f3274b074 fix qdrant url 2024-08-21 12:09:09 -07:00
Ishaan Jaff
a34aeafdb5 Merge pull request #5018 from haadirakhangi/main
Qdrant Semantic Caching
2024-08-21 08:50:43 -07:00
Haadi Rakhangi
9df92923d8 implemented RestAPI and added support for cloud and local Qdrant clusters 2024-08-19 20:46:30 +05:30
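
A sketch of the semantic-cache lookup idea behind the Qdrant work above, written against qdrant-client rather than the raw REST API used in the PR. The collection name, similarity threshold, and the source of `prompt_embedding` are assumptions.

```python
from qdrant_client import QdrantClient

# Local cluster:  QdrantClient(host="localhost", port=6333)
# Cloud cluster:  pass the cluster URL plus an API key instead, as below.
client = QdrantClient(url="https://example.cloud.qdrant.io", api_key="...")

def semantic_cache_lookup(prompt_embedding, threshold: float = 0.9):
    # Return the response cached alongside the most similar prior prompt,
    # but only if the match clears the similarity threshold.
    hits = client.search(
        collection_name="litellm_semantic_cache",  # illustrative collection name
        query_vector=prompt_embedding,
        limit=1,
        score_threshold=threshold,
    )
    return hits[0].payload.get("response") if hits else None
```
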
Krrish Dholakia
2874b94fb1 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
prd-tuong-nguyen
70f2e84bc4 feat: hash prompt when caching 2024-08-08 16:19:14 +07:00
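
"hash prompt when caching" is the usual trick of keying the cache on a digest of the prompt rather than the raw text, which keeps keys short and uniform. A minimal sketch; the key prefix is just an example.

```python
import hashlib
import json

def prompt_cache_key(model: str, messages: list) -> str:
    # Serialize deterministically, then hash, so identical prompts map to the
    # same key without embedding arbitrarily long prompt text in the key itself.
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"litellm:cache:{digest}"
```
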
Ishaan Jaff
5b7d1b0ae4 caching use file_checksum 2024-08-06 13:03:14 -07:00
Krrish Dholakia
d526a12080 fix(init.py): rename feature_flag 2024-08-05 11:23:20 -07:00
Krrish Dholakia
8500f6d087 feat(caching.py): enable caching on provider-specific optional params
Closes https://github.com/BerriAI/litellm/issues/5049
2024-08-05 11:18:59 -07:00
Ishaan Jaff
d122508385 use file name when getting cache key 2024-08-02 14:52:08 -07:00
Haadi Rakhangi
a047df3825 qdrant semantic caching added 2024-08-02 21:07:19 +05:30
Krrish Dholakia
a75b70fbd6 fix(caching.py): support /completion caching by default
updates supported call types in redis cache to cover text_completion caching
2024-07-29 08:19:30 -07:00
Ishaan Jaff
aade38760d use common helpers for writing to otel 2024-07-27 11:40:39 -07:00
Ishaan Jaff
40f9e67be4 move _get_parent_otel_span_from_kwargs to otel.py 2024-07-27 11:12:13 -07:00
Ishaan Jaff
7c489856e3 add doc string to explain what delete cache does 2024-07-13 12:25:31 -07:00
Ishaan Jaff
9d657c42d8 de-ref unused cache items 2024-07-12 16:38:36 -07:00
Krrish Dholakia
a79cb33960 fix(caching.py): fix async redis health check 2024-07-06 09:14:29 -07:00
Ishaan Jaff
a1968eaf3f remove debug print statement 2024-06-27 20:58:29 -07:00
Ishaan Jaff
5977b5be20 ci/cd add debugging for cache eviction 2024-06-25 08:14:09 -07:00
Ishaan Jaff
f800425744 fix default ttl for InMemoryCache 2024-06-24 21:21:38 -07:00
Ishaan Jaff
3ebf1ec7eb feat use custom eviction policy 2024-06-24 20:28:03 -07:00
Ishaan Jaff
4e8f2a57e0 fix install on python 3.8 2024-06-24 17:27:14 -07:00
Ishaan Jaff
5b19aac705 cleanup InMemoryCache 2024-06-24 17:24:59 -07:00
Ishaan Jaff
4f03556af6 use lru cache 2024-06-24 17:15:53 -07:00
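
The run of InMemoryCache commits above (default TTL, custom eviction policy, switching to an LRU) amounts to a bounded in-process cache. A stand-alone sketch of that shape, not LiteLLM's actual class:

```python
import time
from collections import OrderedDict

class InMemoryLRUCache:
    """Bounded in-process cache: least-recently-used eviction plus a default TTL."""

    def __init__(self, max_size: int = 1024, default_ttl: int = 600):
        self.max_size = max_size
        self.default_ttl = default_ttl
        self._store = OrderedDict()  # key -> (expires_at, value)

    def set(self, key, value, ttl=None):
        self._store[key] = (time.monotonic() + (ttl or self.default_ttl), value)
        self._store.move_to_end(key)
        if len(self._store) > self.max_size:
            self._store.popitem(last=False)  # evict the least recently used entry

    def get(self, key):
        item = self._store.get(key)
        if item is None:
            return None
        expires_at, value = item
        if time.monotonic() > expires_at:
            del self._store[key]  # expired: drop it rather than serving stale data
            return None
        self._store.move_to_end(key)  # mark as recently used
        return value
```
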
Ishaan Jaff
81ef2c38dc fix InMemoryCache 2024-06-24 17:08:30 -07:00
Ishaan Jaff
21fd91fe94 fix use caching lib 2024-06-24 17:03:23 -07:00
Ishaan Jaff
0c4c6bfa5e fix in mem cache tests 2024-06-22 19:52:18 -07:00
Ishaan Jaff
8e3a073323 Merge branch 'main' into litellm_fix_in_mem_usage 2024-06-22 19:23:37 -07:00
Ishaan Jaff
5b2d4da43f fix caching clear in memory cache mem util 2024-06-22 19:21:37 -07:00
Ishaan Jaff
c4ae06576b fix - clean up in memory cache 2024-06-22 18:46:30 -07:00
Krrish Dholakia
0430807178 feat(dynamic_rate_limiter.py): update cache with active project 2024-06-21 20:25:40 -07:00
David Manouchehri
47e3880638 fix(caching.py): Stop throwing constant spam errors on every single S3 cache miss. Fixes #4146. 2024-06-13 20:58:18 +00:00
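
The S3 fix above treats a missing object as an ordinary cache miss instead of logging an error for every lookup. A generic boto3 sketch of that behavior; the bucket and key are placeholders.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def s3_cache_get(bucket: str, key: str):
    try:
        return s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    except ClientError as e:
        if e.response["Error"]["Code"] in ("NoSuchKey", "404"):
            return None  # a miss is expected; don't log it as an exception
        raise  # anything else is a real error
```
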
Ishaan Jaff
786e6b4ae3 feat - final working redis cache otel 2024-06-07 16:36:04 -07:00
Ishaan Jaff
72a6d49b21 feat - working exception logs for Redis errors 2024-06-07 16:30:29 -07:00