litellm/litellm/types/caching.py
Ishaan Jaff c73ce95c01
(feat) - provider budget improvements - ensure provider budgets work with multiple proxy instances + improve latency to ~90ms (#6886)
* use 1 file for duration_in_seconds

* add to readme.md

* re use duration_in_seconds

* fix importing _extract_from_regex, get_last_day_of_month

* fix import

* update provider budget routing

* fix - remove dup test

* add support for using in multi instance environments

* test_in_memory_redis_sync_e2e

* test_in_memory_redis_sync_e2e

* fix test_in_memory_redis_sync_e2e

* fix code quality check

* fix test provider budgets

* working provider budget tests

* add fixture for provider budget routing

* fix router testing for provider budgets

* add comments on provider budget routing

* use RedisPipelineIncrementOperation

* add redis async_increment_pipeline

* use redis async_increment_pipeline

* use lower value for testing

* use redis async_increment_pipeline

* use consistent key name for increment op

* add handling for budget windows

* fix typing async_increment_pipeline

* fix set attr

* add clear doc strings

* unit testing for provider budgets

* test_redis_increment_pipeline
2024-11-24 16:36:19 -08:00


from enum import Enum
from typing import Literal, Optional, TypedDict


class LiteLLMCacheType(str, Enum):
    # Supported cache backends; values are the string identifiers.
    LOCAL = "local"
    REDIS = "redis"
    REDIS_SEMANTIC = "redis-semantic"
    S3 = "s3"
    DISK = "disk"
    QDRANT_SEMANTIC = "qdrant-semantic"


# Call types for which caching is supported (sync and async variants).
CachingSupportedCallTypes = Literal[
    "completion",
    "acompletion",
    "embedding",
    "aembedding",
    "atranscription",
    "transcription",
    "atext_completion",
    "text_completion",
    "arerank",
    "rerank",
]


class RedisPipelineIncrementOperation(TypedDict):
    """
    TypedDict for a single Redis pipeline increment operation
    """

    key: str
    increment_value: float
    ttl: Optional[int]
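
The following is a minimal sketch of how a list of `RedisPipelineIncrementOperation` entries might be built and applied. It is not litellm's implementation: `apply_increments` is a hypothetical in-memory stand-in for a Redis pipeline, the key names are invented for illustration, and treating `ttl` as a number of seconds is an assumption.

```python
from typing import Dict, List, Optional, TypedDict


class RedisPipelineIncrementOperation(TypedDict):
    # Redeclared here so the sketch is self-contained.
    key: str
    increment_value: float
    ttl: Optional[int]


def apply_increments(
    store: Dict[str, float],
    ttls: Dict[str, int],
    operations: List[RedisPipelineIncrementOperation],
) -> List[float]:
    """Hypothetical in-memory stand-in for a batched Redis increment.

    Applies each operation in order and returns the running value of
    the key after each increment.
    """
    results: List[float] = []
    for op in operations:
        key = op["key"]
        store[key] = store.get(key, 0.0) + op["increment_value"]
        # Assumption: ttl is in seconds and None means "no expiry".
        if op["ttl"] is not None:
            ttls[key] = op["ttl"]
        results.append(store[key])
    return results


store: Dict[str, float] = {}
ttls: Dict[str, int] = {}

# Key names below are made up for the example.
ops: List[RedisPipelineIncrementOperation] = [
    {"key": "provider_spend:openai:1d", "increment_value": 0.25, "ttl": 86400},
    {"key": "provider_spend:openai:1d", "increment_value": 0.5, "ttl": 86400},
    {"key": "provider_spend:azure:1d", "increment_value": 1.0, "ttl": None},
]

totals = apply_increments(store, ttls, ops)
print(totals)
```

Grouping the increments into one list mirrors the idea of sending them in a single pipeline round trip rather than one request per key; a real implementation would queue the equivalent commands on a Redis client pipeline instead of mutating dictionaries.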