Ishaan Jaff
|
4d1b4beb3d
|
(refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208)
* use folder for caching
* fix importing caching
* fix clickhouse pyright
* fix linting
* fix correctly pass kwargs and args
* fix test case for embedding
* fix linting
* fix embedding caching logic
* fix refactor handle utils.py
* fix test_embedding_caching_azure_individual_items_reordered
|
2024-10-14 16:34:01 +05:30 |
|
Krrish Dholakia
|
efc06d4a03
|
fix(batch_redis_get.py): handle custom namespace
Fix https://github.com/BerriAI/litellm/issues/5917
|
2024-09-28 21:08:14 -07:00 |
|
Krrish Dholakia
|
6cca5612d2
|
refactor: replace 'traceback.print_exc()' with logging library
allows error logs to be in json format for otel logging
|
2024-06-06 13:47:43 -07:00 |
|
Krrish Dholakia
|
180cf9bd5c
|
feat(lowest_tpm_rpm_v2.py): move to using redis.incr and redis.mget for getting model usage from redis
makes routing work across multiple instances
|
2024-04-10 14:56:23 -07:00 |
|
Krrish Dholakia
|
8a20ea795b
|
feat(batch_redis_get.py): batch redis GET requests for a given key + call type
reduces number of redis requests. 85ms latency improvement over 3 minutes of load (19k requests).
|
2024-03-15 14:54:16 -07:00 |
|
Krrish Dholakia
|
226953e1d8
|
feat(batch_redis_get.py): batch redis GET requests for a given key + call type
reduces the number of GET requests we're making in high-throughput scenarios
|
2024-03-15 14:40:11 -07:00 |
|