ishaan-jaff
|
0c488a7cc7
|
(feat) working semantic-cache on litellm proxy
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
ca5f83d5fe
|
(test) async semantic cache
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
97fbfc07b4
|
(feat) RedisSemanticCache - async
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
05d64acdd3
|
(fix) semantic cache
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
5d6af6b7d6
|
(test) semantic caching
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
b8a1cc4e42
|
(test) semantic cache
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
bc22fda64e
|
(feat) working - sync semantic caching
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
fa1681fe7a
|
(feat) add semantic cache
|
2024-02-06 13:30:55 -08:00 |
|
ishaan-jaff
|
e6e340cd46
|
(feat) show langfuse logging tags better through proxy
|
2024-02-06 13:30:55 -08:00 |
|
Krrish Dholakia
|
cde96b8ba0
|
test(test_completion.py): fix test
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
667e006dbf
|
(ci/cd) run again
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
548d56ac42
|
(fix) mark semantic caching as beta test
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
ca26ecc4dc
|
(fix) semantic caching
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
b2c78108c8
|
(fix) test-semantic caching
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
a8d190f507
|
(feat) redis-semantic cache on proxy
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
63e91eb2c4
|
(fix) use semantic cache on proxy
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
000afc1391
|
allow setting redis_semantic cache_embedding model
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
74fb4e4cba
|
(feat) log semantic_sim to langfuse
|
2024-02-06 13:30:31 -08:00 |
|
ishaan-jaff
|
a7b3148c99
|
(feat) working semantic cache on proxy
|
2024-02-06 13:30:30 -08:00 |
|
ishaan-jaff
|
2b86c3a9b4
|
(feat) redis-semantic cache
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
ba0d92c712
|
(feat) working semantic-cache on litellm proxy
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
87b5b418d8
|
(test) async semantic cache
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
b2f9dde360
|
(feat) RedisSemanticCache - async
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
b63fe39ed2
|
(fix) semantic cache
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
4450be0a64
|
(test) semantic caching
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
80e1d901d8
|
(test) semantic cache
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
80865f93b8
|
(feat) working - sync semantic caching
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
168a2f7806
|
(feat) add semantic cache
|
2024-02-06 13:30:17 -08:00 |
|
ishaan-jaff
|
a73d57b32b
|
(feat) show langfuse logging tags better through proxy
|
2024-02-06 13:30:17 -08:00 |
|
Krrish Dholakia
|
fa5f4b9774
|
test(test_completion.py): fix test
|
2024-02-06 13:29:47 -08:00 |
|
ishaan-jaff
|
79c225a60f
|
(ci/cd) run again
|
2024-02-06 13:26:48 -08:00 |
|
Ishaan Jaff
|
8a8f538329
|
Merge pull request #1829 from BerriAI/litellm_add_semantic_cache
[Feat] Add Semantic Caching to litellm💰
|
2024-02-06 13:18:59 -08:00 |
|
Krrish Dholakia
|
420d2754d7
|
fix(utils.py): round max tokens to be int always
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
bc6d29f879
|
(ci/cd) run again
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
1bcd2eafd2
|
(ci/cd) run again
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
86c84d72e5
|
(ci/cd) fix test_config_no_auth
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
0f6a9242ec
|
(fix) test_normal_router_tpm_limit
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
f8491feebd
|
(fix) parallel_request_limiter debug
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
43cb836c4f
|
(ci/cd) run again
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
b1b5daf73d
|
(fix) proxy_startup test
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
ec5b812989
|
(fix) rename proxy startup test
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
249482b3f7
|
(ci/cd) run in verbose mode
|
2024-02-06 13:17:57 -08:00 |
|
Krrish Dholakia
|
4d76af89f3
|
fix(ollama.py): support format for ollama
|
2024-02-06 13:17:57 -08:00 |
|
Krrish Dholakia
|
f9b5e9ea62
|
fix(ollama_chat.py): explicitly state if ollama call is streaming or not
|
2024-02-06 13:17:57 -08:00 |
|
Krrish Dholakia
|
34fcb3c984
|
fix(utils.py): use print_verbose for statements, so debug can be seen when running sdk
|
2024-02-06 13:17:57 -08:00 |
|
Krrish Dholakia
|
3409ac7690
|
fix(ollama_chat.py): fix ollama chat completion token counting
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
f2070d025e
|
(fix) test_normal_router_tpm_limit
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
6ea17be098
|
(ci/cd) print debug info for test_proxy_gunicorn_startup_config_dict
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
a24041b624
|
(fix) proxy startup test
|
2024-02-06 13:17:57 -08:00 |
|
ishaan-jaff
|
1ef8b459ce
|
(feat) proxy - upperbound params /key/generate
|
2024-02-06 13:17:57 -08:00 |
|