Commit graph

4168 commits

Author SHA1 Message Date
ishaan-jaff
4099340ecb (fix) rename proxy startup test 2024-02-06 11:27:24 -08:00
Ishaan Jaff
7cb69c72c8 Merge branch 'main' into litellm_add_semantic_cache 2024-02-06 11:18:43 -08:00
ishaan-jaff
8175fb4deb (fix) mark semantic caching as beta test 2024-02-06 11:04:19 -08:00
ishaan-jaff
405a44727c (ci/cd) run in verbose mode 2024-02-06 10:57:20 -08:00
ishaan-jaff
1afdf5cf36 (fix) semantic caching 2024-02-06 10:55:15 -08:00
ishaan-jaff
c8a83bb745 (fix) test-semantic caching 2024-02-06 10:39:44 -08:00
ishaan-jaff
2732c47b70 (feat) redis-semantic cache on proxy 2024-02-06 10:35:21 -08:00
ishaan-jaff
a1fc1e49c7 (fix) use semantic cache on proxy 2024-02-06 10:27:33 -08:00
ishaan-jaff
05f379234d allow setting redis_semantic cache_embedding model 2024-02-06 10:22:02 -08:00
Krrish Dholakia
d1db67890c fix(ollama.py): support format for ollama 2024-02-06 10:11:52 -08:00
ishaan-jaff
751fb1af89 (feat) log semantic_sim to langfuse 2024-02-06 09:31:57 -08:00
Krrish Dholakia
3afa5230d6 fix(utils.py): return finish reason for last vertex ai chunk 2024-02-06 09:21:03 -08:00
ishaan-jaff
70a895329e (feat) working semantic cache on proxy 2024-02-06 08:55:25 -08:00
ishaan-jaff
a3b1e3bc84 (feat) redis-semantic cache 2024-02-06 08:54:36 -08:00
ishaan-jaff
6249a97098 (feat) working semantic-cache on litellm proxy 2024-02-06 08:52:57 -08:00
ishaan-jaff
a125ffe190 (test) async semantic cache 2024-02-06 08:14:54 -08:00
ishaan-jaff
76def20ffe (feat) RedisSemanticCache - async 2024-02-06 08:13:12 -08:00
Krrish Dholakia
9e091a0624 fix(ollama_chat.py): explicitly state if ollama call is streaming or not 2024-02-06 07:43:47 -08:00
Krrish Dholakia
c2a523b954 fix(utils.py): use print_verbose for statements, so debug can be seen when running sdk 2024-02-06 07:30:26 -08:00
Krrish Dholakia
2e3748e6eb fix(ollama_chat.py): fix ollama chat completion token counting 2024-02-06 07:30:26 -08:00
ishaan-jaff
47bed68c7f (fix) test_normal_router_tpm_limit 2024-02-06 06:46:49 -08:00
ishaan-jaff
9a8abdb1ae (ci/cd) print debug info for test_proxy_gunicorn_startup_config_dict 2024-02-05 22:53:31 -08:00
ishaan-jaff
4d625818d6 (fix) proxy startup test 2024-02-05 22:51:11 -08:00
ishaan-jaff
71814d8149 (feat) proxy - upperbound params /key/generate 2024-02-05 22:40:52 -08:00
ishaan-jaff
4d4554b0e4 (test) test_upperbound_key_params 2024-02-05 22:39:36 -08:00
ishaan-jaff
a712596d46 (feat) upperbound_key_generate_params 2024-02-05 22:38:47 -08:00
ishaan-jaff
d4fd287617 (docs) upperbound_key_generate_params 2024-02-05 22:37:05 -08:00
Krrish Dholakia
7a0bccf4d0 test(test_key_generate_dynamodb.py): fix test 2024-02-05 21:44:50 -08:00
Krrish Dholakia
a9a4f4cf0f test(test_key_generate_dynamodb.py): fix test 2024-02-05 21:43:17 -08:00
Krish Dholakia
3d29ec126b Merge pull request #1837 from BerriAI/litellm_langfuse_failure_cost_tracking (fix(langfuse.py): support logging failed llm api calls to langfuse) 2024-02-05 19:46:40 -08:00
Krrish Dholakia
f2a7e2ee98 feat(ui): enable admin to view all valid keys created on the proxy 2024-02-05 19:28:57 -08:00
ishaan-jaff
ccc94128d3 (fix) semantic cache 2024-02-05 18:25:22 -08:00
ishaan-jaff
81f8ac00b2 (test) semantic caching 2024-02-05 18:22:50 -08:00
ishaan-jaff
cf4bd1cf4e (test) semantic cache 2024-02-05 17:58:32 -08:00
ishaan-jaff
1b39454a08 (feat) working - sync semantic caching 2024-02-05 17:58:12 -08:00
Krrish Dholakia
e35a7c32cb fix(proxy/utils.py): if langfuse trace id passed in, just send that as part of alert 2024-02-05 16:34:33 -08:00
Krrish Dholakia
3b9ada07e0 fix(main.py): raise better error message for health check models without mode 2024-02-05 16:26:25 -08:00
ishaan-jaff
1f7c8e86a7 (fix) make sure route is str 2024-02-05 16:22:36 -08:00
Krrish Dholakia
a1bbb16ab2 fix(langfuse.py): support logging failed llm api calls to langfuse 2024-02-05 16:16:15 -08:00
ishaan-jaff
2b588a8786 (test) litellm-dashboard never allowed to /chat/completions 2024-02-05 16:11:33 -08:00
ishaan-jaff
8d7698f24d (fix) litellm-ui keys can never access /chat/completions 2024-02-05 16:10:49 -08:00
Krrish Dholakia
77fe71ee08 fix(utils.py): support together ai function calling 2024-02-05 15:30:44 -08:00
ishaan-jaff
7557a2535a (fix) model_prices 2024-02-05 15:04:39 -08:00
ishaan-jaff
70f36073dc (fix) pre commit hook to sync backup context_window mapping 2024-02-05 15:03:04 -08:00
ishaan-jaff
9cbc412c78 (fix) fix backup.json 2024-02-05 14:36:07 -08:00
ishaan-jaff
d4a799a3ca (feat )add semantic cache 2024-02-05 12:28:21 -08:00
David Manouchehri
26e68c3f67 (feat) Add sessionId for Langfuse. 2024-02-05 15:13:21 -05:00
Krrish Dholakia
1bdb332454 fix(utils.py): handle count response tokens false case token counting 2024-02-05 08:47:10 -08:00
Ishaan Jaff
14c9e239a1 Merge pull request #1750 from vanpelt/patch-2 (Re-raise exception in async ollama streaming) 2024-02-05 08:12:17 -08:00
Krish Dholakia
640572647a Merge pull request #1805 from BerriAI/litellm_cost_tracking_image_gen (feat(utils.py): support cost tracking for openai/azure image gen models) 2024-02-03 22:23:22 -08:00