Commit graph

10692 commits

Author  SHA1  Message  Date
Ishaan Jaff  d6f7fa7f4e  v0 prisma schema  2024-04-30 11:42:17 -07:00
Krrish Dholakia  1cd24d8906  bump: version 1.35.32 → 1.35.33  2024-04-30 07:20:50 -07:00
Krrish Dholakia  020b175ef4  fix(lowest_tpm_rpm_v2.py): skip if item_tpm is None  2024-04-29 21:34:25 -07:00
Ishaan Jaff  81df36b298  docs - slack alerting  2024-04-29 21:33:03 -07:00
Ishaan Jaff  b1e888edad  docs example logging to langfuse  2024-04-29 21:26:27 -07:00
Ishaan Jaff  0cad58f5c6  docs logging to langfuse on proxy  2024-04-29 21:26:15 -07:00
Ishaan Jaff  0c99ae9451  docs - fix kub.yaml config on docs  2024-04-29 21:20:29 -07:00
Krrish Dholakia  b46db8b891  feat(utils.py): json logs for raw request sent by litellm  2024-04-29 19:21:19 -07:00
    make it easier to view verbose logs in datadog
Krrish Dholakia  f0e48cdd53  fix(router.py): raise better exception when no deployments are available  2024-04-29 18:48:04 -07:00
    Fixes https://github.com/BerriAI/litellm/issues/3355
Krrish Dholakia  1e53c06064  test(test_router_caching.py): remove unstable test  2024-04-29 18:37:31 -07:00
    test would fail due to timing issues
Krrish Dholakia  e7b4882e97  fix(router.py): fix high-traffic bug for usage-based-routing-v2  2024-04-29 16:48:01 -07:00
Krish Dholakia  09bae3d8ad  Merge pull request #3351 from elisalimli/main  2024-04-29 16:45:48 -07:00
    Fix Cohere tool calling
Krish Dholakia  32534b5e91  Merge pull request #3358 from sumanth13131/usage-based-routing-RPM-fix  2024-04-29 16:45:25 -07:00
    usage based routing RPM count fix
Krrish Dholakia  bd79e8b516  docs(langfuse_integration.md): add 'existing_trace_id' to langfuse docs  2024-04-29 16:40:38 -07:00
Krrish Dholakia  853b70aba9  fix(langfuse.py): support 'existing_trace_id' param  2024-04-29 16:39:17 -07:00
    allow user to call out a trace as pre-existing, this prevents creating a default trace name, and potentially overwriting past traces
Krrish Dholakia  2cf069befb  fix(langfuse.py): don't set default trace_name if trace_id given  2024-04-29 16:39:17 -07:00
Ishaan Jaff  d58dd2cbeb  Merge pull request #3360 from BerriAI/litellm_random_pick_lowest_latency  2024-04-29 16:31:32 -07:00
    [Fix] Lowest Latency routing - random pick deployments when all latencies=0
Krrish Dholakia  77f155d158  docs(load_test.md): cleanup docs  2024-04-29 16:27:58 -07:00
Krrish Dholakia  af6a21f27c  docs(load_test.md): add multi-instance router load test to docs  2024-04-29 16:25:56 -07:00
Ishaan Jaff  4cb4a7f06d  fix - lowest latency routing  2024-04-29 16:02:57 -07:00
Krrish Dholakia  8f830bd948  docs(load_test.md): simplify doc  2024-04-29 16:00:02 -07:00
Krrish Dholakia  fcb83781ec  docs(load_test.md): formatting  2024-04-29 15:58:41 -07:00
Krrish Dholakia  5fe0f38558  docs(load_test.md): load test multiple instances of the proxy w/ tpm/rpm limits on deployments  2024-04-29 15:58:14 -07:00
Ishaan Jaff  3b0aa05378  fix lowest latency - routing  2024-04-29 15:51:52 -07:00
Ishaan Jaff  5247d7b6a5  test - lowest latency router  2024-04-29 15:51:01 -07:00
Krrish Dholakia  cef2d95bb4  docs(routing.md): add max parallel requests to router docs  2024-04-29 15:37:48 -07:00
Krrish Dholakia  a978f2d881  fix(lowest_tpm_rpm_v2.py): shuffle deployments with same tpm values  2024-04-29 15:23:47 -07:00
Krrish Dholakia  f10a066d36  fix(lowest_tpm_rpm_v2.py): add more detail to 'No deployments available' error message  2024-04-29 15:04:37 -07:00
Ishaan Jaff  de3e642999  Merge pull request #3359 from BerriAI/litellm_docs_trackin_cost  2024-04-29 13:19:25 -07:00
    docs - update track cost with custom callbacks
Ishaan Jaff  8d26030b99  docs - track cost custom callbacks  2024-04-29 13:15:08 -07:00
sumanth  89e655c79e  usage based routing RPM count fix  2024-04-30 00:29:38 +05:30
Krrish Dholakia  4b04b017df  bump: version 1.35.31 → 1.35.32  2024-04-29 09:16:44 -07:00
Krish Dholakia  ec2510029a  Merge pull request #3354 from BerriAI/litellm_replicate_cost_tracking  2024-04-29 09:13:41 -07:00
    fix(utils.py): replicate now also has token based pricing for some models
Krrish Dholakia  3725732c4d  fix(utils.py): default to time-based tracking for unmapped replicate models. fix time-based cost calc for replicate  2024-04-29 08:36:01 -07:00
Krrish Dholakia  a18844b230  fix(utils.py): use llama tokenizer for replicate models  2024-04-29 08:28:31 -07:00
Krrish Dholakia  dc5c175406  build(model_prices_and_context_window.json): add token-based replicate costs to model cost map  2024-04-29 08:20:44 -07:00
Krrish Dholakia  ab954243e8  fix(utils.py): fix watson streaming  2024-04-29 08:09:59 -07:00
Krrish Dholakia  2cfb97141d  fix(utils.py): replicate now also has token based pricing for some models  2024-04-29 08:06:15 -07:00
alisalim17  0aa8b94ff5  test: completion with Cohere command-r-plus model  2024-04-29 18:38:12 +04:00
Krrish Dholakia  0a6b6302f1  fix(router.py): fix typing error  2024-04-29 07:25:39 -07:00
Krrish Dholakia  7b617e666d  fix(proxy_server.py): return more detailed auth error message.  2024-04-29 07:24:19 -07:00
Krish Dholakia  ffc6af0b22  Merge pull request #3334 from CyanideByte/main  2024-04-29 07:16:05 -07:00
    protected_namespaces warning fixed for model_name & model_info
alisalim17  0db7fa3fd8  fix: cohere tool results  2024-04-29 14:20:24 +04:00
Krrish Dholakia  f74a43aa78  docs(vllm.md): update docs to tell people to check openai-compatible endpoint docs for vllm  2024-04-28 09:48:03 -07:00
Krrish Dholakia  1f6c342e94  test: fix test  2024-04-28 09:45:01 -07:00
Krish Dholakia  39244cd517  Merge pull request #3331 from BerriAI/litellm_common_auth_params  2024-04-28 09:27:07 -07:00
    feat(utils.py): unify common auth params across azure/vertex_ai/bedrock/watsonx
Krish Dholakia  1841b74f49  Merge branch 'main' into litellm_common_auth_params  2024-04-28 08:38:06 -07:00
Krrish Dholakia  b9c0b55e7c  test: fix test - set num_retries=0  2024-04-27 21:02:19 -07:00
CyanideByte  82be9a7e67  Merge branch 'BerriAI:main' into main  2024-04-27 20:51:33 -07:00
CyanideByte  03a43b99a5  Added _types.py cases from edwinjosegeorge PR#3340  2024-04-27 20:42:54 -07:00