Commit graph

2156 commits

Author SHA1 Message Date
Krrish Dholakia
fdc4fdb91a fix(proxy/utils.py): fix slack alerting to only raise alerts for llm api exceptions
don't spam for bad user requests. Closes https://github.com/BerriAI/litellm/issues/3395
2024-05-02 17:18:21 -07:00
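
A minimal sketch of the filtering idea behind this fix, using hypothetical exception names rather than litellm's actual classes: alert on provider-side API failures, stay silent on malformed user requests.

```python
# Hypothetical sketch -- not litellm's implementation.
class LLMAPIError(Exception): ...        # provider-side failure -> alert
class UserRequestError(Exception): ...   # bad user input -> no alert

async def maybe_send_alert(exc: Exception, send_slack_alert) -> None:
    # Only provider/API exceptions reach Slack; user errors are dropped
    # so bad requests can't spam the alerting channel.
    if isinstance(exc, LLMAPIError):
        await send_slack_alert(f"LLM API exception: {exc}")
```
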
Marc Abramowitz
988c37fda3 Disambiguate invalid model name errors
because that error can be thrown in several different places, so
knowing the function it's being thrown from can be very useful for debugging.
2024-05-02 15:02:54 -07:00
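
A toy illustration of the disambiguation (function names here are hypothetical): embedding the raising function in the message makes otherwise-identical "invalid model" errors traceable.

```python
def _invalid_model_error(model: str, thrown_from: str) -> ValueError:
    # The origin in the message tells you which code path raised it.
    return ValueError(f"Invalid model name '{model}' (raised in {thrown_from})")

def route_request(model: str, known_models: set[str]) -> None:
    if model not in known_models:
        raise _invalid_model_error(model, thrown_from="route_request")
```
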
Krrish Dholakia
acda064be6 fix(proxy/utils.py): fix retry logic for generic data request 2024-05-02 14:50:50 -07:00
Lunik
6cec252b07
feat: Add Azure Content-Safety Proxy hooks
Signed-off-by: Lunik <lunik@tiwabbit.fr>
2024-05-02 23:21:08 +02:00
Krish Dholakia
762a1fbd50
Merge pull request #3375 from msabramo/GH-3372
Fix route `/openai/deployments/{model}/chat/completions` not working properly
2024-05-02 13:00:25 -07:00
Krrish Dholakia
0251543e7a refactor(main.py): trigger new build 2024-05-01 21:59:33 -07:00
Ishaan Jaff
761aa7e5c8 ui - new build 2024-05-01 21:43:00 -07:00
Krish Dholakia
fffbb73465
Merge branch 'main' into litellm_openmeter_integration 2024-05-01 21:19:29 -07:00
Krrish Dholakia
cdd3e1eef3 build(ui): enable adding openmeter via proxy ui 2024-05-01 21:16:23 -07:00
Ishaan Jaff
00969aa682 ui - new build 2024-05-01 19:52:57 -07:00
Krrish Dholakia
2a9651b3ca feat(openmeter.py): add support for user billing
open-meter supports user-based billing. Closes https://github.com/BerriAI/litellm/issues/1268
2024-05-01 17:23:48 -07:00
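
A hedged sketch of what user-based billing looks like against OpenMeter's CloudEvents ingest API: the `subject` field identifies the billed user. The endpoint, headers, and event type below follow OpenMeter's public docs as best understood and should be treated as assumptions.

```python
import requests  # third-party HTTP client, assumed installed

def emit_usage_event(openmeter_url: str, api_key: str,
                     event_id: str, user_id: str, tokens: int) -> None:
    event = {
        "specversion": "1.0",
        "type": "tokens",            # meter event type (assumed name)
        "id": event_id,              # must be unique per event
        "source": "litellm-proxy",
        "subject": user_id,          # per-user billing hinges on this field
        "data": {"total_tokens": tokens},
    }
    requests.post(
        f"{openmeter_url}/api/v1/events",
        json=event,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/cloudevents+json",
        },
    )
```
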
Ishaan Jaff
26eda88b26 feat - show slow count and total count 2024-05-01 17:18:14 -07:00
Ishaan Jaff
f48f4a767c feat - return slow responses on admin UI 2024-05-01 17:16:33 -07:00
Ishaan Jaff
e9dd4bbe57 fix - dont show cache hits on model latency tracker 2024-05-01 16:51:15 -07:00
Ishaan Jaff
2b467a847a fix latency tracking tool tip 2024-05-01 16:47:30 -07:00
Ishaan Jaff
adf3e90f45 ui - new build 2024-05-01 13:32:32 -07:00
Ishaan Jaff
b3a788142b
Merge pull request #3380 from BerriAI/ui_polish_viewing_model_latencies
[UI] Polish viewing Model Latencies
2024-05-01 09:44:53 -07:00
Ishaan Jaff
94b98f5c4e clean up model latency metrics 2024-05-01 08:27:01 -07:00
Krrish Dholakia
d0f9f8c0ed fix(proxy/utils.py): emit number of spend transactions for keys being written to db in a batch 2024-05-01 08:25:04 -07:00
Ishaan Jaff
fc5a845838 fix - prisma schema 2024-04-30 23:09:53 -07:00
Ishaan Jaff
1e94d53a9b (ui - new build) 2024-04-30 22:54:51 -07:00
Ishaan Jaff
b9238a00af ui - show tokens / sec 2024-04-30 22:44:28 -07:00
Ishaan Jaff
0c464f7f61 fix - viewing model metrics 2024-04-30 18:26:14 -07:00
Ishaan Jaff
f2849d0641 fix - track litellm_model_name in LiteLLM_ErrorLogs 2024-04-30 17:31:40 -07:00
Ishaan Jaff
8a1a043801 backend - show model latency per token 2024-04-30 17:23:36 -07:00
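
The per-token latency and tokens/sec numbers these UI commits surface boil down to simple arithmetic over a response's timing; a generic sketch (not litellm's actual code):

```python
def latency_metrics(start: float, end: float, completion_tokens: int):
    """start/end are UNIX timestamps in seconds."""
    elapsed = end - start
    if elapsed <= 0 or completion_tokens <= 0:
        # Guard against cache hits / empty responses, which the
        # commits above exclude from the latency tracker.
        return None
    return {
        "latency_per_token": elapsed / completion_tokens,
        "tokens_per_second": completion_tokens / elapsed,
    }
```
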
Ishaan Jaff
a2a8fef8f4 fix passing starttime and endtime to model/exceptions 2024-04-30 16:53:53 -07:00
Ishaan Jaff
26a5d85869 fix - backend return exceptions 2024-04-30 15:41:16 -07:00
Marc Abramowitz
dd166680d1 Move chat_completions before completions
so that the `chat_completions` route is defined before the `completions` route.
This is necessary because the `chat_completions` route is more
specific than the `completions` route, and the order of route definitions
matters in FastAPI.

Without this, doing a request to
`/openai/deployments/{model_in_url}/chat/completions` might trigger
`completions` being called (with `model` set to `{model_in_url}/chat`) instead of
`chat_completions` getting called, which is the correct function.

Fixes: GH-3372
2024-04-30 15:07:10 -07:00
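
A minimal, hypothetical reproduction of the ordering principle this commit relies on. FastAPI (via Starlette) tries routes in registration order, so the more specific route must be declared first; the `:path` converter below is an assumption that mirrors the slash-swallowing behavior described above.

```python
from fastapi import FastAPI

app = FastAPI()

# Declared first: explicitly owns .../chat/completions.
@app.post("/openai/deployments/{model}/chat/completions")
async def chat_completions(model: str):
    return {"handler": "chat_completions", "model": model}

# Declared second: if this came first, a POST to
# /openai/deployments/gpt-4/chat/completions would match here
# with model == "gpt-4/chat".
@app.post("/openai/deployments/{model:path}/completions")
async def completions(model: str):
    return {"handler": "completions", "model": model}
```
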
Ishaan Jaff
1f4f1c6f70 stash /model/metrics/exceptions endpoints 2024-04-30 14:19:23 -07:00
Ishaan Jaff
4b8fda4ac4 log startTime and EndTime for exceptions 2024-04-30 13:34:14 -07:00
Ishaan Jaff
3aad034a8b feat log request kwargs in error logs 2024-04-30 13:28:26 -07:00
Ishaan Jaff
ad5fddef15 fix log model_group 2024-04-30 13:11:09 -07:00
Ishaan Jaff
ee2a2ce559 fix - log api_base in errors 2024-04-30 13:02:42 -07:00
Ishaan Jaff
06804bc70a fix - working exception writing 2024-04-30 12:48:17 -07:00
Ishaan Jaff
22725bd44d fix types for errorLog 2024-04-30 12:31:33 -07:00
Ishaan Jaff
ac1cabe963 add LiteLLM_ErrorLogs to types 2024-04-30 12:16:03 -07:00
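
Taken together, this commit and the error-logging commits above name the fields of an error-log record (startTime/endTime, api_base, model_group, litellm_model_name, request kwargs). A hedged sketch of that shape; the actual Prisma schema may differ:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

@dataclass
class ErrorLog:
    request_id: str
    startTime: datetime
    endTime: datetime
    api_base: str               # which upstream endpoint failed
    model_group: str            # user-facing model alias
    litellm_model_name: str     # underlying deployment model
    exception_type: str
    exception_string: str
    request_kwargs: dict[str, Any] = field(default_factory=dict)
```
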
Krrish Dholakia
5fe0f38558 docs(load_test.md): load test multiple instances of the proxy w/ tpm/rpm limits on deployments 2024-04-29 15:58:14 -07:00
Krrish Dholakia
7b617e666d fix(proxy_server.py): return more detailed auth error message. 2024-04-29 07:24:19 -07:00
CyanideByte
82be9a7e67
Merge branch 'BerriAI:main' into main 2024-04-27 20:51:33 -07:00
CyanideByte
03a43b99a5 Added _types.py cases from edwinjosegeorge PR#3340 2024-04-27 20:42:54 -07:00
Ishaan Jaff
de8f928bdd ui - new build 2024-04-27 17:28:30 -07:00
Krrish Dholakia
d9e0d7ce52 test: replace flaky endpoint 2024-04-27 16:37:09 -07:00
Ishaan Jaff
e49fe47d2e fix - only run global_proxy_spend on chat completion calls 2024-04-27 14:11:00 -07:00
Krish Dholakia
1a06f009d1
Merge branch 'main' into litellm_default_router_retries 2024-04-27 11:21:57 -07:00
Krrish Dholakia
e05764bdb7 fix(router.py): add /v1/ if missing to base url, for openai-compatible apis
Fixes https://github.com/BerriAI/litellm/issues/2279
2024-04-26 17:05:07 -07:00
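
The normalization described here is small enough to sketch directly (a generic version, not the router.py diff): OpenAI-compatible servers conventionally serve their API under `/v1`, so append it when absent.

```python
def ensure_v1_suffix(api_base: str) -> str:
    # Append /v1 only when the base url doesn't already end with it.
    base = api_base.rstrip("/")
    return base if base.endswith("/v1") else base + "/v1"

assert ensure_v1_suffix("http://localhost:8000") == "http://localhost:8000/v1"
assert ensure_v1_suffix("http://localhost:8000/v1/") == "http://localhost:8000/v1"
```
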
Krish Dholakia
4b0f73500f
Merge branch 'main' into litellm_default_router_retries 2024-04-26 14:52:24 -07:00
Krrish Dholakia
5583197d63 fix(proxy_server.py): fix setting offset-aware datetime 2024-04-25 21:18:32 -07:00
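
For context, the usual pitfall behind "offset-aware datetime" fixes in Python (a generic sketch, not the actual proxy_server.py change): naive and aware datetimes cannot be compared.

```python
from datetime import datetime, timezone

naive = datetime.utcnow()            # tzinfo is None (offset-naive)
aware = datetime.now(timezone.utc)   # offset-aware

# naive < aware would raise:
# TypeError: can't compare offset-naive and offset-aware datetimes
naive_as_utc = naive.replace(tzinfo=timezone.utc)
assert naive_as_utc <= datetime.now(timezone.utc)
```
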
Ishaan Jaff
1bb82ef42f ui - new build 2024-04-25 20:33:02 -07:00
Krish Dholakia
40b6b4794b
Merge pull request #3310 from BerriAI/litellm_langfuse_error_logging_2
fix(proxy/utils.py): log rejected proxy requests to langfuse
2024-04-25 19:49:59 -07:00
Krrish Dholakia
885de2e3c6 fix(proxy/utils.py): log rejected proxy requests to langfuse 2024-04-25 19:26:27 -07:00