Commit graph

1258 commits

Author SHA1 Message Date
Krrish Dholakia
db666b01e5 feat(proxy_server.py): add CRUD endpoints for 'end_user' management
allow admin to specify region + default models for end users
2024-05-08 18:50:36 -07:00
Krish Dholakia
91bb7cd261
Merge pull request #3437 from msabramo/add-engines-model-chat-completions-endpoint
Add `/engines/{model}/chat/completions` endpoint
2024-05-08 14:30:39 -07:00
Krish Dholakia
0e709fdc21
Merge branch 'main' into litellm_ui_fixes_6 2024-05-07 22:01:04 -07:00
Krrish Dholakia
fbcda918de feat(ui/model_dashboard.tsx): show if model is config or db model 2024-05-07 21:34:18 -07:00
Krrish Dholakia
5a16bec6a1 feat(model_dashboard.tsx): allow user to edit input cost per token for model on ui
also contains fixes for `/model/update`
2024-05-07 20:57:21 -07:00
Krrish Dholakia
312249ca44 feat(ui/model_dashboard.tsx): show if model is config or db model 2024-05-07 18:29:14 -07:00
Krish Dholakia
2aaaa5e1b4
Merge pull request #3506 from BerriAI/litellm_reintegrate_langfuse_url_slack_alert
feat(slack_alerting.py): reintegrate langfuse trace url for slack alerts
2024-05-07 15:03:29 -07:00
Krrish Dholakia
f210318bf1 fix(proxy_server.py): return budget duration in user response object 2024-05-07 13:47:32 -07:00
Krrish Dholakia
f2766fddbf fix(proxy_server.py): fix /v1/models bug where it would return empty list
handle 'all-team-models' being set for a given key
2024-05-07 13:43:15 -07:00
Krrish Dholakia
872470ff1f feat(slack_alerting.py): reintegrate langfuse trace url for slack alerts
this ensures langfuse trace url returned in llm api exception err
2024-05-07 12:58:49 -07:00
Ishaan Jaff
bfef424b39 fix don't let slack alert block /model/new 2024-05-06 20:47:29 -07:00
Ishaan Jaff
eb84c69ec6 fix - /model/new 2024-05-06 20:45:17 -07:00
Krish Dholakia
aa62d891a0
Merge branch 'main' into litellm_slack_daily_reports 2024-05-06 19:31:20 -07:00
Krrish Dholakia
26c0ed0f2d refactor(proxy_server.py): show ttl's on a top-level enum
Addresses - https://github.com/BerriAI/litellm/issues/2649#issuecomment-2097203372
2024-05-06 18:43:42 -07:00
Krrish Dholakia
6b9b4f05ba feat(proxy_server.py): schedule slack daily report if enabled
if user enabled daily_reports, send them a slack report every 12 hours
2024-05-06 18:25:48 -07:00
Ishaan Jaff
c600371e6e feat - send alert on adding new model 2024-05-06 15:45:07 -07:00
Ishaan Jaff
562ef2d2e1 fix - add better debugging on num_callbacks test 2024-05-06 13:42:20 -07:00
Ishaan Jaff
fccdb92c6b fix - select startTime and endTime on UI 2024-05-03 21:20:19 -07:00
Marc Abramowitz
eb433bde86 Add route: "/engines/{model:path}/chat/completions"
Without this, it results in:

```pytb
Traceback (most recent call last):
  File "/Users/abramowi/Code/OpenSource/litellm/litellm/proxy/proxy_server.py", line 3836, in completion
    raise HTTPException(
fastapi.exceptions.HTTPException: 400: {'error': 'completion: Invalid model name passed in model=gpt-3.5-turbo/chat'}
```
2024-05-03 18:02:29 -07:00
Ishaan Jaff
e7034ea53d feat - filter exceptions by model group 2024-05-03 16:54:24 -07:00
Ishaan Jaff
3dd1e8dfe7
Merge pull request #3427 from BerriAI/litellm_test_alert_size
[Test] - Ensure only 1 slack callback + Size of all callbacks does not grow
2024-05-03 16:27:16 -07:00
Krish Dholakia
1b35a75245
Merge pull request #3430 from BerriAI/litellm_return_api_base
feat(proxy_server.py): return api base in response headers
2024-05-03 16:25:21 -07:00
Krrish Dholakia
5b39f8e282 feat(proxy_server.py): return api base in response headers
Closes https://github.com/BerriAI/litellm/issues/2631
2024-05-03 15:27:32 -07:00
Ishaan Jaff
ab27866b6a fix test slack alerting len 2024-05-03 14:58:11 -07:00
Ishaan Jaff
3997ea6442 fix - return num callbacks in /active/callbacks 2024-05-03 14:24:01 -07:00
Ishaan Jaff
e99edaf4e1
Merge pull request #3426 from BerriAI/litellm_set_db_exceptions_on_ui
UI - set DB Exceptions webhook_url on UI
2024-05-03 14:05:37 -07:00
Ishaan Jaff
776f541f6c fix bug where slack would get inserted several times 2024-05-03 14:04:38 -07:00
Ishaan Jaff
23d334fe60 proxy - return num callbacks on /health/readiness 2024-05-03 09:14:32 -07:00
Marc Abramowitz
988c37fda3 Disambiguate invalid model name errors
because that error can be thrown in several different places, so
knowing the function it's being thrown from can be very useful for debugging.
2024-05-02 15:02:54 -07:00
Krish Dholakia
762a1fbd50
Merge pull request #3375 from msabramo/GH-3372
Fix route `/openai/deployments/{model}/chat/completions` not working properly
2024-05-02 13:00:25 -07:00
Krish Dholakia
fffbb73465
Merge branch 'main' into litellm_openmeter_integration 2024-05-01 21:19:29 -07:00
Krrish Dholakia
cdd3e1eef3 build(ui): enable adding openmeter via proxy ui 2024-05-01 21:16:23 -07:00
Ishaan Jaff
26eda88b26 feat - show slow count and total count 2024-05-01 17:18:14 -07:00
Ishaan Jaff
f48f4a767c feat - return slow responses on admin UI 2024-05-01 17:16:33 -07:00
Ishaan Jaff
e9dd4bbe57 fix - don't show cache hits on model latency tracker 2024-05-01 16:51:15 -07:00
Ishaan Jaff
2b467a847a fix latency tracking tool tip 2024-05-01 16:47:30 -07:00
Ishaan Jaff
94b98f5c4e clean up model latency metrics 2024-05-01 08:27:01 -07:00
Ishaan Jaff
b9238a00af ui - show tokens / sec 2024-04-30 22:44:28 -07:00
Ishaan Jaff
0c464f7f61 fix - viewing model metrics 2024-04-30 18:26:14 -07:00
Ishaan Jaff
f2849d0641 fix - track litellm_model_name in LiteLLM_ErrorLogs 2024-04-30 17:31:40 -07:00
Ishaan Jaff
8a1a043801 backend - show model latency per token 2024-04-30 17:23:36 -07:00
Ishaan Jaff
a2a8fef8f4 fix passing starttime and endtime to model/exceptions 2024-04-30 16:53:53 -07:00
Ishaan Jaff
26a5d85869 fix - backend return exceptions 2024-04-30 15:41:16 -07:00
Marc Abramowitz
dd166680d1 Move chat_completions before completions
so that the `chat_completions` route is defined before the `completions` route.
This is necessary because the `chat_completions` route is more
specific than the `completions` route, and the order of route definitions
matters in FastAPI.

Without this, doing a request to
`/openai/deployments/{model_in_url}/chat/completions` might trigger
`completions` being called (with `model` set to `{model_in_url}/chat`) instead of
`chat_completions` getting called, which is the correct function.

Fixes: GH-3372
2024-04-30 15:07:10 -07:00
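The route-ordering problem described in commit dd166680d1 can be illustrated with a small first-match dispatcher. This is only a sketch, not the proxy's code: `compile_route` and `dispatch` are hypothetical helpers written with the standard `re` module, and the `{model:path}` form of the templates is an assumption inferred from the `{model_in_url}/chat` behaviour in the message. It mimics the general idea that routes are tried in registration order and that a path-style parameter may span slashes.

```python
import re
from typing import Dict, List, Optional, Tuple


def compile_route(template: str) -> "re.Pattern[str]":
    """Hypothetical helper: turn a route template into an anchored regex."""
    # '{name:path}' may span slashes; a plain '{name}' stops at the next slash.
    pattern = re.sub(r"\{(\w+):path\}", r"(?P<\1>.+)", template)
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", pattern)
    return re.compile("^" + pattern + "$")


def dispatch(
    routes: List[Tuple[str, str]], path: str
) -> Tuple[Optional[str], Dict[str, str]]:
    """Return the first (handler name, params) whose template matches `path`."""
    for template, handler in routes:
        match = compile_route(template).match(path)
        if match:
            return handler, match.groupdict()
    return None, {}


# Assumed templates, modelled on the routes named in the commit message.
COMPLETIONS = ("/openai/deployments/{model:path}/completions", "completion")
CHAT = ("/openai/deployments/{model:path}/chat/completions", "chat_completion")

path = "/openai/deployments/gpt-3.5-turbo/chat/completions"

# Broad route registered first: it swallows '/chat' into the model name.
completions_first = [COMPLETIONS, CHAT]
print(dispatch(completions_first, path))

# More specific route registered first: the chat handler is selected.
chat_first = [CHAT, COMPLETIONS]
print(dispatch(chat_first, path))
```

With the broad template listed first, the sketch resolves the request to `completion` with `model` equal to `gpt-3.5-turbo/chat`; with the specific template first, it resolves to `chat_completion` with `model` equal to `gpt-3.5-turbo`. The first result has the same shape as the invalid model name reported in the traceback under commit eb433bde86.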
Ishaan Jaff
1f4f1c6f70 stash /model/metrics/exceptions endpoints 2024-04-30 14:19:23 -07:00
Ishaan Jaff
4b8fda4ac4 log startTime and EndTime for exceptions 2024-04-30 13:34:14 -07:00
Ishaan Jaff
3aad034a8b feat log request kwargs in error logs 2024-04-30 13:28:26 -07:00
Ishaan Jaff
ad5fddef15 fix log model_group 2024-04-30 13:11:09 -07:00
Ishaan Jaff
ee2a2ce559 fix - log api_base in errors 2024-04-30 13:02:42 -07:00
Ishaan Jaff
06804bc70a fix - working exception writing 2024-04-30 12:48:17 -07:00