Commit graph

2596 commits

Author SHA1 Message Date
Krrish Dholakia
dfcb6bcbc5 test(test_completion.py): skip sagemaker test - aws account suspended 2024-04-04 09:52:24 -07:00
Krish Dholakia
0c5b8a7667
Merge pull request #2827 from BerriAI/litellm_model_add_api
fix(proxy_server.py): persist models added via `/model/new` to db
2024-04-03 23:30:39 -07:00
Krrish Dholakia
346cd1876b fix: raise correct error 2024-04-03 22:37:51 -07:00
Krrish Dholakia
20849cbbfc fix(router.py): fix pydantic object logic 2024-04-03 21:57:19 -07:00
Krrish Dholakia
ef2f6ef6a2 test(test_acooldowns_router.py): fix tpm 2024-04-03 21:24:42 -07:00
Ishaan Jaff
fa44f45429 (ci/cd) run again 2024-04-03 21:02:08 -07:00
Ishaan Jaff
fb741d96ca test - voyage ai embedding 2024-04-03 20:54:35 -07:00
Krish Dholakia
6bc48d7e8d
Merge branch 'main' into litellm_model_add_api 2024-04-03 20:29:44 -07:00
Krrish Dholakia
f536fb13e6 fix(proxy_server.py): persist models added via /model/new to db
allows models to be used across instances

https://github.com/BerriAI/litellm/issues/2319 , https://github.com/BerriAI/litellm/issues/2329
2024-04-03 20:16:41 -07:00
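After this fix, models registered at runtime through the `/model/new` endpoint are written to the database, so other proxy instances (and restarts) see them too. A minimal sketch of such a call, assuming a local proxy on port 4000, an admin key, and a `model_name` + `litellm_params` payload (all illustrative):

```python
import requests

# Hedged sketch of registering a model at runtime via the proxy's /model/new
# endpoint named in the commit. The proxy URL, admin key, and exact payload
# shape are assumptions for illustration.
PROXY_BASE = "http://0.0.0.0:4000"  # hypothetical proxy address
ADMIN_KEY = "sk-1234"               # hypothetical admin key

resp = requests.post(
    f"{PROXY_BASE}/model/new",
    headers={"Authorization": f"Bearer {ADMIN_KEY}"},
    json={
        "model_name": "gpt-4-alias",                  # name clients will request
        "litellm_params": {"model": "openai/gpt-4"},  # underlying provider route
    },
    timeout=30,
)
print(resp.status_code, resp.json())
```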
Ishaan Jaff
d627c90bfd ci/cd run again 2024-04-03 20:13:46 -07:00
Krrish Dholakia
475144e5b7 fix(openai.py): support passing prompt as list instead of concat string 2024-04-03 15:23:20 -07:00
Krrish Dholakia
15e0099948 fix(proxy_server.py): return original model response via response headers - /v1/completions
to help devs with debugging
2024-04-03 13:05:43 -07:00
Krrish Dholakia
f17dd68df3 test(test_text_completion.py): unit testing for text completion pydantic object 2024-04-03 12:26:51 -07:00
Krrish Dholakia
1d341970ba feat(vertex_ai_anthropic.py): add claude 3 on vertex ai support - working .completions call
.completions() call works
2024-04-02 22:07:39 -07:00
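The commit above wires Claude 3 on Vertex AI into `litellm.completion`. A rough sketch of a call; the model string, GCP project, and region are assumptions for illustration:

```python
import litellm

# Hedged sketch of calling Claude 3 on Vertex AI through litellm, as added in
# vertex_ai_anthropic.py. Model string, project, and location are illustrative.
response = litellm.completion(
    model="vertex_ai/claude-3-sonnet@20240229",  # assumed route prefix + model id
    messages=[{"role": "user", "content": "Say hello"}],
    vertex_project="my-gcp-project",             # hypothetical GCP project
    vertex_location="us-central1",
)
print(response.choices[0].message.content)
```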
Ishaan Jaff
4d76ec43ac
Merge pull request #2808 from BerriAI/litellm_use_all_proxy_team_models_auth
[feat] use `all-proxy-models` and `all-team-models` with Admin UI
2024-04-02 21:48:30 -07:00
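The `all-proxy-models` / `all-team-models` aliases from this PR let a key or team reference every configured model without listing each one. A hedged sketch of generating such a key, assuming a local proxy, an admin key, and the standard `/key/generate` payload:

```python
import requests

# Hedged sketch: generate a proxy key whose "models" list uses the
# all-proxy-models alias so it can call every model configured on the proxy.
# Proxy URL, admin key, and payload details are assumptions.
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},
    json={"models": ["all-proxy-models"]},  # alias expands to every proxy model
    timeout=30,
)
print(resp.json())
```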
Krrish Dholakia
b5ca4cc235 test(test_update_spend.py): fix test with right init 2024-04-02 21:11:26 -07:00
Ishaan Jaff
afd81f1609 test new team request 2024-04-02 20:52:16 -07:00
CLARKBENHAM
44cb0f352a formatting 2024-04-02 19:56:07 -07:00
CLARKBENHAM
164898a213 fix lowest latency tests 2024-04-02 19:10:40 -07:00
CLARKBENHAM
29573b0967 param both tests to include failure (also fix prev) 2024-04-02 18:53:42 -07:00
Krrish Dholakia
d7601a4844 perf(proxy_server.py): batch write spend logs
reduces prisma client errors by batch writing spend logs - max 1k logs at a time
2024-04-02 18:46:55 -07:00
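The batching described above follows a simple pattern: buffer spend-log rows in memory and flush them in chunks of at most 1,000 instead of one Prisma write per request. An illustrative sketch of that pattern only; the names below are hypothetical, not the proxy's actual internals:

```python
# Hypothetical sketch of the batching pattern from the commit message:
# queue spend-log rows, then write them in chunks of up to 1,000 per DB call.
MAX_BATCH_SIZE = 1_000
spend_log_buffer = []  # rows queued since the last flush


def enqueue_spend_log(row: dict) -> None:
    """Queue a spend-log row instead of writing it immediately."""
    spend_log_buffer.append(row)


async def flush_spend_logs(db_client) -> None:
    """Drain the buffer, writing at most MAX_BATCH_SIZE rows per call."""
    while spend_log_buffer:
        batch = spend_log_buffer[:MAX_BATCH_SIZE]
        del spend_log_buffer[:MAX_BATCH_SIZE]
        await db_client.insert_many(batch)  # hypothetical bulk-insert call
```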
CLARKBENHAM
4f95966475 tests showing error 2024-04-02 18:45:05 -07:00
Ishaan Jaff
21379eb56d
Merge pull request #2801 from BerriAI/litellm_support_all_models_as_a_ui_alias
[UI] use all_models alias
2024-04-02 17:53:25 -07:00
Ishaan Jaff
3245d8cdce support all-proxy-models for teams 2024-04-02 16:04:09 -07:00
Ishaan Jaff
b83c452ddd support all-models-on-proxy 2024-04-02 15:52:54 -07:00
Ishaan Jaff
73ef4780f7 (fix) support all-models alias on backend 2024-04-02 15:12:37 -07:00
Krrish Dholakia
b07788d2a5 fix(openai.py): return logprobs for text completion calls 2024-04-02 14:05:56 -07:00
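With this fix, logprobs requested on the text-completion route are returned to the caller. A small sketch; the model name and `logprobs` value are chosen for illustration:

```python
import litellm

# Sketch of requesting logprobs on the text-completion route.
response = litellm.text_completion(
    model="gpt-3.5-turbo-instruct",
    prompt="The capital of France is",
    max_tokens=5,
    logprobs=3,  # ask for top-3 token logprobs per position
)
print(response.choices[0].logprobs)
```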
Krrish Dholakia
0d949d71ab fix(main.py): support text completion input being a list of strings
addresses - https://github.com/BerriAI/litellm/issues/2792, https://github.com/BerriAI/litellm/issues/2777
2024-04-02 08:50:16 -07:00
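A sketch of the input shape these fixes restore: the OpenAI-style `prompt` may be a list of strings, producing one choice per prompt, instead of a single concatenated string. The model name is illustrative:

```python
import litellm

# Sketch: pass prompt as a list of strings; each entry gets its own choice.
response = litellm.text_completion(
    model="gpt-3.5-turbo-instruct",
    prompt=["Write a haiku about the sea.", "Write a haiku about mountains."],
    max_tokens=30,
)
for choice in response.choices:
    print(choice.text)
```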
Ishaan Jaff
92984a1c6f
Merge pull request #2788 from BerriAI/litellm_support_-_models
[Feat] Allow using model = * on proxy config.yaml
2024-04-01 19:46:50 -07:00
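The wildcard support added here means a single `model = *` entry in the proxy's config.yaml matches any requested model. A rough sketch of the equivalent `model_list` entry expressed in Python; treating `openai/*` as the pass-through target is an assumption for illustration:

```python
from litellm import Router

# Rough, hedged sketch of wildcard routing: one "*" entry matches any
# requested model instead of listing each one; "openai/*" as the pass-through
# target is an assumption.
router = Router(
    model_list=[
        {
            "model_name": "*",                        # match any requested model
            "litellm_params": {"model": "openai/*"},  # assumed OpenAI pass-through
        }
    ]
)
```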
Ishaan Jaff
98df2b027b test test_wildcard_openai_routing 2024-04-01 19:46:07 -07:00
Krrish Dholakia
c3e4af76cf refactor: fix linting issue 2024-04-01 18:11:38 -07:00
Krrish Dholakia
6467dd4e11 fix(tpm_rpm_limiter.py): fix cache init logic 2024-04-01 18:01:38 -07:00
Krrish Dholakia
52b1538b2e fix(router.py): support context window fallbacks for pre-call checks 2024-04-01 10:51:54 -07:00
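A sketch of the combination this fix targets: with pre-call checks enabled, a prompt that exceeds the primary deployment's context window falls back to a larger-context deployment before the call is made. Model names are illustrative, and the parameter names are assumptions based on litellm's Router at the time:

```python
from litellm import Router

# Hedged sketch of context-window fallbacks with pre-call checks.
router = Router(
    model_list=[
        {"model_name": "gpt-3.5-turbo", "litellm_params": {"model": "gpt-3.5-turbo"}},
        {"model_name": "gpt-3.5-turbo-16k", "litellm_params": {"model": "gpt-3.5-turbo-16k"}},
    ],
    context_window_fallbacks=[{"gpt-3.5-turbo": ["gpt-3.5-turbo-16k"]}],
    enable_pre_call_checks=True,  # measure prompt size before routing
)

response = router.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "a very long prompt ..."}],
)
```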
Krrish Dholakia
c9e6b05cfb test(test_max_tpm_rpm_limiter.py): add unit testing for redis namespaces working for tpm/rpm limits 2024-04-01 10:39:03 -07:00
Krrish Dholakia
f3e47323b9 test(test_max_tpm_rpm_limiter.py): unit tests for key + team based tpm rpm limits on proxy 2024-04-01 08:11:30 -07:00
Ishaan Jaff
ddb35facc0 ci/cd run again 2024-04-01 07:40:05 -07:00
Krrish Dholakia
aebb0e489c test: fix test 2024-04-01 07:29:56 -07:00
Krrish Dholakia
583e334bd2 fix(utils.py): set redis_usage_cache to none by default 2024-04-01 07:29:56 -07:00
Krish Dholakia
2ca303ec0e
Merge pull request #2748 from BerriAI/litellm_anthropic_tool_calling_list_parsing_fix
fix(factory.py): parse list in xml tool calling response (anthropic)
2024-03-30 11:27:02 -07:00
Krrish Dholakia
22d5603778 ci(config.yml): add lunary to circle ci 2024-03-29 22:09:21 -07:00
Vincelwt
1b84dfac91
Merge branch 'main' into main 2024-03-30 13:21:53 +09:00
Krrish Dholakia
cbf35087c7 test(test_key_generate_prisma.py): fix test 2024-03-29 20:30:43 -07:00
Krrish Dholakia
3810b050c1 fix(proxy_server.py): increment cached global proxy spend object 2024-03-29 20:02:31 -07:00
Krrish Dholakia
5280fc809f fix(proxy_server.py): enforce end user budgets with 'litellm.max_end_user_budget' param 2024-03-29 17:14:40 -07:00
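The budget knob named in this commit is a module-level setting that the proxy enforces per end user, identified by the `user` field on each request. A hedged sketch; the amount, model, and user id are illustrative:

```python
import litellm

# Hedged sketch of the budget setting named in the commit. The proxy reads
# litellm.max_end_user_budget (typically supplied via litellm_settings in
# config.yaml) and rejects requests once an end user's tracked spend exceeds it.
litellm.max_end_user_budget = 0.0001  # max USD spend per end user

response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hi"}],
    user="end-user-123",  # the end-user id the budget is tracked against
)
```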
Krrish Dholakia
bbd94f504c test(test_rules.py): fix assert 2024-03-29 13:12:16 -07:00
Krrish Dholakia
49642a5b00 fix(factory.py): parse list in xml tool calling response (anthropic)
improves tool calling output parsing to check if the response contains a list. Also returns the raw response back to the user via `response._hidden_params["original_response"]`, so the user can see exactly what anthropic returned
2024-03-29 11:51:26 -07:00
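A sketch of the escape hatch mentioned above: after an Anthropic tool-calling response, the untouched provider payload is available via `response._hidden_params["original_response"]`. The model and tool schema are illustrative:

```python
import litellm

# Sketch: inspect the raw Anthropic output exposed alongside the parsed
# tool calls. Tool schema and model are illustrative.
response = litellm.completion(
    model="claude-3-opus-20240229",
    messages=[{"role": "user", "content": "What's the weather in Boston?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
)
print(response.choices[0].message.tool_calls)
print(response._hidden_params["original_response"])  # raw Anthropic output
```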
Krrish Dholakia
109cd93a39 fix(sagemaker.py): support model_id consistently. support dynamic args for async calls 2024-03-29 09:05:00 -07:00
Krrish Dholakia
d547944556 fix(sagemaker.py): support 'model_id' param for sagemaker
allow passing inference component param to sagemaker in the same format as we handle this for bedrock
2024-03-29 08:43:17 -07:00
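These two commits add a `model_id` parameter for SageMaker, pointing a call at a specific inference component behind an endpoint, in the same format already used for Bedrock. A hedged sketch with hypothetical endpoint and component names:

```python
import litellm

# Hedged sketch of the model_id param for SageMaker; endpoint and inference
# component names are hypothetical.
response = litellm.completion(
    model="sagemaker/my-llama-endpoint",  # hypothetical endpoint name
    messages=[{"role": "user", "content": "Hello"}],
    model_id="llama-2-7b-component",      # hypothetical inference component
)
print(response.choices[0].message.content)
```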
Krrish Dholakia
cd53291b62 fix(utils.py): support bedrock mistral streaming 2024-03-29 07:56:10 -07:00
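A sketch of the streaming path this fix covers, using a Bedrock-hosted Mistral model; the Bedrock model id is illustrative:

```python
import litellm

# Hedged sketch of streaming a Bedrock Mistral model; model id is illustrative.
response = litellm.completion(
    model="bedrock/mistral.mistral-7b-instruct-v0:2",
    messages=[{"role": "user", "content": "Tell me a short joke"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content or "", end="")
```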
Krrish Dholakia
5a117490ec fix(proxy_server.py): fix tpm/rpm limiting for jwt auth
fixes tpm/rpm limiting for jwt auth and implements unit tests for jwt auth
2024-03-28 21:19:34 -07:00