Krish Dholakia
e0d81434ed
LiteLLM minor fixes + improvements (31/08/2024) ( #5464 )
* fix(vertex_endpoints.py): fix vertex ai pass through endpoints
* test(test_streaming.py): skip model due to end of life
* feat(custom_logger.py): add special callback for model hitting tpm/rpm limits
Closes https://github.com/BerriAI/litellm/issues/4096
2024-09-01 13:31:42 -07:00
Krish Dholakia
dd7b008161
fix: Minor LiteLLM Fixes + Improvements (29/08/2024) ( #5436 )
* fix(model_checks.py): support returning wildcard models on `/v1/models`
Fixes https://github.com/BerriAI/litellm/issues/4903
* fix(bedrock_httpx.py): support calling bedrock via api_base
Closes https://github.com/BerriAI/litellm/pull/4587
* fix(litellm_logging.py): only leave last 4 char of gemini key unmasked
Fixes https://github.com/BerriAI/litellm/issues/5433
* feat(router.py): support setting 'weight' param for models on router
Closes https://github.com/BerriAI/litellm/issues/5410
* test(test_bedrock_completion.py): add unit test for custom api base
* fix(model_checks.py): handle no "/" in model
2024-08-29 22:40:25 -07:00
Krrish Dholakia
f0fb8bdf45
fix(router.py): fix cooldown check
2024-08-28 16:38:42 -07:00
Ishaan Jaff
5f2f7aa754
feat - add rerank on proxy
2024-08-27 17:36:40 -07:00
Krrish Dholakia
deff357c92
fix(router.py): fix aembedding type hints
Fixes https://github.com/BerriAI/litellm/issues/5383
2024-08-27 14:29:18 -07:00
Krrish Dholakia
33972cc79c
fix(router.py): enable dynamic retry after in exception string
Updates cooldown logic to cooldown individual models
Closes https://github.com/BerriAI/litellm/issues/1339
2024-08-24 16:59:30 -07:00
Krrish Dholakia
068aafdff9
fix(utils.py): correctly re-raise the headers from an exception, if present
Fixes issue where retry after on router was not using azure / openai numbers
2024-08-24 12:30:30 -07:00
Krrish Dholakia
0b06a76cf9
fix(router.py): don't cooldown on apiconnectionerrors
Fixes issue where model would be in cooldown due to api connection errors
2024-08-24 09:53:05 -07:00
Krrish Dholakia
008fa494a7
fix(router.py): fix linting error
2024-08-21 15:35:10 -07:00
Ishaan Jaff
c25a69fa78
test test_using_default_working_fallback
2024-08-20 13:32:55 -07:00
Ishaan Jaff
f6d97c25f2
fix run sync fallbacks
2024-08-20 12:55:36 -07:00
Ishaan Jaff
e4b5e88a57
fix fallbacks: don't recurse on the same fallback
2024-08-20 12:50:20 -07:00
Ishaan Jaff
e28b240a5b
fix don't retry errors when no healthy deployments available
2024-08-20 12:17:05 -07:00
Ishaan Jaff
19c3a82d1b
test + never retry on 404 errors
2024-08-20 11:59:43 -07:00
Ishaan Jaff
08db691dec
use model access groups for teams
2024-08-17 16:45:53 -07:00
Krrish Dholakia
61f4b71ef7
refactor: replace .error() with .exception() logging for better debugging on sentry
2024-08-16 09:22:47 -07:00
Ishaan Jaff
0238ab077d
v0 track fallback events
2024-08-10 13:31:00 -07:00
Krrish Dholakia
7b6db63d30
fix(router.py): fallback on 400-status code requests
2024-08-09 12:16:49 -07:00
Krrish Dholakia
400653992c
feat(router.py): allow using .acompletion() for request prioritization
allows the /chat/completions endpoint to work for request prioritization calls
2024-08-07 16:43:12 -07:00
Ishaan Jaff
9cd437135b
fix getting provider_specific_deployment
2024-08-07 15:20:59 -07:00
Ishaan Jaff
f1ffa82062
fix use provider specific routing
2024-08-07 14:37:20 -07:00
Ishaan Jaff
5d7a1b2ec6
router use provider specific wildcard routing
2024-08-07 14:12:10 -07:00
Ishaan Jaff
18305b23f4
add + test provider specific routing
2024-08-07 13:49:46 -07:00
Krrish Dholakia
f0f900d69e
fix(router.py): add reason for fallback failure to client-side exception string
make it easier to debug why a fallback failed to occur
2024-08-07 13:02:47 -07:00
Ishaan Jaff
d1e519afd1
use router_cooldown_handler
2024-08-07 10:40:55 -07:00
Krrish Dholakia
ce39649b2a
fix: fix test to specify allowed_fails
2024-08-05 21:39:59 -07:00
Krrish Dholakia
7a0792c918
fix(router.py): move deployment cooldown list message to error log, not client-side
don't show user all deployments
2024-08-03 12:49:39 -07:00
Krrish Dholakia
6b8806b45f
feat(router.py): add flag for mock testing loadbalancing for rate limit errors
2024-08-03 12:34:11 -07:00
Krrish Dholakia
c65a438de2
fix(utils.py): fix linting errors
2024-07-30 18:38:10 -07:00
Krrish Dholakia
ec6db03c41
fix(router.py): gracefully handle scenario where completion response doesn't have total tokens
Closes https://github.com/BerriAI/litellm/issues/4968
2024-07-30 15:14:03 -07:00
Krrish Dholakia
b25d4a8cb3
feat(ollama_chat.py): support ollama tool calling
Closes https://github.com/BerriAI/litellm/issues/4812
2024-07-26 21:51:54 -07:00
Krrish Dholakia
84482703b8
docs(config.md): update wildcard docs
2024-07-26 08:59:53 -07:00
Ishaan Jaff
8f4c5437b8
router support setting pass_through_all_models
2024-07-25 18:34:12 -07:00
Krrish Dholakia
711496e260
fix(router.py): add support for diskcache to router
2024-07-25 14:30:46 -07:00
Ishaan Jaff
28bb2919b6
fix - test router debug logs
2024-07-20 18:45:31 -07:00
Ishaan Jaff
4038b3dcea
router - use verbose logger when using litellm.Router
2024-07-20 17:36:25 -07:00
Ishaan Jaff
08adda7091
control using enable_tag_filtering
2024-07-18 19:39:04 -07:00
Ishaan Jaff
4d0fbfea83
router - refactor to tag based routing
2024-07-18 19:22:09 -07:00
Ishaan Jaff
4b96cd46b2
Merge pull request #4786 from BerriAI/litellm_use_model_tier_keys
[Feat-Enterprise] Use free/paid tiers for Virtual Keys
2024-07-18 18:07:09 -07:00
Krrish Dholakia
b23a633cf1
fix(utils.py): fix status code in exception mapping
2024-07-18 18:04:59 -07:00
Ishaan Jaff
64e38562d9
router - use free paid tier routing
2024-07-18 17:09:42 -07:00
Krrish Dholakia
0a94953896
fix(router.py): check for request_timeout in acompletion
support 'request_timeout' param in router acompletion
2024-07-17 17:19:06 -07:00
Ishaan Jaff
e65daef572
router return get_deployment_by_model_group_name
2024-07-15 19:27:12 -07:00
Krish Dholakia
dacce3d78b
Merge pull request #4635 from BerriAI/litellm_anthropic_adapter
Anthropic `/v1/messages` endpoint support
2024-07-10 22:41:53 -07:00
Krrish Dholakia
31829855c0
feat(proxy_server.py): working /v1/messages with config.yaml
Adds async router support for adapter_completion call
2024-07-10 18:53:54 -07:00
Ishaan Jaff
62f475919b
feat - add DELETE assistants endpoint
2024-07-10 11:37:37 -07:00
Ishaan Jaff
f5eb862635
router - add acreate_assistants
2024-07-09 09:46:28 -07:00
Krish Dholakia
8661da1980
Merge branch 'main' into litellm_fix_httpx_transport
2024-07-06 19:12:06 -07:00
Ishaan Jaff
2609de43d0
use helper for init client + check if we should init sync clients
2024-07-06 12:52:41 -07:00
Krrish Dholakia
86632f6da0
fix(types/router.py): add custom pricing info to 'model_info'
Fixes https://github.com/BerriAI/litellm/issues/4542
2024-07-04 16:07:58 -07:00