Commit graph

551 commits

Author SHA1 Message Date
Krish Dholakia
4ac66bd843
LiteLLM Minor Fixes and Improvements (09/07/2024) (#5580)
* fix(litellm_logging.py): set completion_start_time_float to end_time_float if none

Fixes https://github.com/BerriAI/litellm/issues/5500

* feat(__init__.py): add new 'openai_text_completion_compatible_providers' list

Fixes https://github.com/BerriAI/litellm/issues/5558

Correctly routes fireworks ai calls when done via text completions

* fix: fix linting errors

* fix: fix linting errors

* fix(openai.py): fix exception raised

* fix(openai.py): fix error handling

* fix(_redis.py): allow all supported arguments for redis cluster (#5554)

* Revert "fix(_redis.py): allow all supported arguments for redis cluster (#5554)" (#5583)

This reverts commit f2191ef4cb.

* fix(router.py): return model alias w/ underlying deployment on router.get_model_list()

Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666

* test: handle flaky tests

---------

Co-authored-by: Jonas Dittrich <58814480+Kakadus@users.noreply.github.com>
2024-09-09 18:54:17 -07:00
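Per the `openai_text_completion_compatible_providers` change above, fireworks ai text-completion calls now route through the OpenAI-compatible text-completion path. A minimal sketch of the call shape involved, assuming litellm's documented `text_completion` interface (the model path is illustrative):

```python
# Requires FIREWORKS_AI_API_KEY in the environment; model path is illustrative.
import litellm

response = litellm.text_completion(
    model="fireworks_ai/accounts/fireworks/models/llama-v3p1-8b-instruct",
    prompt="Say this is a test",
)
print(response.choices[0].text)
```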
Elad Segal
da30da9a97
Properly use allowed_fails_policy when it has fields with a value of 0 (#5604) 2024-09-09 16:35:12 -07:00
Krrish Dholakia
0a016d33e6 Revert "fix(router.py): return model alias w/ underlying deployment on router.get_model_list()"
This reverts commit 638896309c.
2024-09-07 18:04:56 -07:00
Krrish Dholakia
638896309c fix(router.py): return model alias w/ underlying deployment on router.get_model_list()
Fixes https://github.com/BerriAI/litellm/issues/5524#issuecomment-2336410666
2024-09-07 18:01:31 -07:00
Krish Dholakia
72e961af3c
LiteLLM Minor Fixes and Improvements (09/06/2024) (#5567)
* fix(utils.py): return citations for perplexity streaming

Fixes https://github.com/BerriAI/litellm/issues/5535

* fix(anthropic/chat.py): support fallbacks for anthropic streaming (#5542)

* fix(anthropic/chat.py): support fallbacks for anthropic streaming

Fixes https://github.com/BerriAI/litellm/issues/5512

* fix(anthropic/chat.py): use module level http client if none given (prevents early client closure)

* fix: fix linting errors

* fix(http_handler.py): fix raise_for_status error handling

* test: retry flaky test

* fix otel type

* fix(bedrock/embed): fix error raising

* test(test_openai_batches_and_files.py): skip azure batches test (for now) - quota exceeded

* fix(test_router.py): skip azure batch route test (for now) - hit batch quota limits

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* All `model_group_alias` should show up in `/models`, `/model/info`, `/model_group/info` (#5539)

* fix(router.py): support returning model_alias model names in `/v1/models`

* fix(proxy_server.py): support returning model aliases on `/model/info`

* feat(router.py): support returning model group alias for `/model_group/info`

* fix(proxy_server.py): fix linting errors

* fix(proxy_server.py): fix linting errors

* build(model_prices_and_context_window.json): add amazon titan text premier pricing information

Closes https://github.com/BerriAI/litellm/issues/5560

* feat(litellm_logging.py): log standard logging response object for pass through endpoints. Allows bedrock /invoke agent calls to be correctly logged to langfuse + s3

* fix(success_handler.py): fix linting error

* fix(success_handler.py): fix linting errors

* fix(team_endpoints.py): Allows admin to update team member budgets

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-09-06 17:16:24 -07:00
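A sketch of how the `model_group_alias` behavior above can be exercised, assuming the Router constructor's documented `model_group_alias` parameter (model names are illustrative):

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "gpt-3.5-turbo"},
        }
    ],
    # requests for "gpt-4" are served by the "gpt-3.5-turbo" group;
    # per #5539 the alias should also surface on /models and /model/info
    model_group_alias={"gpt-4": "gpt-3.5-turbo"},
)

# per #5580, returns the alias w/ its underlying deployment
print(router.get_model_list())
```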
Ishaan Jaff
81ee1653af use correct type hints for audio transcriptions 2024-09-05 09:12:27 -07:00
Krish Dholakia
1e7e538261
LiteLLM Minor fixes + improvements (09/04/2024) (#5505)
* Minor IAM AWS OIDC Improvements (#5246)

* AWS IAM: Temporary tokens are valid across all regions after being issued, so it is wasteful to request one for each region.

* AWS IAM: Include an inline policy, to help reduce misuse of overly permissive IAM roles.

* (test_bedrock_completion.py): Ensure we are testing cross AWS region OIDC flow.

* fix(router.py): log rejected requests

Fixes https://github.com/BerriAI/litellm/issues/5498

* refactor: don't use verbose_logger.exception, if exception is raised

The user might already have handling for this, but alerting systems in prod will raise it as an unhandled error.

* fix(datadog.py): support setting datadog source as an env var

Fixes https://github.com/BerriAI/litellm/issues/5508

* docs(logging.md): add dd_source to datadog docs

* fix(proxy_server.py): expose `/customer/list` endpoint for showing all customers

* (bedrock): Fix usage with Cloudflare AI Gateway, and proxies in general. (#5509)

* feat(anthropic.py): support 'cache_control' param for content when it is a string

* Revert "(bedrock): Fix usage with Cloudflare AI Gateway, and proxies in gener…" (#5519)

This reverts commit 3fac0349c2.

* refactor: ci/cd run again

---------

Co-authored-by: David Manouchehri <david.manouchehri@ai.moda>
2024-09-04 22:16:55 -07:00
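A hedged sketch of the datadog source override described above, assuming the variable is read as `DD_SOURCE` (the dd_source docs change suggests this; values are placeholders):

```python
import os
import litellm

os.environ["DD_API_KEY"] = "your-datadog-api-key"  # placeholder
os.environ["DD_SITE"] = "us5.datadoghq.com"        # placeholder
os.environ["DD_SOURCE"] = "litellm-proxy-prod"     # custom source tag; assumed name

litellm.success_callback = ["datadog"]
```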
Krish Dholakia
9f3fa29624
feat(router.py): Support Loadbalancing batch azure api endpoints (#5469)
* feat(router.py): initial commit for loadbalancing azure batch api endpoints

Closes https://github.com/BerriAI/litellm/issues/5396

* fix(router.py): working `router.acreate_file()`

* feat(router.py): working router.acreate_batch endpoint

* feat(router.py): expose router.aretrieve_batch function

Makes it easy for the user to retrieve batch information

* feat(router.py): support 'router.alist_batches' endpoint

Adds support for getting all batches across all endpoints

* feat(router.py): working loadbalancing on `/v1/files`

* feat(proxy_server.py): working loadbalancing on `/v1/batches`

* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
Krish Dholakia
e0d81434ed
LiteLLM minor fixes + improvements (31/08/2024) (#5464)
* fix(vertex_endpoints.py): fix vertex ai pass through endpoints

* test(test_streaming.py): skip model due to end of life

* feat(custom_logger.py): add special callback for model hitting tpm/rpm limits

Closes https://github.com/BerriAI/litellm/issues/4096
2024-09-01 13:31:42 -07:00
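A sketch of wiring up the tpm/rpm-limit callback added in this commit. The hook name and signature below are assumptions based on the commit description, not confirmed API:

```python
import litellm
from litellm.integrations.custom_logger import CustomLogger

class RateLimitAlerter(CustomLogger):
    # hypothetical hook name/signature; see the commit for the actual callback
    def log_model_group_rate_limit_error(self, exception, original_model_group, kwargs):
        print(f"model group {original_model_group} hit its tpm/rpm limit: {exception}")

litellm.callbacks = [RateLimitAlerter()]
```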
Krish Dholakia
dd7b008161
fix: Minor LiteLLM Fixes + Improvements (29/08/2024) (#5436)
* fix(model_checks.py): support returning wildcard models on `/v1/models`

Fixes https://github.com/BerriAI/litellm/issues/4903

* fix(bedrock_httpx.py): support calling bedrock via api_base

Closes https://github.com/BerriAI/litellm/pull/4587

* fix(litellm_logging.py): only leave last 4 char of gemini key unmasked

Fixes https://github.com/BerriAI/litellm/issues/5433

* feat(router.py): support setting 'weight' param for models on router

Closes https://github.com/BerriAI/litellm/issues/5410

* test(test_bedrock_completion.py): add unit test for custom api base

* fix(model_checks.py): handle no "/" in model
2024-08-29 22:40:25 -07:00
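A sketch of the `weight` param from this commit: within one model group, a higher weight biases the router toward that deployment (keys follow litellm's router config conventions; placeholders marked):

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "gpt-3.5-turbo", "weight": 2},  # picked ~2x as often
        },
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "azure/gpt-35-turbo",
                "api_base": "https://example.openai.azure.com",  # placeholder
                "api_key": "sk-...",  # placeholder
                "weight": 1,
            },
        },
    ]
)
```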
Krrish Dholakia
f0fb8bdf45 fix(router.py): fix cooldown check 2024-08-28 16:38:42 -07:00
Ishaan Jaff
5f2f7aa754 feat - add rerank on proxy 2024-08-27 17:36:40 -07:00
Krrish Dholakia
deff357c92 fix(router.py): fix aembedding type hints
Fixes https://github.com/BerriAI/litellm/issues/5383
2024-08-27 14:29:18 -07:00
Krrish Dholakia
33972cc79c fix(router.py): enable dynamic retry after in exception string
Updates cooldown logic to cooldown individual models

Closes https://github.com/BerriAI/litellm/issues/1339
2024-08-24 16:59:30 -07:00
Krrish Dholakia
068aafdff9 fix(utils.py): correctly re-raise the headers from an exception, if present
Fixes issue where retry after on router was not using azure / openai numbers
2024-08-24 12:30:30 -07:00
Krrish Dholakia
0b06a76cf9 fix(router.py): don't cooldown on apiconnectionerrors
Fixes issue where model would be in cooldown due to api connection errors
2024-08-24 09:53:05 -07:00
Krrish Dholakia
008fa494a7 fix(router.py): fix linting error 2024-08-21 15:35:10 -07:00
Ishaan Jaff
c25a69fa78 test test_using_default_working_fallback 2024-08-20 13:32:55 -07:00
Ishaan Jaff
f6d97c25f2 fix run sync fallbacks 2024-08-20 12:55:36 -07:00
Ishaan Jaff
e4b5e88a57 fix fallbacks don't recurse on the same fallback 2024-08-20 12:50:20 -07:00
Ishaan Jaff
e28b240a5b fix don't retry errors when no healthy deployments available 2024-08-20 12:17:05 -07:00
Ishaan Jaff
19c3a82d1b test + never retry on 404 errors 2024-08-20 11:59:43 -07:00
Ishaan Jaff
08db691dec use model access groups for teams 2024-08-17 16:45:53 -07:00
Krrish Dholakia
61f4b71ef7 refactor: replace .error() with .exception() logging for better debugging on sentry 2024-08-16 09:22:47 -07:00
Ishaan Jaff
0238ab077d v0 track fallback events 2024-08-10 13:31:00 -07:00
Krrish Dholakia
7b6db63d30 fix(router.py): fallback on 400-status code requests 2024-08-09 12:16:49 -07:00
Krrish Dholakia
400653992c feat(router.py): allow using .acompletion() for request prioritization
allows /chat/completion endpoint to work for request prioritization calls
2024-08-07 16:43:12 -07:00
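A sketch of request prioritization through `.acompletion()`, assuming the scheduler's `priority` kwarg where a lower number means the request is handled sooner:

```python
import asyncio
from litellm import Router

router = Router(
    model_list=[
        {"model_name": "gpt-3.5-turbo", "litellm_params": {"model": "gpt-3.5-turbo"}}
    ]
)

async def main():
    resp = await router.acompletion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "hi"}],
        priority=0,  # lower = higher priority (assumed semantics)
    )
    print(resp.choices[0].message.content)

asyncio.run(main())
```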
Ishaan Jaff
9cd437135b fix getting provider_specific_deployment 2024-08-07 15:20:59 -07:00
Ishaan Jaff
f1ffa82062 fix use provider specific routing 2024-08-07 14:37:20 -07:00
Ishaan Jaff
5d7a1b2ec6 router use provider specific wildcard routing 2024-08-07 14:12:10 -07:00
Ishaan Jaff
18305b23f4 add + test provider specific routing 2024-08-07 13:49:46 -07:00
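A sketch of provider-specific wildcard routing as added above: one wildcard entry covers every model from a provider. Config keys follow litellm's router conventions; the api key is a placeholder:

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "anthropic/*",  # matches any anthropic model
            "litellm_params": {"model": "anthropic/*", "api_key": "sk-ant-..."},  # placeholder key
        }
    ]
)

# routed through the wildcard deployment
response = router.completion(
    model="anthropic/claude-3-haiku-20240307",
    messages=[{"role": "user", "content": "hi"}],
)
```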
Krrish Dholakia
f0f900d69e fix(router.py): add reason for fallback failure to client-side exception string
make it easier to debug why a fallback failed to occur
2024-08-07 13:02:47 -07:00
Ishaan Jaff
d1e519afd1 use router_cooldown_handler 2024-08-07 10:40:55 -07:00
Krrish Dholakia
ce39649b2a fix: fix test to specify allowed_fails 2024-08-05 21:39:59 -07:00
Krrish Dholakia
7a0792c918 fix(router.py): move deployment cooldown list message to error log, not client-side
don't show user all deployments
2024-08-03 12:49:39 -07:00
Krrish Dholakia
6b8806b45f feat(router.py): add flag for mock testing loadbalancing for rate limit errors 2024-08-03 12:34:11 -07:00
Krrish Dholakia
c65a438de2 fix(utils.py): fix linting errors 2024-07-30 18:38:10 -07:00
Krrish Dholakia
ec6db03c41 fix(router.py): gracefully handle scenario where completion response doesn't have total tokens
Closes https://github.com/BerriAI/litellm/issues/4968
2024-07-30 15:14:03 -07:00
Krrish Dholakia
b25d4a8cb3 feat(ollama_chat.py): support ollama tool calling
Closes https://github.com/BerriAI/litellm/issues/4812
2024-07-26 21:51:54 -07:00
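A sketch of ollama tool calling through litellm, following this commit; the model name and tool schema are illustrative:

```python
import litellm

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool
            "description": "Get the weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = litellm.completion(
    model="ollama_chat/llama3.1",  # assumes a local ollama server with this model
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```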
Krrish Dholakia
84482703b8 docs(config.md): update wildcard docs 2024-07-26 08:59:53 -07:00
Ishaan Jaff
8f4c5437b8 router support setting pass_through_all_models 2024-07-25 18:34:12 -07:00
Krrish Dholakia
711496e260 fix(router.py): add support for diskcache to router 2024-07-25 14:30:46 -07:00
Ishaan Jaff
28bb2919b6 fix - test router debug logs 2024-07-20 18:45:31 -07:00
Ishaan Jaff
4038b3dcea router - use verbose logger when using litellm.Router 2024-07-20 17:36:25 -07:00
Ishaan Jaff
08adda7091 control using enable_tag_filtering 2024-07-18 19:39:04 -07:00
Ishaan Jaff
4d0fbfea83 router - refactor to tag based routing 2024-07-18 19:22:09 -07:00
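A sketch of tag-based routing combined with the `enable_tag_filtering` flag from the commits above: deployments carry tags, and a request selects among them via its metadata (deployment names are illustrative):

```python
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4",
            "litellm_params": {"model": "openai/gpt-3.5-turbo", "tags": ["free"]},
        },
        {
            "model_name": "gpt-4",
            "litellm_params": {"model": "openai/gpt-4o", "tags": ["paid"]},
        },
    ],
    enable_tag_filtering=True,
)

response = router.completion(
    model="gpt-4",
    messages=[{"role": "user", "content": "hi"}],
    metadata={"tags": ["paid"]},  # routes to the 'paid' deployment
)
```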
Ishaan Jaff
4b96cd46b2
Merge pull request #4786 from BerriAI/litellm_use_model_tier_keys
[Feat-Enterprise] Use free/paid tiers for Virtual Keys
2024-07-18 18:07:09 -07:00
Krrish Dholakia
b23a633cf1 fix(utils.py): fix status code in exception mapping 2024-07-18 18:04:59 -07:00
Ishaan Jaff
64e38562d9 router - use free/paid tier routing 2024-07-18 17:09:42 -07:00
Krrish Dholakia
0a94953896 fix(router.py): check for request_timeout in acompletion
support 'request_timeout' param in router acompletion
2024-07-17 17:19:06 -07:00