40 commits

Author SHA1 Message Date
Ishaan Jaff
dee6de0105 (testing) Router add testing coverage (#6253)
* test: add more router code coverage

* test: additional router testing coverage

* fix: fix linting error

* test: fix tests for ci/cd

* test: fix test

* test: handle flaky tests

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2024-10-16 07:32:27 -07:00
Ishaan Jaff
b3dadc7f83 (router testing) Add testing coverage for run_async_fallback and run_sync_fallback (#6256)
* add type hints for run_async_fallback

* fix async fallback doc string

* test run_async_fallback
2024-10-16 16:16:17 +05:30
Ishaan Jaff
5218a140f0 (testing - litellm.Router ) add unit test coverage for pattern matching / wildcard routing (#6250)
* add testing coverage for pattern match router

* fix add_pattern

* fix typo on router_cooldown_event_callback

* add testing for pattern match router

* fix add explanation for pattern match router
2024-10-16 11:58:05 +05:30
Ishaan Jaff
ece65164fb (refactor router.py ) - PR 3 - Ensure all functions under 100 lines (#6181)
* add flake 8 check

* split up litellm _acompletion

* fix get model client

* refactor use common func to add metadata to kwargs

* use common func to get timeout

* re-use helper to _get_async_model_client

* use _handle_mock_testing_rate_limit_error

* fix docstring for _handle_mock_testing_rate_limit_error

* fix function_with_retries

* use helper for mock testing fallbacks

* router - use 1 func for simple_shuffle

* add doc string for simple_shuffle

* use 1 function for filtering cooldown deployments

* fix use common helper to _get_fallback_model_group_from_fallbacks
2024-10-14 21:27:54 +05:30
Ishaan Jaff
ba56e37244 (refactor) caching use LLMCachingHandler for async_get_cache and set_cache (#6208)
* use folder for caching

* fix importing caching

* fix clickhouse pyright

* fix linting

* fix correctly pass kwargs and args

* fix test case for embedding

* fix linting

* fix embedding caching logic

* fix refactor handle utils.py

* fix test_embedding_caching_azure_individual_items_reordered
2024-10-14 16:34:01 +05:30
Krish Dholakia
17fa7c17ec LiteLLM Minor Fixes & Improvements (10/10/2024) (#6158)
* refactor(vertex_ai_partner_models/anthropic): refactor anthropic to use partner model logic

* fix(vertex_ai/): support passing custom api base to partner models

Fixes https://github.com/BerriAI/litellm/issues/4317

* fix(proxy_server.py): Fix prometheus premium user check logic

* docs(prometheus.md): update quick start docs

* fix(custom_llm.py): support passing dynamic api key + api base

* fix(realtime_api/main.py): Add request/response logging for realtime api endpoints

Closes https://github.com/BerriAI/litellm/issues/6081

* feat(openai/realtime): add openai realtime api logging

Closes https://github.com/BerriAI/litellm/issues/6081

* fix(realtime_streaming.py): fix linting errors

* fix(realtime_streaming.py): fix linting errors

* fix: fix linting errors

* fix pattern match router

* Add literalai in the sidebar observability category (#6163)

* fix: add literalai in the sidebar

* fix: typo

* update (#6160)

* Feat: Add Langtrace integration (#5341)

* Feat: Add Langtrace integration

* add langtrace service name

* fix timestamps for traces

* add tests

* Discard Callback + use existing otel logger

* cleanup

* remove print statements

* remove callback

* add docs

* docs

* add logging docs

* format logging

* remove emoji and add litellm proxy example

* format logging

* format `logging.md`

* add langtrace docs to logging.md

* sync conflict

* docs fix

* (perf) move s3 logging to Batch logging + async [94% faster perf under 100 RPS on 1 litellm instance] (#6165)

* fix move s3 to use customLogger

* add basic s3 logging test

* add s3 to custom logger compatible

* use batch logger for s3

* s3 set flush interval and batch size

* fix s3 logging

* add notes on s3 logging

* fix s3 logging

* add basic s3 logging test

* fix s3 type errors

* add test for sync logging on s3

* fix: fix to debug log

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Willy Douhard <willy.douhard@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
Co-authored-by: Ali Waleed <ali@scale3labs.com>
2024-10-11 23:04:36 -07:00
Ishaan Jaff
4aaffc6276 fix pattern match router 2024-10-11 12:12:57 +05:30
Ishaan Jaff
6b5f19299b (feat) use regex pattern matching for wildcard routing (#6150)
* use pattern matching for llm deployments

* code quality fix

* fix linting

* add types to PatternMatchRouter

* docs add example config for regex patterns
2024-10-10 18:24:16 +05:30
Krish Dholakia
f7e896bdc5 Add pyright to ci/cd + Fix remaining type-checking errors (#6082)
* fix: fix type-checking errors

* fix: fix additional type-checking errors

* fix: additional type-checking error fixes

* fix: fix additional type-checking errors

* fix: additional type-check fixes

* fix: fix all type-checking errors + add pyright to ci/cd

* fix: fix incorrect import

* ci(config.yml): use mypy on ci/cd

* fix: fix type-checking errors in utils.py

* fix: fix all type-checking errors on main.py

* fix: fix mypy linting errors

* fix(anthropic/cost_calculator.py): fix linting errors

* fix: fix mypy linting errors

* fix: fix linting errors
2024-10-05 17:04:00 -04:00
Ishaan Jaff
6138f0c705 fix prometheus track cooldown events on custom logger (#6060) 2024-10-04 16:56:22 +05:30
Krish Dholakia
94a05ca5d0 Litellm ruff linting enforcement (#5992)
* ci(config.yml): add a 'check_code_quality' step

Addresses https://github.com/BerriAI/litellm/issues/5991

* ci(config.yml): check why circle ci doesn't pick up this test

* ci(config.yml): fix to run 'check_code_quality' tests

* fix(__init__.py): fix unprotected import

* fix(__init__.py): don't remove unused imports

* build(ruff.toml): update ruff.toml to ignore unused imports

* fix: fix: ruff + pyright - fix linting + type-checking errors

* fix: fix linting errors

* fix(lago.py): fix module init error

* fix: fix linting errors

* ci(config.yml): cd into correct dir for checks

* fix(proxy_server.py): fix linting error

* fix(utils.py): fix bare except

causes ruff linting errors

* fix: ruff - fix remaining linting errors

* fix(clickhouse.py): use standard logging object

* fix(__init__.py): fix unprotected import

* fix: ruff - fix linting errors

* fix: fix linting errors

* ci(config.yml): cleanup code qa step (formatting handled in local_testing)

* fix(_health_endpoints.py): fix ruff linting errors

* ci(config.yml): just use ruff in check_code_quality pipeline for now

* build(custom_guardrail.py): include missing file

* style(embedding_handler.py): fix ruff check
2024-10-01 19:44:20 -04:00
Ishaan Jaff
97aeacc1fa (feat proxy prometheus) track virtual key, key alias, error code, error code class on prometheus (#5968)
* track api key and team in prom latency metric

* add test for latency metric

* test prometheus success metrics for latency

* track team and key labels for deployment failures

* add test for litellm_deployment_failure_responses_total

* fix checks for premium user on prometheus

* log_success_fallback_event and log_failure_fallback_event

* log original_exception in log_success_fallback_event

* track key, team and exception status and class on fallback metrics

* use get_standard_logging_metadata

* fix import error

* track litellm_deployment_successful_fallbacks

* add test test_proxy_fallback_metrics

* add log log_success_fallback_event

* fix test prometheus
2024-09-28 19:00:21 -07:00
Ishaan Jaff
6368c0925e [Feat-Prometheus] Track exception status on litellm_deployment_failure_responses (#5706)
* add litellm_deployment_cooled_down

* track num cooldowns on prometheus

* track exception status

* fix linting

* docs prom metrics

* cleanup premium user checks

* prom track deployment failure state

* docs prometheus
2024-09-14 18:44:31 -07:00
Ishaan Jaff
8f155327f6 [Fix] Router cooldown logic - use % thresholds instead of allowed fails to cooldown deployments (#5698)
* move cooldown logic to its own helper

* add new track deployment metrics folder

* increment success, fails for deployment in current minute

* fix cooldown logic

* fix test_aaarouter_dynamic_cooldown_message_retry_time

* fix test_single_deployment_no_cooldowns_test_prod_mock_completion_calls

* clean up get from deployment test

* fix _async_get_healthy_deployments

* add mock InternalServerError

* test deployment failing 25% requests

* add test_high_traffic_cooldowns_one_bad_deployment

* fix vertex load test

* add test for rate limit error models in cool down

* change default cooldown time

* fix cooldown message time

* fix cooldown on 429 error

* fix doc string for _should_cooldown_deployment

* fix sync cooldown logic router
2024-09-14 18:01:19 -07:00
Ishaan Jaff
6f68e860e0 fix import error 2024-09-05 10:09:44 -07:00
Ishaan Jaff
b5d1d93c14 refactor secret managers 2024-09-03 10:58:02 -07:00
Krish Dholakia
18da7adce9 feat(router.py): Support Loadbalancing batch azure api endpoints (#5469)
* feat(router.py): initial commit for loadbalancing azure batch api endpoints

Closes https://github.com/BerriAI/litellm/issues/5396

* fix(router.py): working `router.acreate_file()`

* feat(router.py): working router.acreate_batch endpoint

* feat(router.py): expose router.aretrieve_batch function

Make it easy for user to retrieve the batch information

* feat(router.py): support 'router.alist_batches' endpoint

Adds support for getting all batches across all endpoints

* feat(router.py): working loadbalancing on `/v1/files`

* feat(proxy_server.py): working loadbalancing on `/v1/batches`

* feat(proxy_server.py): working loadbalancing on Retrieve + List batch
2024-09-02 21:32:55 -07:00
Krish Dholakia
3fbb4f8fac Azure Service Principal with Secret authentication workflow. (#5131) (#5437)
* Azure Service Principal with Secret authentication workflow. (#5131)

* Implement Azure Service Principal with Secret authentication workflow.

* Use `ClientSecretCredential` instead of `DefaultAzureCredential`.

* Move imports into the function.

* Add type hint for `azure_ad_token_provider`.

* Add unit test for router initialization and sample completion using Azure Service Principal with Secret authentication workflow.

* Add unit test for router initialization with neither API key nor using Azure Service Principal with Secret authentication workflow.

* fix(client_initializtion_utils.py): fix typing + overrides

* test: fix linting errors

* fix(client_initialization_utils.py): fix client init azure ad token logic

* fix(router_client_initialization.py): add flag check for reading azure ad token from environment

* test(test_streaming.py): skip end of life bedrock model

* test(test_router_client_init.py): add correct flag to test

---------

Co-authored-by: kzych-inpost <142029278+kzych-inpost@users.noreply.github.com>
2024-09-02 14:29:00 -07:00
Krrish Dholakia
7243e6c8ce fix(cooldown_cache.py): fix linting errors 2024-08-27 07:40:28 -07:00
Krrish Dholakia
4918082cd7 fix(cooldown_cache.py): fix linting errors 2024-08-24 17:11:32 -07:00
Krrish Dholakia
c795e9feeb fix(router.py): enable dynamic retry after in exception string
Updates cooldown logic to cooldown individual models

Closes https://github.com/BerriAI/litellm/issues/1339
2024-08-24 16:59:30 -07:00
Ishaan Jaff
2203dad29e fix azure_ad_token_provider 2024-08-22 16:15:53 -07:00
Ishaan Jaff
b16752f0bc add new litellm params for client_id, tenant_id etc 2024-08-22 11:37:30 -07:00
Ishaan Jaff
525d152d85 use azure_ad_token_provider to init clients 2024-08-22 11:03:49 -07:00
Ishaan Jaff
165e0e3ad1 fix run sync fallbacks 2024-08-20 12:55:36 -07:00
Ishaan Jaff
078fe97053 fix fallbacks don't recurse on the same fallback 2024-08-20 12:50:20 -07:00
Marc Abramowitz
4e2e8101c6 Use AZURE_API_VERSION as default azure openai version
Without this change, the default version of the Azure OpenAI API is hardcoded in
the code as an old version, `"2024-02-01"`. This change allows the user to set
the default version of the Azure OpenAI API by setting the environment variable
`AZURE_API_VERSION` or by using the command-line parameter `--api_version`.
2024-08-14 15:47:57 -07:00
Ishaan Jaff
ada3ae670b feat - log fallbacks events on prometheus 2024-08-10 13:57:25 -07:00
Ishaan Jaff
5ccc358c2a v0 add event handlers for logging fallback events 2024-08-10 13:28:08 -07:00
Ishaan Jaff
408d17dfee refactor prom metrics 2024-08-09 09:02:23 -07:00
Ishaan Jaff
27e8a89077 fix logging cool down deployment 2024-08-07 11:27:05 -07:00
Ishaan Jaff
0dd8f50477 use router_cooldown_handler 2024-08-07 10:40:55 -07:00
Ishaan Jaff
037f737aa1 Revert "[Ui] add together AI, Mistral, PerplexityAI, OpenRouter models on Admin UI " 2024-07-20 19:04:22 -07:00
Ishaan Jaff
368e0109a4 router fix init openai compatible providers 2024-07-19 19:42:04 -07:00
Krrish Dholakia
8275e29030 fix(client_initialization_utils.py): fix import logic 2024-07-06 19:28:38 -07:00
Krrish Dholakia
71ea98d5ea fix(client_initialization_utils.py): fix merge conflicts 2024-07-06 19:20:28 -07:00
Ishaan Jaff
37312f08f2 fix should_initialize_sync_client 2024-07-06 13:10:22 -07:00
Ishaan Jaff
f6eccf84ce use helper for init client + check if we should init sync clients 2024-07-06 12:52:41 -07:00
Ishaan Jaff
fcbd496ed1 fix use safe access for router alerting 2024-06-14 15:17:32 -07:00
Ishaan Jaff
bd341c69b5 fix - send alert on router level exceptions 2024-06-14 08:41:12 -07:00