6 commits

Author SHA1 Message Date
Ishaan Jaff
bc46916bb3 (feat - Router / Proxy ) Allow setting budget limits per LLM deployment (#7220)
* fix test_deployment_budget_limits_e2e_test

* refactor async_log_success_event to track spend for provider + deployment

* fix format

* rename class to RouterBudgetLimiting

* rename func

* rename types used for budgets

* add new types for deployment budgets

* add budget limits for deployments

* fix checking budgets set for provider

* update file names

* fix linting error

* _track_provider_remaining_budget_prometheus

* async_filter_deployments

* fix model list passed to router

* update error

* test_deployment_budgets_e2e_test_expect_to_fail

* fix test case

* run deployment budget limits
2024-12-13 19:15:51 -08:00
Ishaan Jaff
fc7a9830ab Provider Budget Routing - Get Budget, Spend Details (#7063)
* add async_get_ttl to dual cache

* add ProviderBudgetResponse

* add provider_budgets

* test_redis_get_ttl

* _init_or_get_provider_budget_in_cache

* test_init_or_get_provider_budget_in_cache

* use _init_provider_budget_in_cache

* test_get_current_provider_budget_reset_at

* doc Get Budget, Spend Details

* doc Provider Budget Routing
2024-12-06 21:14:12 -08:00
Ishaan Jaff
e47ebefced (feat) - provider budget improvements - ensure provider budgets work with multiple proxy instances + improve latency to ~90ms (#6886)
* use 1 file for duration_in_seconds

* add to readme.md

* re use duration_in_seconds

* fix importing _extract_from_regex, get_last_day_of_month

* fix import

* update provider budget routing

* fix - remove dup test

* add support for using in multi instance environments

* test_in_memory_redis_sync_e2e

* test_in_memory_redis_sync_e2e

* fix test_in_memory_redis_sync_e2e

* fix code quality check

* fix test provider budgets

* working provider budget tests

* add fixture for provider budget routing

* fix router testing for provider budgets

* add comments on provider budget routing

* use RedisPipelineIncrementOperation

* add redis async_increment_pipeline

* use redis async_increment_pipeline

* use lower value for testing

* use redis async_increment_pipeline

* use consistent key name for increment op

* add handling for budget windows

* fix typing async_increment_pipeline

* fix set attr

* add clear doc strings

* unit testing for provider budgets

* test_redis_increment_pipeline
2024-11-24 16:36:19 -08:00
Ishaan Jaff
72afed5b7e (QOL improvement) Provider budget routing - allow using 1s, 1d, 1mo, 2mo etc (#6885)
* use 1 file for duration_in_seconds

* add to readme.md

* re use duration_in_seconds

* fix importing _extract_from_regex, get_last_day_of_month

* fix import

* update provider budget routing

* fix - remove dup test
2024-11-23 16:59:46 -08:00
Ishaan Jaff
64b46e32cf (feat) provider budget routing improvements (#6827)
* minor fix for provider budget

* fix raise good error message when budget crossed for provider budget

* fix test provider budgets

* test provider budgets

* feat - emit llm provider spend on prometheus

* test_prometheus_metric_tracking

* doc provider budgets
2024-11-19 21:25:08 -08:00
Ishaan Jaff
ce6465c9df (Feat) Add provider specific budget routing (#6817)
* add ProviderBudgetConfig

* working test_provider_budgets_e2e_test

* test_provider_budgets_e2e_test_expect_to_fail

* use 1 cache read for getting provider spend

* test_provider_budgets_e2e_test

* add doc on provider budgets

* clean up provider budgets

* unit testing for provider budget routing

* use as flag, not routing strat

* fix init provider budget routing

* use async_filter_deployments

* fix test provider budgets

* doc provider budget routing

* doc provider budget routing

* fix docs changes

* fix comment
2024-11-19 20:25:27 -08:00