Commit graph

11 commits

Author SHA1 Message Date
Ishaan Jaff
4828fb1544 fix code quality 2025-02-18 21:29:23 -08:00
Ishaan Jaff
b88762b63c
(Polish/Fixes) - Fixes for Adding Team Specific Models (#8645)
* refactor get model info for team models

* allow adding a model to a team when creating team specific model

* ui update selected Team on Team Dropdown

* test_team_model_association

* testing for team specific models

* test_get_team_specific_model

* test: skip on internal server error

* remove model alias card on teams page

* linting fix _get_team_specific_model

* fix DeploymentTypedDict

* fix linting error

* fix code quality

* fix model info checks

---------

Co-authored-by: Krrish Dholakia <krrishdholakia@gmail.com>
2025-02-18 21:11:57 -08:00
Ishaan Jaff
8024300825
(UI) Improvements to Add Team Model Flow (#8603)
* ui - use common team dropdown component

* re-use team component

* rename org field on add model

* handle add model submit

* working view model_id and team_id on root models page

* cleaner

* show all fields

* working model info view

* working team info selector

* clean up team id

* new component for model dashboard

* ui show table with dropdown

* make public model names like email

* revert changes to litellm model name

* fix litellm model name

* ui fix public model

* fix mappings

* fix conditional text input

* fix message

* ui fix bulk add models

* _add_team_model_to_db

* move model mgmt helper funcs

* test_add_team_model_to_db

* ui - display model team model name

* fix add model tab

* fix remove redundant info tab on models page

* dont pass model mappings all the way through

* fix jarring model name when adding team models

* fix edit model button

* delete button on model info

* ui fix model dashboard

* fix DeploymentTypedDict

* _is_model_access_group_for_wildcard_route

* test _get_public_model_name

* ui fix viewing public model name

* fix linting error

* fix linting errors

* fix selectedModel logic
2025-02-17 18:37:14 -08:00
Ishaan Jaff
8a235e7d38
(Refactor / QA) - Use LoggingCallbackManager to append callbacks and ensure no duplicate callbacks are added (#8112)
* LoggingCallbackManager

* add logging_callback_manager

* use logging_callback_manager

* add add_litellm_failure_callback

* use add_litellm_callback

* use add_litellm_async_success_callback

* add_litellm_async_failure_callback

* linting fix

* fix logging callback manager

* test_duplicate_multiple_loggers_test

* use _reset_all_callbacks

* fix testing with dup callbacks

* test_basic_image_generation

* reset callbacks for tests

* fix check for _add_custom_logger_to_list

* fix test_amazing_sync_embedding

* fix _get_custom_logger_key

* fix batches testing

* fix _reset_all_callbacks

* fix _check_callback_list_size

* add callback_manager_test

* fix test gemini-2.0-flash-thinking-exp-01-21
2025-01-30 19:35:50 -08:00
Krish Dholakia
539f166166
Support budget/rate limit tiers for keys (#7429)
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key

allows user to create rate limit tiers and associate those to keys

* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set

allows rate limit tiers to be easily applied to keys

* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers

make feature discoverable

* feat(key_management_endpoints.py): return litellm_budget_table value in key generate

make it easy for user to know associated budget on key creation

* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`

* docs(key_management_endpoints.py): document budget_id usage

* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it

* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs

* fix(customer_endpoints.py): use new pydantic obj name

* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm

* Litellm dev 12 26 2024 p2 (#7432)

* (Feat) Add logging for `POST v1/fine_tuning/jobs`  (#7426)

* init commit ft jobs logging

* add ft logging

* add logging for FineTuningJob

* simple FT Job create test

* (docs) - show all supported Azure OpenAI endpoints in overview  (#7428)

* azure batches

* update doc

* docs azure endpoints

* docs endpoints on azure

* docs azure batches api

* docs azure batches api

* fix(key_management_endpoints.py): fix key update to actually work

* test(test_key_management.py): add e2e test asserting ui key update call works

* fix: proxy/_types - fix linting erros

* test: update test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* fix: test

* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers

* fix: fix linting errors

* test: fix test

* fix: remove unused import

* test: update test

* docs(customer_endpoints.py): document new model_max_budget param

* test: specify unique key alias

* docs(budget_management_endpoints.py): document new model_max_budget param

* test: fix test

* test: fix tests

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-12-26 19:05:27 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Ishaan Jaff
6261ec3599
(feat proxy) v2 - model max budgets (#7302)
* clean up unused code

* add _PROXY_VirtualKeyModelMaxBudgetLimiter

* adjust type imports

* working _PROXY_VirtualKeyModelMaxBudgetLimiter

* fix user_api_key_model_max_budget

* fix user_api_key_model_max_budget

* update naming

* update naming

* fix changes to RouterBudgetLimiting

* test_call_with_key_over_model_budget

* test_call_with_key_over_model_budget

* handle _get_request_model_budget_config

* e2e test for test_call_with_key_over_model_budget

* clean up test

* run ci/cd again

* add validate_model_max_budget

* docs fix

* update doc

* add e2e testing for _PROXY_VirtualKeyModelMaxBudgetLimiter

* test_unit_test_max_model_budget_limiter.py
2024-12-18 19:42:46 -08:00
Krish Dholakia
2f08341a08
Litellm dev readd prompt caching (#7299)
* fix(router.py): re-add saving model id on prompt caching valid successful deployment

* fix(router.py): introduce optional pre_call_checks

isolate prompt caching logic in a separate file

* fix(prompt_caching_deployment_check.py): fix import

* fix(router.py): new 'async_filter_deployments' event hook

allows custom logger to filter deployments returned to routing strategy

* feat(prompt_caching_deployment_check.py): initial working commit of prompt caching based routing

* fix(cooldown_callbacks.py): fix linting error

* fix(budget_limiter.py): move budget logger to async_filter_deployment hook

* test: add unit test

* test(test_router_helper_utils.py): add unit testing

* fix(budget_limiter.py): fix linting errors

* docs(config_settings.md): add 'optional_pre_call_checks' to router_settings param docs
2024-12-18 15:13:49 -08:00
Ishaan Jaff
9c30387e66 tag budgets fixes 2024-12-18 10:28:37 -08:00
Ishaan Jaff
7103198805
(feat) Add Tag-based budgets on litellm router / proxy (#7236)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 46s
* add BudgetConfig

* add _get_tags_from_request_kwargs

* test_tag_budgets_e2e_test_expect_to_fail

* add a check for request tags

* fix _async_get_cache_keys_for_router_budget_limiting

* fix test

* fix _sync_in_memory_spend_with_redis

* _async_get_cache_keys_for_router_budget_limiting

* fix _init_tag_budgets

* fix type casting

* docs show error for tag budget limit hit

* fix _get_tags_from_request_kwargs

* fix undo change
2024-12-14 17:28:36 -08:00
Ishaan Jaff
163529b40b
(feat - Router / Proxy ) Allow setting budget limits per LLM deployment (#7220)
* fix test_deployment_budget_limits_e2e_test

* refactor async_log_success_event to track spend for provider + deployment

* fix format

* rename class to RouterBudgetLimiting

* rename func

* rename types used for budgets

* add new types for deployment budgets

* add budget limits for deployments

* fix checking budgets set for provider

* update file names

* fix linting error

* _track_provider_remaining_budget_prometheus

* async_filter_deployments

* fix model list passed to router

* update error

* test_deployment_budgets_e2e_test_expect_to_fail

* fix test case

* run deployment budget limits
2024-12-13 19:15:51 -08:00
Renamed from litellm/router_strategy/provider_budgets.py (Browse further)