Commit graph

1589 commits

Author SHA1 Message Date
Krish Dholakia
fef7839e8a
Litellm dev 01 06 2025 p1 (#7594)
* fix(custom_logger.py): expose new 'async_get_chat_completion_prompt' event hook

* fix(custom_logger.py): langfuse_prompt_management.py

remove 'headers' from custom logger 'async_get_chat_completion_prompt' and 'get_chat_completion_prompt' event hooks

* feat(router.py): expose new function for prompt management based routing

* feat(router.py): partial working router prompt factory logic

allows a load-balanced model to be used as the model name with a langfuse prompt management call

* feat(router.py): fix prompt management with load balanced model group

* feat(langfuse_prompt_management.py): support reading in openai params from langfuse

enables user to define optional params on langfuse vs. client code

* test(test_Router.py): add unit test for router based langfuse prompt management

* fix: fix linting errors
2025-01-06 21:26:21 -08:00
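
A minimal sketch of the event-hook pattern this PR describes: a callback that can rewrite the prompt before the LLM call. The class name and signature below are inferred from the commit bullets and are illustrative, not LiteLLM's actual API.

```python
# Illustrative prompt-management hook; names inferred from the commit
# bullets, not LiteLLM's actual API.
from typing import Any, Dict, List, Tuple


class PromptManagementLogger:
    """Callbacks may fetch/render a managed prompt before the LLM call."""

    async def async_get_chat_completion_prompt(
        self,
        model: str,
        messages: List[Dict[str, Any]],
        non_default_params: Dict[str, Any],
        prompt_id: str,
        prompt_variables: Dict[str, Any],
    ) -> Tuple[str, List[Dict[str, Any]], Dict[str, Any]]:
        # Default: pass-through. A Langfuse-style integration would look up
        # `prompt_id`, render it with `prompt_variables`, and could also
        # return optional params defined on the prompt (per this PR).
        return model, messages, non_default_params
```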
Krish Dholakia
0c3fef24cd
Litellm dev 01 06 2025 p2 (#7597)
* test(test_amazing_vertex_completion.py): fix test

* test: initial working code gecko test

* fix(vertex_ai_non_gemini.py): support vertex ai code gecko fake streaming

Fixes https://github.com/BerriAI/litellm/issues/7360

* test(test_get_model_info.py): add test for getting custom provider model info

Covers https://github.com/BerriAI/litellm/issues/7575

* fix(utils.py): fix get_provider_model_info check

Handle custom llm provider scenario

Fixes https://github.com/BerriAI/litellm/issues/7575
2025-01-06 21:04:49 -08:00
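
"Fake streaming" here means requesting the complete response and re-emitting it as chunks, for models (like code gecko) whose API cannot stream. A provider-agnostic sketch of the idea, not the vertex_ai_non_gemini.py code:

```python
from typing import Iterator


def fake_stream(full_text: str, chunk_size: int = 20) -> Iterator[str]:
    """Re-emit a fully materialized completion as stream-style chunks,
    for models whose API cannot stream natively."""
    for i in range(0, len(full_text), chunk_size):
        yield full_text[i : i + chunk_size]


for chunk in fake_stream("def add(a, b):\n    return a + b\n"):
    print(chunk, end="")
```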
Krish Dholakia
b397dc1497
Litellm dev 01 06 2025 p3 (#7596)
* build(model_prices_and_context_window.json): add gemini-1.5-pro 'supports_vision' = true

Fixes https://github.com/BerriAI/litellm/issues/7592

* build(model_prices_and_context_window.json): add new mistral models pricing + model info
2025-01-06 20:44:04 -08:00
Ishaan Jaff
0b5c1392f7
fix _return_user_api_key_auth_obj (#7591) 2025-01-06 16:43:14 -08:00
Krrish Dholakia
23685e93f3 test: skip tests pending vertex credentials
2025-01-05 15:29:51 -08:00
Ishaan Jaff
a40baec5ed use latest bucket for testing 2025-01-05 14:55:48 -08:00
Krrish Dholakia
8ae2ca4ed9 test: fix test 2025-01-05 14:52:38 -08:00
Krrish Dholakia
8bda3006fa fix: test 2025-01-05 14:37:17 -08:00
Krrish Dholakia
32538f09fc test: cleanup test 2025-01-05 14:18:29 -08:00
Ishaan Jaff
3110bb0723 use pathrise-convert-1606954137718 2025-01-05 14:14:43 -08:00
Ishaan Jaff
616211daee ci/cd run again 2025-01-05 14:11:27 -08:00
Ishaan Jaff
137879ffea vertex testing use pathrise-convert-1606954137718 2025-01-05 14:00:17 -08:00
Krrish Dholakia
c0e4485fe0 test: update test amazing vertex
2025-01-05 13:56:31 -08:00
Ishaan Jaff
ef8812d150 ci/cd update vertex acct 2025-01-05 13:43:32 -08:00
Krish Dholakia
ce97e7e054
fix(groq/chat/transformation.py): fix groq response_format transformation (#7565)
Fixes https://github.com/BerriAI/litellm/issues/4804
2025-01-04 19:39:04 -08:00
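
One plausible shape for a response_format transformation like this: when a provider lacks native json_schema support, inject the schema into the prompt and downgrade the type to json_object. A hedged sketch of the general technique, not necessarily what groq/chat/transformation.py actually does:

```python
import json
from typing import Any, Dict, List, Tuple


def downgrade_json_schema(
    messages: List[Dict[str, Any]], response_format: Dict[str, Any]
) -> Tuple[List[Dict[str, Any]], Dict[str, Any]]:
    """If the provider lacks native json_schema support, move the schema
    into an instruction and request plain JSON mode instead."""
    if response_format.get("type") == "json_schema":
        schema = response_format["json_schema"]["schema"]
        messages = messages + [{
            "role": "system",
            "content": "Respond only with JSON matching this schema:\n"
            + json.dumps(schema),
        }]
        response_format = {"type": "json_object"}
    return messages, response_format
```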
Ishaan Jaff
46d9d29bff
(Feat) Hashicorp Secret Manager - Allow storing virtual keys in secret manager (#7549)
* use a base abstract class

* async_write_secret for hcorp

* fix hcorp

* async_write_secret for hashicorp secret manager

* store virtual keys in hcorp

* add delete secret

* test_hashicorp_secret_manager_write_secret

* test_hashicorp_secret_manager_delete_secret

* docs Supported Secret Managers

* docs storing keys in hcorp

* docs hcorp

* docs secret managers

* test_key_generate_with_secret_manager_call

* fix unused imports
2025-01-04 11:35:59 -08:00
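
The "base abstract class" bullet suggests a common secret-manager interface that any backend (Hashicorp, AWS, etc.) can implement. A sketch using the method names from the commit bullets (the real interface may differ):

```python
from abc import ABC, abstractmethod
from typing import Optional


class BaseSecretManager(ABC):
    """One interface so virtual keys can be stored in any backend."""

    @abstractmethod
    async def async_read_secret(self, secret_name: str) -> Optional[str]:
        ...

    @abstractmethod
    async def async_write_secret(self, secret_name: str, secret_value: str) -> dict:
        ...

    @abstractmethod
    async def async_delete_secret(self, secret_name: str) -> dict:
        ...
```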
Ishaan Jaff
d1b101b9d7
(Fix) - Slack Alerting, don't send duplicate spend report when used in multi-instance settings (#7546)
* fix send_weekly_spend_report

* test_spend_report_cache
2025-01-04 10:54:35 -08:00
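
A common way to stop duplicate scheduled reports on multi-instance deployments is an atomic set-if-not-exists flag in a shared cache, so exactly one instance claims the send. A Redis-based sketch of that idea; the actual fix may use litellm's internal cache instead:

```python
import redis  # cache shared by all proxy instances

r = redis.Redis(host="localhost", port=6379)


def should_send_weekly_report(report_key: str) -> bool:
    """Atomically claim the report: SET NX succeeds for exactly one
    instance, so the Slack spend report goes out once, not N times."""
    return bool(r.set(report_key, "sent", nx=True, ex=7 * 24 * 3600))


if should_send_weekly_report("spend_report:2025-W01"):
    print("this instance sends the report")
```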
Krish Dholakia
d43d83f9ef
feat(router.py): support request prioritization for text completion c… (#7540)
* feat(router.py): support request prioritization for text completion calls

* fix(internal_user_endpoints.py): fix sql query to return all keys, including null team id keys on `/user/info`

Fixes https://github.com/BerriAI/litellm/issues/7485

* fix: fix linting errors

* fix: fix linting error

* test(test_router_helper_utils.py): add direct test for '_schedule_factory'

Fixes code qa test
2025-01-03 19:35:44 -08:00
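
Request prioritization generally reduces to a priority queue in front of the deployment: lower numbers drain first. A toy asyncio sketch of the scheduling idea (the router's actual '_schedule_factory' is more involved):

```python
import asyncio
import itertools

_tie = itertools.count()  # keeps equal priorities FIFO


async def worker(queue: "asyncio.PriorityQueue") -> None:
    while True:
        priority, _, prompt = await queue.get()
        print(f"processing (priority={priority}): {prompt}")
        queue.task_done()


async def main() -> None:
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    asyncio.create_task(worker(queue))
    # lower number = higher priority, matching litellm's `priority` param
    await queue.put((1, next(_tie), "low-priority text completion"))
    await queue.put((0, next(_tie), "high-priority text completion"))
    await queue.join()


asyncio.run(main())
```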
Krish Dholakia
f770dd0c95
Support checking provider-specific /models endpoints for available models based on key (#7538)
* test(test_utils.py): initial test for valid models

Addresses https://github.com/BerriAI/litellm/issues/7525

* fix: test

* feat(fireworks_ai/transformation.py): support retrieving valid models from fireworks ai endpoint

* refactor(fireworks_ai/): support checking model info on `/v1/models` route

* docs(set_keys.md): update docs to clarify check llm provider api usage

* fix(watsonx/common_utils.py): support 'WATSONX_ZENAPIKEY' for iam auth

* fix(watsonx): read in watsonx token from env var

* fix: fix linting errors

* fix(utils.py): fix provider config check

* style: cleanup unused imports
2025-01-03 19:29:59 -08:00
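
Checking a provider's /models endpoint with the user's key returns the models that key can actually access, rather than a static list. A sketch using the OpenAI-compatible route convention (the exact endpoint path is an assumption for any given provider):

```python
import httpx


def get_valid_models(api_base: str, api_key: str) -> list:
    """Ask the provider which models this key can access, instead of
    trusting a static list."""
    resp = httpx.get(
        f"{api_base}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    resp.raise_for_status()
    return [m["id"] for m in resp.json().get("data", [])]


# e.g. get_valid_models("https://api.fireworks.ai/inference", "fw_...")
```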
Ishaan Jaff
716efd5fad
(fix proxy perf) use _read_request_body instead of ast.literal_eval to get better performance (#7545)
* fix ast literal eval

* run ci/cd again
2025-01-03 17:48:32 -08:00
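
The perf rationale: `ast.literal_eval` re-parses the decoded body as a Python literal, which is slower and stricter than decoding the raw bytes as JSON once. A minimal illustration of the swap:

```python
import ast
import json

raw_body = b'{"model": "gpt-4o", "prompt": "hi"}'

# slower: decode to str, then parse as a Python literal
data_slow = ast.literal_eval(raw_body.decode("utf-8"))

# faster: parse the raw bytes as JSON once
data_fast = json.loads(raw_body)

assert data_slow == data_fast
```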
Ishaan Jaff
1bb4941036
[Feature]: - allow print alert log to console (#7534)
* update send_to_webhook

* test_print_alerting_payload_warning

* add alerting_args spec

* test_alerting.py
2025-01-03 17:48:13 -08:00
Ishaan Jaff
23104d9a14 test_aiohttp_openai 2025-01-03 15:12:56 -08:00
Krish Dholakia
33f301ec86
Litellm dev 01 02 2025 p1 (#7516)
* fix(redact_messages.py): fix redact messages for non-model response input to be dictionary

fixes issue with otel logging when message redaction is enabled

* fix(proxy_server.py): fix langfuse key leak in exception string

* test: fix test

* test: fix test

* test: fix tests
2025-01-03 14:40:57 -08:00
Ishaan Jaff
fb59f20979
(Feat) - Hashicorp secret manager, use TLS cert authentication (#7532)
* fix - don't print hcorp secrets in debug logs

* hcorp - tls auth fixes

* fix tls_ca_cert_path

* test_hashicorp_secret_manager_tls_cert_auth

* hcp secret docs
2025-01-03 14:23:53 -08:00
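
Vault's TLS certificate auth method exchanges a client certificate for a token at /v1/auth/cert/login. A sketch with httpx showing where the commit's tls_ca_cert_path and cert/key paths plug in (paths and address are placeholders):

```python
import httpx

VAULT_ADDR = "https://vault.example.com:8200"  # placeholder address

# Present a client certificate instead of a static token; Vault's TLS
# cert auth method maps the cert to a policy and returns a token.
client = httpx.Client(
    cert=("client.crt", "client.key"),  # tls cert/key paths
    verify="ca.pem",                    # the commit's tls_ca_cert_path
)
resp = client.post(f"{VAULT_ADDR}/v1/auth/cert/login")
resp.raise_for_status()
vault_token = resp.json()["auth"]["client_token"]
```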
Ishaan Jaff
9fef0a6d16
(fix) GCS bucket logger - apply truncate_standard_logging_payload_content to standard_logging_payload and ensure GCS flushes queue on fails (#7519)
* fix async_send_batch for gcs

* fix truncate GCS logger

* test_truncate_standard_logging_payload
2025-01-03 08:09:03 -08:00
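
The truncation helper's job is to cap oversized string fields so the log sink doesn't reject or silently drop the payload. A simplified sketch; the real helper presumably recurses into messages and responses, and the size limit here is a made-up number:

```python
MAX_STR_LEN = 10_000  # made-up limit for illustration


def truncate_payload_content(payload: dict) -> dict:
    """Cap oversized string fields so the log sink (GCS, Datadog)
    doesn't reject the payload."""
    for key, value in payload.items():
        if isinstance(value, str) and len(value) > MAX_STR_LEN:
            payload[key] = value[:MAX_STR_LEN] + "...truncated"
    return payload
```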
Ishaan Jaff
d861aa8ff3
(perf) use aiohttp for custom_openai (#7514)
* use aiohttp handler

* BaseLLMAIOHTTPHandler

* use CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* CustomOpenAIChatConfig

* fix linting

* AiohttpOpenAIChatConfig

* fix order

* aiohttp_openai
2025-01-02 22:15:17 -08:00
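
The gist of the perf change: issue the OpenAI-style POST through an aiohttp session, which pools connections and handles responses faster under load. A minimal sketch, not the BaseLLMAIOHTTPHandler itself:

```python
import aiohttp


async def chat_completion(api_base: str, api_key: str, payload: dict) -> dict:
    """OpenAI-style chat completion over an aiohttp session."""
    async with aiohttp.ClientSession() as session:
        async with session.post(
            f"{api_base}/chat/completions",
            json=payload,
            headers={"Authorization": f"Bearer {api_key}"},
        ) as resp:
            resp.raise_for_status()
            return await resp.json()
```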
Ishaan Jaff
4d93fe787b
Revert "(fix) GCS bucket logger - apply `truncate_standard_logging_payload_co…" (#7515)
This reverts commit 26a37c50c9.
2025-01-02 22:01:02 -08:00
Krish Dholakia
45b93f2721
Litellm dev 01 01 2025 p3 (#7503)
* fix(utils.py): add new validate tool choice helper function

Prevents https://github.com/BerriAI/litellm/issues/7483

* fix(main.py): add tool choice validation on .completion()

prevents user error like - https://github.com/BerriAI/litellm/issues/7483

* fix(utils.py): fix return val of tool choice validation logic
2025-01-01 22:12:15 -08:00
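
The validation helper guards a common user error: passing a tool_choice that names a function absent from tools. A sketch of that check (the real helper in utils.py may validate more):

```python
from typing import List, Optional, Union


def validate_tool_choice(
    tool_choice: Union[str, dict, None], tools: Optional[List[dict]]
) -> Union[str, dict, None]:
    """Fail fast when tool_choice names a function missing from `tools`."""
    if isinstance(tool_choice, dict):
        wanted = tool_choice.get("function", {}).get("name")
        available = {t["function"]["name"] for t in (tools or [])}
        if wanted not in available:
            raise ValueError(
                f"tool_choice '{wanted}' not found in tools {sorted(available)}"
            )
    return tool_choice
```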
Ishaan Jaff
f96f54a0f5 test_aview_spend_per_user 2025-01-01 21:48:35 -08:00
Ishaan Jaff
8c29489c40 test_aadmin_only_routes 2025-01-01 21:48:20 -08:00
Ishaan Jaff
26a37c50c9
(fix) GCS bucket logger - apply truncate_standard_logging_payload_content to standard_logging_payload and ensure GCS flushes queue on fails (#7500)
* use truncate_standard_logging_payload_content

* update truncate_standard_logging_payload_content

* update dd logger

* update gcs async_send_batch

* fix code check

* test_datadog_payload_content_truncation

* fix code quality
2025-01-01 20:21:01 -08:00
Krish Dholakia
07fc394072
Litellm dev 01 01 2025 p1 (#7498)
* refactor(prometheus.py): refactor to remove `_tag` metrics and incorporate in regular metrics

* fix(prometheus.py): handle label values not set in enum values

* feat(prometheus.py): working e2e custom metadata labels

* docs(prometheus.md): update docs to clarify how custom metrics would work

* test(test_prometheus_unit_tests.py): fix test

* test: add unit testing
2025-01-01 18:59:28 -08:00
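
Custom metadata labels land as ordinary Prometheus label names on the existing metrics. A sketch with prometheus_client; the metric and tag names here are illustrative:

```python
from prometheus_client import Counter

CUSTOM_TAGS = ["metadata_environment"]  # illustrative configured tag

litellm_requests = Counter(
    "litellm_requests_metric",  # illustrative metric name
    "LLM requests, with custom metadata labels",
    ["model", "team"] + CUSTOM_TAGS,
)

litellm_requests.labels(
    model="gpt-4o", team="search", metadata_environment="prod"
).inc()
```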
Krish Dholakia
0120176541
Litellm dev 12 30 2024 p2 (#7495)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic already implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* fix(types/utils.py): handle none logprobs

Fixes https://github.com/BerriAI/litellm/issues/328

* fix(exception_mapping_utils.py): fix error str unbound error

* refactor(azure_ai/): move to openai_like chat completion handler

allows for easy swapping of api base url's (e.g. ai.services.com)

Fixes https://github.com/BerriAI/litellm/issues/7275

* refactor(azure_ai/): move to base llm http handler

* fix(azure_ai/): handle differing api endpoints

* fix(azure_ai/): make sure all unit tests are passing

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting error

* fix: fix linting errors

* fix(azure_ai/transformation.py): handle extra body param

* fix(azure_ai/transformation.py): fix max retries param handling

* fix: fix test

* test(test_azure_o1.py): fix test

* fix(llm_http_handler.py): support handling azure ai unprocessable entity error

* fix(llm_http_handler.py): handle sync invalid param error for azure ai

* fix(azure_ai/): streaming support with base_llm_http_handler

* fix(llm_http_handler.py): working sync stream calls with unprocessable entity handling for azure ai

* fix: fix linting errors

* fix(llm_http_handler.py): fix linting error

* fix(azure_ai/): handle cohere tool call invalid index param error
2025-01-01 18:57:29 -08:00
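
The 'supports_native_streaming' toggle decides between real and fake streaming per model. A sketch of that decision; the field name comes from the commit, while the default behavior is an assumption:

```python
def should_fake_stream(model_info: dict, stream_requested: bool) -> bool:
    """Fake-stream only when a stream is requested and the model can't
    stream natively; flipping 'supports_native_streaming' in model info
    turns fake streaming off without a library version bump."""
    if not stream_requested:
        return False
    return not model_info.get("supports_native_streaming", True)


# Azure o1 today: no native streaming, so chunk a complete response.
assert should_fake_stream({"supports_native_streaming": False}, True)
```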
Ishaan Jaff
cf60444916
(Feat) Add support for reading secrets from Hashicorp vault (#7497)
* HashicorpSecretManager

* test_hashicorp_secret_managerv

* use 1 helper initialize_secret_manager

* add HASHICORP_VAULT

* working config

* hcorp read_secret

* HashicorpSecretManager

* add secret_manager_testing

* use 1 folder for secret manager testing

* test_hashicorp_secret_manager_get_secret

* HashicorpSecretManager

* docs HCP secrets

* update folder name

* docs hcorp secret manager

* remove unused imports

* add conftest.py

* fix tests

* docs document env vars
2025-01-01 18:35:05 -08:00
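
Reading a secret from Vault's KV v2 engine is a GET against /v1/secret/data/&lt;name&gt; with the vault token in the X-Vault-Token header. A sketch; the mount path and the inner 'value' key are assumptions:

```python
import os

import httpx

VAULT_ADDR = os.environ.get("VAULT_ADDR", "https://vault.example.com:8200")
VAULT_TOKEN = os.environ["VAULT_TOKEN"]


def read_secret(secret_name: str) -> str:
    """Read from Vault's KV v2 engine mounted at 'secret/'."""
    resp = httpx.get(
        f"{VAULT_ADDR}/v1/secret/data/{secret_name}",
        headers={"X-Vault-Token": VAULT_TOKEN},
    )
    resp.raise_for_status()
    return resp.json()["data"]["data"]["value"]  # 'value' key is assumed
```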
Ishaan Jaff
38bfefa6ef
(Feat) - LiteLLM Use UsernamePasswordCredential for Azure OpenAI (#7496)
* add get_azure_ad_token_from_username_password

* docs azure use username / password for auth

* update doc

* get_azure_ad_token_from_username_password

* test test_get_azure_ad_token_from_username_password
2025-01-01 14:11:27 -08:00
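
azure-identity's UsernamePasswordCredential exchanges a username/password for an AAD bearer token that Azure OpenAI accepts in place of an api_key. A minimal sketch:

```python
from azure.identity import UsernamePasswordCredential

credential = UsernamePasswordCredential(
    client_id="...",  # AZURE_CLIENT_ID
    username="...",   # AZURE_USERNAME
    password="...",   # AZURE_PASSWORD
    tenant_id="...",  # AZURE_TENANT_ID
)

# Azure OpenAI accepts this bearer token in place of an api_key.
token = credential.get_token("https://cognitiveservices.azure.com/.default")
```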
Ishaan Jaff
2979b8301c
(feat) POST /fine_tuning/jobs support passing vertex specific hyper params (#7490)
* update convert_openai_request_to_vertex

* test_create_vertex_fine_tune_jobs_mocked

* fix order of methods

* update LiteLLMFineTuningJobCreate

* update OpenAIFineTuningHyperparameters

* update vertex hyper params in response

* _transform_openai_hyperparameters_to_vertex_hyperparameters

* supervised_tuning_spec["hyperParameters"] fix

* fix mapping for ft params testing

* docs fine tuning apis

* fix test_convert_basic_openai_request_to_vertex_request

* update hyperparams for create fine tuning

* fix linting

* test_create_vertex_fine_tune_jobs_mocked_with_hyperparameters

* run ci/cd again

* test_convert_basic_openai_request_to_vertex_request
2025-01-01 07:44:48 -08:00
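
The hyperparameter transform is essentially a field-name mapping from OpenAI's fine-tuning spec onto Vertex's supervisedTuningSpec hyperParameters, with vertex-only extras passed through. A sketch; the mapping table is a best guess from the two public APIs:

```python
def openai_to_vertex_hyperparameters(hyperparameters: dict) -> dict:
    """Rename OpenAI fine-tuning hyperparameters to Vertex's
    supervisedTuningSpec.hyperParameters fields."""
    mapping = {
        "n_epochs": "epochCount",
        "learning_rate_multiplier": "learningRateMultiplier",
        "adapter_size": "adapterSize",  # vertex-specific passthrough
    }
    return {
        mapping[k]: v
        for k, v in hyperparameters.items()
        if k in mapping and v is not None
    }


supervised_tuning_spec = {
    "hyperParameters": openai_to_vertex_hyperparameters(
        {"n_epochs": 3, "learning_rate_multiplier": 0.5}
    )
}
```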
Krish Dholakia
d984a9281a
Prometheus - custom metrics support + other improvements (#7489)
* fix(prometheus.py): refactor litellm_input_tokens_metric to use label factory

makes adding new metrics easier

* feat(prometheus.py): add 'request_model' to 'litellm_input_tokens_metric'

* refactor(prometheus.py): refactor 'litellm_output_tokens_metric' to use label factory

makes adding new metrics easier

* feat(prometheus.py): emit requested model in 'litellm_output_tokens_metric'

* feat(prometheus.py): support tracking success events with custom metrics

* refactor(prometheus.py): refactor '_set_latency_metrics' to just use the initially created enum values dictionary

reduces scope for missing values

* feat(prometheus.py): refactor all tags to support custom metadata tags

enables metadata tags to be used across for e2e tracking

* fix(prometheus.py): fix requested model on success event enum_values

* test: fix test

* test: fix test

* test: handle filenotfound error

* docs(prometheus.md): add new values to prometheus

* docs(prometheus.md): document adding custom metrics on prometheus

* bump: version 1.56.5 → 1.56.6
2025-01-01 07:41:50 -08:00
Ishaan Jaff
03b1db5a7d
(Feat) - Add PagerDuty Alerting Integration (#7478)
* define basic types

* fix verbose_logger.exception statement

* fix basic alerting

* test pager duty alerting

* test_pagerduty_alerting_high_failure_rate

* PagerDutyAlerting

* async_log_failure_event

* use pre_call_hook

* add _request_is_completed helper util

* update AlertingConfig

* rename PagerDutyInternalEvent

* _send_alert_if_thresholds_crossed

* use pagerduty as _custom_logger_compatible_callbacks_literal

* fix slack alerting imports

* fix imports in slack alerting

* PagerDutyAlerting

* fix _load_alerting_settings

* test_pagerduty_hanging_request_alerting

* working pager duty alerting

* fix linting

* doc pager duty alerting

* update hanging_response_handler

* fix import location

* update failure_threshold

* update async_pre_call_hook

* docs pagerduty

* test - callback_class_str_to_classType

* fix linting errors

* fix linting + testing error

* PagerDutyAlerting

* test_pagerduty_hanging_request_alerting

* fix unused imports

* docs pager duty

* @pytest.mark.flaky(retries=6, delay=2)

* test_model_info_bedrock_converse_enforcement
2025-01-01 07:12:51 -08:00
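
When a failure or hanging-request threshold is crossed, the integration triggers an incident via PagerDuty's Events API v2. A sketch of that call; payload fields follow the public Events v2 schema, and the 'source' value is illustrative:

```python
import httpx


def send_pagerduty_alert(routing_key: str, summary: str) -> None:
    """Trigger an incident via PagerDuty's Events API v2."""
    resp = httpx.post(
        "https://events.pagerduty.com/v2/enqueue",
        json={
            "routing_key": routing_key,
            "event_action": "trigger",
            "payload": {
                "summary": summary,
                "severity": "critical",
                "source": "litellm-proxy",  # illustrative source
            },
        },
    )
    resp.raise_for_status()
```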
Krish Dholakia
39cbd9d878
Litellm dev 12 31 2024 p1 (#7488)
* fix(internal_user_endpoints.py): fix team list sort - handle team_alias being set + None

* fix(key_management_endpoints.py): allow team admin to create key for member via admin ui

Fixes https://github.com/BerriAI/litellm/issues/7482

* fix(proxy_server.py): allow querying info on specific model group via `/model_group/info`

allows client-side user to get model info from proxy

* fix(proxy_server.py): add docstring on `/model_group/info` showing how to filter by model name

* test(test_proxy_utils.py): add unit test for returning model group info filtered

* fix(proxy_server.py): fix query param

* fix(test_Get_model_info.py): handle no whitelisted bedrock models
2024-12-31 23:21:51 -08:00
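
The team-list sort bug is the classic None-vs-str comparison; the fix is a sort key that orders None aliases deterministically. A sketch:

```python
teams = [
    {"team_alias": "search"},
    {"team_alias": None},  # the entry that used to break the sort
    {"team_alias": "billing"},
]

# None can't be compared with str; sort on (is_none, value) instead:
# aliases go alphabetically, None entries sink to the end.
teams.sort(key=lambda t: (t["team_alias"] is None, t["team_alias"] or ""))
```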
Krish Dholakia
080de89cfb
Fix team-based logging to langfuse + allow custom tokenizer on /token_counter endpoint (#7493)
* fix(langfuse_prompt_management.py): migrate dynamic logging to langfuse custom logger compatible class

* fix(langfuse_prompt_management.py): support failure callback logging to langfuse as well

* feat(proxy_server.py): support setting custom tokenizer on config.yaml

Allows customizing value for `/utils/token_counter`

* fix(proxy_server.py): fix linting errors

* test: skip if file not found

* style: cleanup unused import

* docs(configs.md): add docs on setting custom tokenizer
2024-12-31 23:18:41 -08:00
Ishaan Jaff
859f6e1635
(fix) v1/fine_tuning/jobs with VertexAI (#7487)
* update convert_openai_request_to_vertex

* test_create_vertex_fine_tune_jobs_mocked
2024-12-31 15:09:56 -08:00
Krish Dholakia
41e5b3aa8d
HumanLoop integration for Prompt Management (#7479)
* feat(humanloop.py): initial commit for humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* feat(humanloop.py): working e2e humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* fix(humanloop.py): fix linting errors

* fix: fix linting error

* fix: fix test

* test: handle filenotfound error
2024-12-30 22:26:03 -08:00
Krish Dholakia
347779b813
Litellm dev 12 30 2024 p1 (#7480)
* test(azure_openai_o1.py): initial commit with testing for azure openai o1 preview model

* fix(base_llm_unit_tests.py): handle azure o1 preview response format tests

skip as o1 on azure doesn't support tool calling yet

* fix: initial commit of azure o1 handler using openai caller

simplifies calling + allows fake streaming logic already implemented for openai to just work

* feat(azure/o1_handler.py): fake o1 streaming for azure o1 models

azure does not currently support streaming for o1

* feat(o1_transformation.py): support overriding 'should_fake_stream' on azure/o1 via 'supports_native_streaming' param on model info

enables user to toggle on when azure allows o1 streaming without needing to bump versions

* style(router.py): remove 'give feedback/get help' messaging when router is used

Prevents noisy messaging

Closes https://github.com/BerriAI/litellm/issues/5942

* test: fix azure o1 test

* test: fix tests

* fix: fix test
2024-12-30 21:52:52 -08:00
Ishaan Jaff
83879d2a3d
test_rerank_response_assertions (#7476)
2024-12-30 10:12:56 -08:00
Ishaan Jaff
a003af6c04
(fix) litellm.amoderation - support using model=openai/omni-moderation-latest, model=omni-moderation-latest, model=None (#7475)
* test_moderation_endpoint

* fix litellm.amoderation
2024-12-30 09:42:51 -08:00
Krish Dholakia
31ace870a2
Litellm dev 12 28 2024 p1 (#7463)
* refactor(utils.py): migrate amazon titan config to base config

* refactor(utils.py): refactor bedrock meta invoke model translation to use base config

* refactor(utils.py): move bedrock ai21 to base config

* refactor(utils.py): move bedrock cohere to base config

* refactor(utils.py): move bedrock mistral to use base config

* refactor(utils.py): move all provider optional param translations to using a config

* docs(clientside_auth.md): clarify how to pass vertex region to litellm proxy

* fix(utils.py): handle scenario where custom llm provider is none / empty

* fix: fix get config

* test(test_otel_load_tests.py): widen perf margin

* fix(utils.py): fix get provider config check to handle custom llm's

* fix(utils.py): fix check
2024-12-28 20:26:00 -08:00
Ishaan Jaff
ea8f0913c2 test_e2e_batches_files 2024-12-28 19:54:04 -08:00
Ishaan Jaff
32e8bdef6f update clean up jobs 2024-12-28 19:45:19 -08:00
Krish Dholakia
cfb6890b9f
Litellm dev 12 28 2024 p2 (#7458)
* docs(sidebar.js): docs for support model access groups for wildcard routes

* feat(key_management_endpoints.py): add check if user is premium_user when adding model access group for wildcard route

* refactor(docs/): make control model access a root-level doc in proxy sidebar

easier to discover how to control model access on litellm

* docs: more cleanup

* feat(fireworks_ai/): add document inlining support

Enables user to call non-vision models with images/pdfs/etc.

* test(test_fireworks_ai_translation.py): add unit testing for fireworks ai transform inline helper util

* docs(docs/): add document inlining details to fireworks ai docs

* feat(fireworks_ai/): allow user to dynamically disable auto add transform inline

allows client-side disabling of this feature for proxy users

* feat(fireworks_ai/): return 'supports_vision' and 'supports_pdf_input' true on all fireworks ai models

now true as fireworks ai supports document inlining

* test: fix tests

* fix(router.py): add unit testing for _is_model_access_group_for_wildcard_route
2024-12-28 19:38:06 -08:00
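
Fireworks' document inlining is switched on by tagging the document URL with a transform fragment, which is why non-vision models can suddenly take PDFs and images. A sketch of the auto-add-with-opt-out behavior; function and flag names are illustrative:

```python
def add_transform_inline(image_url: str, disable_auto_add: bool = False) -> str:
    """Tag a document URL for Fireworks' document inlining; callers can
    opt out per request (flag name is illustrative)."""
    if disable_auto_add or "#transform=inline" in image_url:
        return image_url
    return image_url + "#transform=inline"


url = add_transform_inline("https://example.com/report.pdf")
```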
Ishaan Jaff
3eb962c594 update - new test for test_text_completion_health_check 2024-12-28 19:36:23 -08:00