Commit graph

929 commits

Author SHA1 Message Date
Krish Dholakia
41e5b3aa8d
HumanLoop integration for Prompt Management (#7479)
* feat(humanloop.py): initial commit for humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* feat(humanloop.py): working e2e humanloop prompt management integration

Closes https://github.com/BerriAI/litellm/issues/213

* fix(humanloop.py): fix linting errors

* fix: fix linting erro

* fix: fix test

* test: handle filenotfound error
2024-12-30 22:26:03 -08:00
Krish Dholakia
67b39bacf7
LiteLLM Minor Fixes & Improvements (12/27/2024) - p1 (#7448)
* feat(main.py): mock_response() - support 'litellm.ContextWindowExceededError' in mock response

enabled quicker router/fallback/proxy debug on context window errors

* feat(exception_mapping_utils.py): extract special litellm errors from error str if calling `litellm_proxy/` as provider

Closes https://github.com/BerriAI/litellm/issues/7259

* fix(user_api_key_auth.py): specify 'Received Proxy Server Request' is span kind server

Closes https://github.com/BerriAI/litellm/issues/7298
2024-12-27 19:04:39 -08:00
Ishaan Jaff
62753eea69
(Feat) Log Guardrails run, guardrail response on logging integrations (#7445)
* add guardrail_information to SLP

* use standard_logging_guardrail_information

* track StandardLoggingGuardrailInformation

* use log_guardrail_information

* use log_guardrail_information

* docs guardrails

* docs guardrails

* update quick start

* fix presidio logging for sync functions

* update Guardrail type

* enforce add_standard_logging_guardrail_information_to_request_data

* update gd docs
2024-12-27 15:01:56 -08:00
Krish Dholakia
9d82ff4793
Litellm dev 12 26 2024 p3 (#7434)
* build(model_prices_and_context_window.json): update groq models to specify 'supports_vision' parameter

Closes https://github.com/BerriAI/litellm/issues/7433

* docs(groq.md): add groq vision example to docs

Closes https://github.com/BerriAI/litellm/issues/7433

* fix(prometheus.py): refactor self.litellm_proxy_failed_requests_metric to use label factory

* feat(prometheus.py): new 'litellm_proxy_failed_requests_by_tag_metric'

allows tracking failed requests by tag on proxy

* fix(prometheus.py): fix exception logging

* feat(prometheus.py): add new 'litellm_request_total_latency_by_tag_metric'

enables tracking latency by use-case

* feat(prometheus.py): add new llm api latency by tag metric

* feat(prometheus.py): new litellm_deployment_latency_per_output_token_by_tag metric

allows tracking deployment latency by tag

* fix(prometheus.py): refactor 'litellm_requests_metric' to use enum values + label factory

* feat(prometheus.py): new litellm_proxy_total_requests_by_tag metric

allows tracking total requests by tag

* feat(prometheus.py): new metric litellm_deployment_successful_fallbacks_by_tag

allows tracking deployment fallbacks by tag

* fix(prometheus.py): new 'litellm_deployment_failed_fallbacks_by_tag' metric

allows tracking failed fallbacks on deployment by custom tag

* test: fix test

* test: rename test to run earlier

* test: skip flaky test
2024-12-26 21:21:16 -08:00
Ishaan Jaff
17d5ff2fa4
(fix) initializing OTEL Logging on LiteLLM Proxy - ensure OTEL logger is initialized only once (#7435)
* add otel to _custom_logger_compatible_callbacks_literal

* remove extra code

* fix _get_custom_logger_settings_from_proxy_server

* update unit tests
2024-12-26 21:17:19 -08:00
Krish Dholakia
539f166166
Support budget/rate limit tiers for keys (#7429)
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key

allows user to create rate limit tiers and associate those to keys

* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set

allows rate limit tiers to be easily applied to keys

* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers

make feature discoverable

* feat(key_management_endpoints.py): return litellm_budget_table value in key generate

make it easy for user to know associated budget on key creation

* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`

* docs(key_management_endpoints.py): document budget_id usage

* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it

* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs

* fix(customer_endpoints.py): use new pydantic obj name

* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm

* Litellm dev 12 26 2024 p2 (#7432)

* (Feat) Add logging for `POST v1/fine_tuning/jobs`  (#7426)

* init commit ft jobs logging

* add ft logging

* add logging for FineTuningJob

* simple FT Job create test

* (docs) - show all supported Azure OpenAI endpoints in overview  (#7428)

* azure batches

* update doc

* docs azure endpoints

* docs endpoints on azure

* docs azure batches api

* docs azure batches api

* fix(key_management_endpoints.py): fix key update to actually work

* test(test_key_management.py): add e2e test asserting ui key update call works

* fix: proxy/_types - fix linting erros

* test: update test

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>

* fix: test

* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers

* fix: fix linting errors

* test: fix test

* fix: remove unused import

* test: update test

* docs(customer_endpoints.py): document new model_max_budget param

* test: specify unique key alias

* docs(budget_management_endpoints.py): document new model_max_budget param

* test: fix test

* test: fix tests

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
2024-12-26 19:05:27 -08:00
Krish Dholakia
21e8f212d7
Litellm dev 12 25 2024 p3 (#7421)
* refactor(prometheus.py): refactor to use a factory method for setting label values

allows for enforcing end user id disabling on prometheus e2e

* fix: fix linting error

* fix(prometheus.py): ensure label factory drops end-user value if disabled by user

* fix(prometheus.py): specify service_type in end user tracking get

* test: fix test

* test: add unit test for prometheus factory

* test: improve test (cover flag not set scenario)

* test(test_prometheus.py): e2e test covering if 'end_user_id' shows up in testing if disabled

scrapes the `/metrics` endpoint and scans text to check if id appears in emitted metrics

* fix(prometheus.py): stringify status code before logging it
2024-12-25 18:54:24 -08:00
Ishaan Jaff
0ce5f9fe58
(feat) Support Dynamic Params for guardrails (#7415)
* update CustomGuardrail

* unit test custom guardrails

* add dynamic params for aporia

* add dynamic params to bedrock guard

* add dynamic params for all guardrails

* fix linting

* fix should_run_guardrail

* _validate_premium_user

* update guardrail doc

* doc update

* update code q

* should_run_guardrail
2024-12-25 16:07:29 -08:00
Krish Dholakia
2e86a4806d
Litellm dev 12 24 2024 p2 (#7400)
* fix(utils.py): default custom_llm_provider=None for 'supports_response_schema'

Closes https://github.com/BerriAI/litellm/issues/7397

* refactor(langfuse/): call langfuse logger inside customlogger compatible langfuse class, refactor langfuse logger to use verbose_logger.debug instead of print_verbose

* refactor(litellm_pre_call_utils.py): move config based team callbacks inside dynamic team callback logic

enables simpler unit testing for config-based team callbacks

* fix(proxy/_types.py): handle teamcallbackmetadata - none values

drop none values if present. if all none, use default dict to avoid downstream errors

* test(test_proxy_utils.py): add unit test preventing future issues - asserts team_id in config state not popped off across calls

Fixes https://github.com/BerriAI/litellm/issues/6787

* fix(langfuse_prompt_management.py): add success + failure logging event support

* fix: fix linting error

* test: fix test

* test: fix test

* test: override o1 prompt caching - openai currently not working

* test: fix test
2024-12-24 20:33:41 -08:00
Krish Dholakia
39dabb2e89
Litellm dev 12 24 2024 p4 (#7407)
* fix(invoke_handler.py): fix mock response iterator to handle tool calling

returns tool call if returned by model response

* fix(prometheus.py): add new 'tokens_by_tag' metric on prometheus

allows tracking 'token usage' by task

* feat(prometheus.py): add input + output token tracking by tag

* feat(prometheus.py): add tag based deployment failure tracking

allows admin to track failure by use-case
2024-12-24 20:24:06 -08:00
Krish Dholakia
78fe124c14
Add 'end_user', 'user' and 'requested_model' on more prometheus metrics (#7399)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 11s
* fix(prometheus.py): support streaming end user litellm_proxy_total_requests_metric tracking

* fix(prometheus.py): add 'requested_model' and 'end_user_id' to 'litellm_request_total_latency_metric_bucket'

enables latency tracking by end user + requested model

* fix(prometheus.py): add end user, user and requested model metrics to 'litellm_llm_api_latency_metric'

* test: update prometheus unit tests

* test(test_prometheus.py): update tests

* test(test_prometheus.py): fix test

* test: reorder test
2024-12-24 14:08:30 -08:00
Ishaan Jaff
8aaa0b44c4
dd logger fix - handle objects that can't be JSON dumped (#7393)
* dd logger fix - handle objects that can't be dumped

* test_datadog_non_serializable_messages
2024-12-23 18:21:49 -08:00
Krish Dholakia
522da384b6
Litellm dev 12 20 2024 p3 (#7339)
* fix(proxy_track_cost_callback.py): log to db if only end user param given

* fix: allows for jwt-auth based end user id spend tracking to work

* fix(utils.py): fix 'get_end_user_id_for_cost_tracking' to use 'user_api_key_end_user_id'

more stable - works with jwt-auth based end user tracking as well

* test(test_jwt.py): add e2e unit test to confirm end user cost tracking works for spend logs

* test: update test to use end_user api key hash param

* fix(langfuse.py): support end user cost tracking via jwt auth + langfuse

logs end user to langfuse if decoded from jwt token

* fix: fix linting errors

* test: fix test

* test: fix test

* fix: fix end user id extraction

* fix: run test earlier
2024-12-20 21:13:32 -08:00
Krish Dholakia
27a4d08604
Litellm dev 2024 12 19 p3 (#7322)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 13s
* fix(utils.py): remove unsupported optional params (if drop_params=True) before passing into map openai params

Fixes https://github.com/BerriAI/litellm/issues/7242

* test: new test for langfuse prompt management hook

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(main.py): add 'get_chat_completion_prompt' customlogger hook

allows for langfuse prompt management

Addresses https://github.com/BerriAI/litellm/issues/3893#issuecomment-2549080296

* feat(langfuse_prompt_management.py): working e2e langfuse prompt management

works with `langfuse/` route

* feat(main.py): initial tracing for dynamic langfuse params

allows admin to specify langfuse keys by model in model_list

* feat(main.py): support passing langfuse credentials dynamically

* fix(langfuse_prompt_management.py): create langfuse client based on dynamic callback params

allows dynamic langfuse params to work

* fix: fix linting errors

* docs(prompt_management.md): refactor docs for sdk + proxy prompt management tutorial

* docs(prompt_management.md): cleanup doc

* docs: cleanup topnav

* docs(prompt_management.md): update docs to be easier to use

* fix: remove unused imports

* docs(prompt_management.md): add architectural overview doc

* fix(litellm_logging.py): fix dynamic param passing

* fix(langfuse_prompt_management.py): fix linting errors

* fix: fix linting errors

* fix: use typing_extensions for typealias to ensure python3.8 compatibility

* test: use stream_options in test to account for tiktoken diff

* fix: improve import error message, and check run test earlier
2024-12-20 13:30:16 -08:00
Ishaan Jaff
c7f14e936a
(code quality) run ruff rule to ban unused imports (#7313)
* remove unused imports

* fix AmazonConverseConfig

* fix test

* fix import

* ruff check fixes

* test fixes

* fix testing

* fix imports
2024-12-19 12:33:42 -08:00
Ishaan Jaff
246e3bafc8
(feat - proxy) Add status_code to litellm_proxy_total_requests_metric_total (#7293)
* fix _select_model_name_for_cost_calc docstring

* add STATUS_CODE  to prometheus

* test prometheus unit tests

* test_prometheus_unit_tests.py

* update Proxy Level Tracking Metrics docs

* fix test_proxy_failure_metrics

* fix test_proxy_failure_metrics
2024-12-18 15:55:02 -08:00
Krish Dholakia
2f08341a08
Litellm dev readd prompt caching (#7299)
* fix(router.py): re-add saving model id on prompt caching valid successful deployment

* fix(router.py): introduce optional pre_call_checks

isolate prompt caching logic in a separate file

* fix(prompt_caching_deployment_check.py): fix import

* fix(router.py): new 'async_filter_deployments' event hook

allows custom logger to filter deployments returned to routing strategy

* feat(prompt_caching_deployment_check.py): initial working commit of prompt caching based routing

* fix(cooldown_callbacks.py): fix linting error

* fix(budget_limiter.py): move budget logger to async_filter_deployment hook

* test: add unit test

* test(test_router_helper_utils.py): add unit testing

* fix(budget_limiter.py): fix linting errors

* docs(config_settings.md): add 'optional_pre_call_checks' to router_settings param docs
2024-12-18 15:13:49 -08:00
Ishaan Jaff
f3c546b79e
(feat) proxy Azure Blob Storage - Add support for AZURE_STORAGE_ACCOUNT_KEY Auth (#7280)
* add upload_to_azure_data_lake_with_azure_account_key

* async_upload_payload_to_azure_blob_storage

* docs add AZURE_STORAGE_ACCOUNT_KEY

* add azure-storage-file-datalake
2024-12-17 17:35:45 -08:00
Krish Dholakia
019ddc32d6
Litellm dev 12 17 2024 p2 (#7277)
* fix(openai/transcription/handler.py): call 'log_pre_api_call' on async calls

* fix(openai/transcriptions/handler.py): call 'logging.pre_call' on sync whisper calls as well

* fix(proxy_cli.py): remove default proxy_cli timeout param

gets passed in as a dynamic request timeout and overrides config values

* fix(langfuse.py): pass litellm httpx client - contains ssl certs (#7052)

Fixes https://github.com/BerriAI/litellm/issues/7046
2024-12-17 14:05:14 -08:00
Krish Dholakia
b82add11ba
LITELLM: Remove requests library usage (#7235)
* fix(generic_api_callback.py): remove requests lib usage

* fix(budget_manager.py): remove requests lib usgae

* fix(main.py): cleanup requests lib usage

* fix(utils.py): remove requests lib usage

* fix(argilla.py): fix argilla test

* fix(athina.py): replace 'requests' lib usage with litellm module

* fix(greenscale.py): replace 'requests' lib usage with httpx

* fix: remove unused 'requests' lib import + replace usage in some places

* fix(prompt_layer.py): remove 'requests' lib usage from prompt layer

* fix(ollama_chat.py): remove 'requests' lib usage

* fix(baseten.py): replace 'requests' lib usage

* fix(codestral/): replace 'requests' lib usage

* fix(predibase/): replace 'requests' lib usage

* refactor: cleanup unused 'requests' lib imports

* fix(oobabooga.py): cleanup 'requests' lib usage

* fix(invoke_handler.py): remove unused 'requests' lib usage

* refactor: cleanup unused 'requests' lib import

* fix: fix linting errors

* refactor(ollama/): move ollama to using base llm http handler

removes 'requests' lib dep for ollama integration

* fix(ollama_chat.py): fix linting errors

* fix(ollama/completion/transformation.py): convert non-jpeg/png image to jpeg/png before passing to ollama
2024-12-17 12:50:04 -08:00
Ishaan Jaff
3c984ed60e
(feat) Add Azure Blob Storage Logging Integration (#7265)
* add path to http handler

* AzureBlobStorageLogger

* test_azure_blob_storage

* use constants for Azure storage

* use helper get_azure_ad_token_from_entrata_id

* azure blob storage support

* get_azure_ad_token_from_azure_storage

* fix import

* azure logging

* docs azure storage

* add docs on azure blobs

* add premium user check

* add azure_storage  as identified logging callback

* async_upload_payload_to_azure_blob_storage

* docs azure storage

* callback_class_str_to_classType
2024-12-16 22:18:22 -08:00
Krish Dholakia
516c2a6a70
Litellm remove circular imports (#7232)
* fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py

* fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py

'

* refactor: fix litellm.ModelResponse import on pass through endpoints

* refactor(litellm_logging.py): fix circular import for custom callbacks literal

* fix(factory.py): fix circular imports inside prompt factory

* fix(cost_calculator.py): fix circular import for 'litellm.Usage'

* fix(proxy_server.py): fix potential circular import with `litellm.Router'

* fix(proxy/utils.py): fix potential circular import in `litellm.Router`

* fix: remove circular imports in 'auth_checks' and 'guardrails/'

* fix(prompt_injection_detection.py): fix router impor t

* fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through

* fix(anthropic_pass_through_logging_handler.py): fix potential circular imports

* fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import

* fix(base.py): fix potential circular import

* fix(handler.py): fix potential circular ref in codestral + cohere handler's

* fix(azure.py): fix potential circular imports

* fix(gpt_transformation.py): fix modelresponse import

* fix(litellm_logging.py): add logging base class - simplify typing

makes it easy for other files to type check the logging obj without introducing circular imports

* fix(azure_ai/embed): fix potential circular import on handler.py

* fix(databricks/): fix potential circular imports in databricks/

* fix(vertex_ai/): fix potential circular imports on vertex ai embeddings

* fix(vertex_ai/image_gen): fix import

* fix(watsonx-+-bedrock): cleanup imports

* refactor(anthropic-pass-through-+-petals): cleanup imports

* refactor(huggingface/): cleanup imports

* fix(ollama-+-clarifai): cleanup circular imports

* fix(openai_like/): fix impor t

* fix(openai_like/): fix embedding handler

cleanup imports

* refactor(openai.py): cleanup imports

* fix(sagemaker/transformation.py): fix import

* ci(config.yml): add circular import test to ci/cd
2024-12-14 16:28:34 -08:00
Ishaan Jaff
b45777c268
(Feat) DataDog Logger - Add HOSTNAME and POD_NAME to DataDog logs (#7189)
* add unit test for test_datadog_static_methods

* docs dd vars

* test_datadog_payload_environment_variables

* test_datadog_static_methods

* docs env vars

* fix table
2024-12-12 12:06:26 -08:00
Ishaan Jaff
21003c4337
Code Quality Improvement - use vertex_ai/ as folder name for vertexAI (#7166)
* fix rename vertex ai

* run ci/cd again
2024-12-11 00:32:41 -08:00
Ishaan Jaff
b5d55688e5
(Refactor) Code Quality improvement - remove /prompt_templates/ , base_aws_llm.py from /llms folder (#7164)
* fix move base_aws_llm

* fix import

* update enforce llms folder style

* move prompt_templates

* update prompt_templates location

* fix imports

* fix imports

* fix imports

* fix imports

* fix checks
2024-12-11 00:02:46 -08:00
Krish Dholakia
5bbf906c83
Litellm code qa common config (#7113)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 44s
* feat(base_llm): initial commit for common base config class

Addresses code qa critique https://github.com/andrewyng/aisuite/issues/113#issuecomment-2512369132

* feat(base_llm/): add transform request/response abstract methods to base config class

* feat(cohere-+-clarifai): refactor integrations to use common base config class

* fix: fix linting errors

* refactor(anthropic/): move anthropic + vertex anthropic to use base config

* test: fix xai test

* test: fix tests

* fix: fix linting errors

* test: comment out WIP test

* fix(transformation.py): fix is pdf used check

* fix: fix linting error
2024-12-09 15:58:25 -08:00
Krish Dholakia
19a4273fda
feat(langfuse/): support langfuse prompt management (#7073)
* feat(langfuse/): support langfuse prompt management

Initial working commit for langfuse prompt management support

Closes https://github.com/BerriAI/litellm/issues/6269

* test: update test

* fix(litellm_logging.py): suppress linting error
2024-12-06 23:10:22 -08:00
Krish Dholakia
e4493248ae
Litellm dev 12 06 2024 (#7067)
* fix(edit_budget_modal.tsx): call `/budget/update` endpoint instead of `/budget/new`

allows updating existing budget on ui

* fix(user_api_key_auth.py): support cost tracking for end user via jwt field

* fix(presidio.py): support pii masking on sync logging callbacks

enables masking before logging to langfuse

* feat(utils.py): support retry policy logic inside '.completion()'

Fixes https://github.com/BerriAI/litellm/issues/6623

* fix(utils.py): support retry by retry policy on async logic as well

* fix(handle_jwt.py): set leeway default leeway value

* test: fix test to handle jwt audience claim
2024-12-06 22:44:18 -08:00
Krish Dholakia
816f0ef8d2
LiteLLM Minor Fixes & Improvements (12/05/2024) (#7051)
* fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations

ensures cost tracking is reliable - handles edge cases of parsing model cost map

* build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models

Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329

* build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map

Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html

* fix(converse_transformation.py): support amazon nova tool use

* fix(opentelemetry): Add missing LLM request type attribute to spans (#7041)

* feat(opentelemetry): add LLM request type attribute to spans

* lint

* fix: curl usage (#7038)

curl -d, --data <data> is lowercase d
curl -D, --dump-header <filename> is uppercase D

references:
https://curl.se/docs/manpage.html#-d
https://curl.se/docs/manpage.html#-D

* fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(streaming_chunk_builder.py): handle initial id being empty string

Fixes https://github.com/BerriAI/litellm/issues/7023

* fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint

* docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints

* feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk

* docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk

* fix(litellm_logging.py): use standard logging payload if present in kwargs

prevent datadog logging error for pass through endpoints

* docs(bedrock.md): add rerank api usage example to docs

* bugfix/change dummy tool name format (#7053)

* fix viewing keys (#7042)

* ui new build

* build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044)

* bye (#6982)

* (fix) litellm router.aspeech  (#6962)

* doc Migrating Databases

* fix aspeech on router

* test_audio_speech_router

* test_audio_speech_router

* docs show supported providers on batches api doc

* change dummy tool name format

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>

* fix: fix linting errors

* test: update test

* fix(litellm_logging.py): fix pass through check

* fix(test_otel_logging.py): fix test

* fix(cost_calculator.py): update handling for cost per second

* fix(cost_calculator.py): fix cost check

* test: fix test

* (fix) adding public routes when using custom header  (#7045)

* get_api_key_from_custom_header

* add test_get_api_key_from_custom_header

* fix testing use 1 file for test user api key auth

* fix test user api key auth

* test_custom_api_key_header_name

* build: update ui build

---------

Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com>
Co-authored-by: lloydchang <lloydchang@gmail.com>
Co-authored-by: hgulersen <haymigulersen@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
2024-12-06 14:29:53 -08:00
Ishaan Jaff
84db69d4c4
(feat) add Vertex Batches API support in OpenAI format (#7032)
* working request

* working transform

* working request

* transform vertex batch response

* add _async_create_batch

* move gcs functions to base

* fix _get_content_from_openai_file

* transform_openai_file_content_to_vertex_ai_file_content

* fix transform vertex gcs bucket upload to OAI files format

* working e2e test

* _get_gcs_object_name

* fix linting

* add doc string

* fix transform_gcs_bucket_response_to_openai_file_object

* use vertex for batch endpoints

* add batches support for vertex

* test_vertex_batches_endpoint

* test_vertex_batch_prediction

* fix gcs bucket base auth

* docs clean up batches

* docs Batch API

* docs vertex batches api

* test_get_gcs_logging_config_without_service_account

* undo change

* fix vertex md

* test_get_gcs_logging_config_without_service_account

* ci/cd run again
2024-12-04 19:40:28 -08:00
Krish Dholakia
6bb934c0ac
fix(key_management_endpoints.py): override metadata field value on up… (#7008)
* fix(key_management_endpoints.py): override metadata field value on update

allow user to override tags

* feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric

allow disabling end user cost tracking on prometheus - fixes cardinality issue

* fix(litellm_pre_call_utils.py): add key/team level enforced params

Fixes https://github.com/BerriAI/litellm/issues/6652

* fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update

* docs(enterprise.md): add docs on enforcing required params for llm requests

* Add support of Galadriel API (#7005)

* fix(router.py): robust retry after handling

set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment

* test(test_router.py): fix test

* feat(bedrock/): add support for 'nova' models

also adds explicit 'converse/' route for simpler routing

* fix: fix 'supports_pdf_input'

return if model supports pdf input on get_model_info

* feat(converse_transformation.py): support bedrock pdf input

* docs(document_understanding.md): add document understanding to docs

* fix(litellm_pre_call_utils.py): fix linting error

* fix(init.py): fix passing of bedrock converse models

* feat(bedrock/converse): support 'response_format={"type": "json_object"}'

* fix(converse_handler.py): fix linting error

* fix(base_llm_unit_tests.py): fix test

* fix: fix test

* test: fix test

* test: fix test

* test: remove duplicate test

---------

Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>
2024-12-03 23:03:50 -08:00
Ishaan Jaff
67fe16f929
fix - data dog (#7013) 2024-12-03 16:43:23 -08:00
Ishaan Jaff
2d10f48c43
(fixes) datadog logging - handle 1MB max log size on DD (#6996)
* fix dd truncate_standard_logging_payload_content

* dd truncate_standard_logging_payload_content

* fix test_datadog_payload_content_truncation

* add clear msg on _truncate_text

* test_truncate_standard_logging_payload

* fix linting error

* fix linting errors
2024-12-02 23:01:42 -08:00
Ishaan Jaff
9617e7433d
(fix) logging Auth errors on datadog (#6995)
* fix get_standard_logging_object_payload

* fix async_post_call_failure_hook

* fix post_call_failure_hook

* fix change

* fix _is_proxy_only_error

* fix async_post_call_failure_hook

* fix getting request body

* remove redundant code

* use a well named original function name for auth errors

* fix logging auth fails on DD

* fix using request body

* use helper for _handle_logging_proxy_only_error
2024-12-02 23:01:21 -08:00
Krish Dholakia
1c8438d475
Litellm dev 11 30 2024 (#6974)
All checks were successful
Read Version from pyproject.toml / read-version (push) Successful in 11s
* feat(cohere/chat.py): return citations in model response

Closes https://github.com/BerriAI/litellm/issues/6814

* fix(cohere/chat.py): fix linting errors

* fix(langsmith.py): support 'run_id' for langsmith

Fixes https://github.com/BerriAI/litellm/issues/6862

* fix(langsmith.py): fix langsmith quickstart

Fixes https://github.com/BerriAI/litellm/issues/6861

* fix: suppress linting error

* LiteLLM Minor Fixes & Improvements (11/29/2024)  (#6965)

* fix(factory.py): ensure tool call converts image url

Fixes https://github.com/BerriAI/litellm/issues/6953

* fix(transformation.py): support mp4 + pdf url's for vertex ai

Fixes https://github.com/BerriAI/litellm/issues/6936

* fix(http_handler.py): mask gemini api key in error logs

Fixes https://github.com/BerriAI/litellm/issues/6963

* docs(prometheus.md): update prometheus FAQs

* feat(auth_checks.py): ensure specific model access > wildcard model access

if wildcard model is in access group, but specific model is not - deny access

* fix(auth_checks.py): handle auth checks for team based model access groups

handles scenario where model access group used for wildcard models

* fix(internal_user_endpoints.py): support adding guardrails on `/user/update`

Fixes https://github.com/BerriAI/litellm/issues/6942

* fix(key_management_endpoints.py): fix prepare_metadata_fields helper

* fix: fix tests

* build(requirements.txt): bump openai dep version

fixes proxies argument

* test: fix tests

* fix(http_handler.py): fix error message masking

* fix(bedrock_guardrails.py): pass in prepped data

* test: fix test

* test: fix nvidia nim test

* fix(http_handler.py): return original response headers

* fix: revert maskedhttpstatuserror

* test: update tests

* test: cleanup test

* fix(key_management_endpoints.py): fix metadata field update logic

* fix(key_management_endpoints.py): maintain initial order of guardrails in key update

* fix(key_management_endpoints.py): handle prepare metadata

* fix: fix linting errors

* fix: fix linting errors

* fix: fix linting errors

* fix: fix key management errors

* fix(key_management_endpoints.py): update metadata

* test: update test

* refactor: add more debug statements

* test: skip flaky test

* test: fix test

* fix: fix test

* fix: fix update metadata logic

* fix: fix test

* ci(config.yml): change db url for e2e ui testing

* test: add more debug logs to langsmith

* fix: test change

* build(config.yml): fix db url

'
2024-12-02 21:03:33 -08:00
Ishaan Jaff
a6da3dea03
(feat) dd logger - set tags according to the values set by those env vars (#6933)
* dd logger, inherit from .envs

* test_datadog_payload_environment_variables

* fix _get_datadog_service
2024-11-26 22:08:04 -08:00
Ishaan Jaff
4bc06392db
(feat) log proxy auth errors on datadog (#6931)
* add new dd type for auth errors

* add async_log_proxy_authentication_errors

* fix comment

* use async_log_proxy_authentication_errors

* test_datadog_post_call_failure_hook

* test_async_log_proxy_authentication_errors
2024-11-26 20:26:57 -08:00
Ishaan Jaff
aea68cbeb6
(feat) DataDog Logger - Add Failure logging + use Standard Logging payload (#6929)
* add async_log_failure_event for dd

* use standard logging payload for DD logging

* use standard logging payload for DD

* fix use SLP status

* allow opting into _create_v0_logging_payload

* add unit tests for DD logging payload

* fix dd logging tests
2024-11-26 19:27:06 -08:00
Krish Dholakia
7e9d8b58f6
LiteLLM Minor Fixes & Improvements (11/23/2024) (#6870)
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.

* fix(utils.py): allow disabling end user cost tracking with new param

Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small

* docs(configs.md): add disable_end_user_cost_tracking reference to docs

* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role

Enables admin to restrict key creation, and assign team admins to handle distributing keys

* test(test_key_management.py): add unit testing for personal / team key restriction checks

* docs: add docs on restricting key creation

* docs(finetuned_models.md): add new guide on calling finetuned models

* docs(input.md): cleanup anthropic supported params

Closes https://github.com/BerriAI/litellm/issues/6856

* test(test_embedding.py): add test for passing extra headers via embedding

* feat(cohere/embed): pass client to async embedding

* feat(rerank.py): add `/v1/rerank` if missing for cohere base url

Closes https://github.com/BerriAI/litellm/issues/6844

* fix(main.py): pass extra_headers param to openai

Fixes https://github.com/BerriAI/litellm/issues/6836

* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set

Fixes issue where global callbacks - e.g. prometheus were overriden when langfuse was set dynamically

* fix(handler.py): fix linting error

* fix: fix typing

* build: add conftest to proxy_admin_ui_tests/

* test: fix test

* fix: fix linting errors

* test: fix test

* fix: fix pass through testing
2024-11-23 15:17:40 +05:30
Ishaan Jaff
7463dab9c6
(feat) provider budget routing improvements (#6827)
* minor fix for provider budget

* fix raise good error message when budget crossed for provider budget

* fix test provider budgets

* test provider budgets

* feat - emit llm provider spend on prometheus

* test_prometheus_metric_tracking

* doc provider budgets
2024-11-19 21:25:08 -08:00
Krish Dholakia
3beecfb0d4
LiteLLM Minor Fixes & Improvements (11/13/2024) (#6729)
* fix(utils.py): add logprobs support for together ai

Fixes

https://github.com/BerriAI/litellm/issues/6724

* feat(pass_through_endpoints/): add anthropic/ pass-through endpoint

adds new `anthropic/` pass-through endpoint + refactors docs

* feat(spend_management_endpoints.py): allow /global/spend/report to query team + customer id

enables seeing spend for a customer in a team

* Add integration with MLflow Tracing (#6147)

* Add MLflow logger

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* Streaming handling

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* lint

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* address comments and fix issues

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* address comments and fix issues

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* Move logger construction code

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* Add docs

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* async handlers

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* new picture

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* fix(mlflow.py): fix ruff linting errors

* ci(config.yml): add mlflow to ci testing

* fix: fix test

* test: fix test

* Litellm key update fix (#6710)

* fix(caching): convert arg to equivalent kwargs in llm caching handler

prevent unexpected errors

* fix(caching_handler.py): don't pass args to caching

* fix(caching): remove all *args from caching.py

* fix(caching): consistent function signatures + abc method

* test(caching_unit_tests.py): add unit tests for llm caching

ensures coverage for common caching scenarios across different implementations

* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one

* fix(router.py): drop redis password requirement

* fix(proxy_server.py): fix faulty slack alerting check

* fix(langfuse.py): avoid copying functions/thread lock objects in metadata

fixes metadata copy error when parent otel span in metadata

* test: update test

* fix(key_management_endpoints.py): fix /key/update with metadata update

* fix(key_management_endpoints.py): fix key_prepare_update helper

* fix(key_management_endpoints.py): reset value to none if set in key update

* fix: update test

'

* Litellm dev 11 11 2024 (#6693)

* fix(__init__.py): add 'watsonx_text' as mapped llm api route

Fixes https://github.com/BerriAI/litellm/issues/6663

* fix(opentelemetry.py): fix passing parallel tool calls to otel

Fixes https://github.com/BerriAI/litellm/issues/6677

* refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling

reduces bugs in repo

* fix(__init__.py): update provider-model mapping to include all known provider-model mappings

Fixes https://github.com/BerriAI/litellm/issues/6669

* feat(anthropic): support passing document in llm api call

* docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function

* fix(factory.py): fix linting error

* add clear doc string for GCS bucket logging

* Add docs to export logs to Laminar (#6674)

* Add docs to export logs to Laminar

* minor fix: newline at end of file

* place laminar after http and grpc

* (Feat) Add langsmith key based logging (#6682)

* add langsmith_api_key to StandardCallbackDynamicParams

* create a file for langsmith types

* langsmith add key / team based logging

* add key based logging for langsmith

* fix langsmith key based logging

* fix linting langsmith

* remove NOQA violation

* add unit test coverage for all helpers in test langsmith

* test_langsmith_key_based_logging

* docs langsmith key based logging

* run langsmith tests in logging callback tests

* fix logging testing

* test_langsmith_key_based_logging

* test_add_callback_via_key_litellm_pre_call_utils_langsmith

* add debug statement langsmith key based logging

* test_langsmith_key_based_logging

* (fix) OpenAI's optional messages[].name  does not work with Mistral API  (#6701)

* use helper for _transform_messages mistral

* add test_message_with_name to base LLMChat test

* fix linting

* add xAI on Admin UI (#6680)

* (docs) add benchmarks on 1K RPS  (#6704)

* docs litellm proxy benchmarks

* docs GCS bucket

* doc fix - reduce clutter on logging doc title

* (feat) add cost tracking stable diffusion 3 on Bedrock  (#6676)

* add cost tracking for sd3

* test_image_generation_bedrock

* fix get model info for image cost

* add cost_calculator for stability 1 models

* add unit testing for bedrock image cost calc

* test_cost_calculator_with_no_optional_params

* add test_cost_calculator_basic

* correctly allow size Optional

* fix cost_calculator

* sd3 unit tests cost calc

* fix raise correct error 404 when /key/info is called on non-existent key  (#6653)

* fix raise correct error on /key/info

* add not_found_error error

* fix key not found in DB error

* use 1 helper for checking token hash

* fix error code on key info

* fix test key gen prisma

* test_generate_and_call_key_info

* test fix test_call_with_valid_model_using_all_models

* fix key info tests

* bump: version 1.52.4 → 1.52.5

* add defaults used for GCS logging

* LiteLLM Minor Fixes & Improvements (11/12/2024)  (#6705)

* fix(caching): convert arg to equivalent kwargs in llm caching handler

prevent unexpected errors

* fix(caching_handler.py): don't pass args to caching

* fix(caching): remove all *args from caching.py

* fix(caching): consistent function signatures + abc method

* test(caching_unit_tests.py): add unit tests for llm caching

ensures coverage for common caching scenarios across different implementations

* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one

* fix(router.py): drop redis password requirement

* fix(proxy_server.py): fix faulty slack alerting check

* fix(langfuse.py): avoid copying functions/thread lock objects in metadata

fixes metadata copy error when parent otel span in metadata

* test: update test

* bump: version 1.52.5 → 1.52.6

* (feat) helm hook to sync db schema  (#6715)

* v0 migration job

* fix job

* fix migrations job.yml

* handle standalone DB on helm hook

* fix argo cd annotations

* fix db migration helm hook

* fix migration job

* doc fix Using Http/2 with Hypercorn

* (fix proxy redis) Add redis sentinel support  (#6154)

* add sentinel_password support

* add doc for setting redis sentinel password

* fix redis sentinel - use sentinel password

* Fix: Update gpt-4o costs to that of gpt-4o-2024-08-06 (#6714)

Fixes #6713

* (fix) using Anthropic `response_format={"type": "json_object"}`  (#6721)

* add support for response_format=json anthropic

* add test_json_response_format to baseLLM ChatTest

* fix test_litellm_anthropic_prompt_caching_tools

* fix test_anthropic_function_call_with_no_schema

* test test_create_json_tool_call_for_response_format

* (feat) Add cost tracking for Azure Dall-e-3 Image Generation  + use base class to ensure basic image generation tests pass  (#6716)

* add BaseImageGenTest

* use 1 class for unit testing

* add debugging to BaseImageGenTest

* TestAzureOpenAIDalle3

* fix response_cost_calculator

* test_basic_image_generation

* fix img gen basic test

* fix _select_model_name_for_cost_calc

* fix test_aimage_generation_bedrock_with_optional_params

* fix undo changes cost tracking

* fix response_cost_calculator

* fix test_cost_azure_gpt_35

* fix remove dup test (#6718)

* (build) update db helm hook

* (build) helm db pre sync hook

* (build) helm db sync hook

* test: run test_team_logging firdst

---------

Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>

* test: update test

* test: skip anthropic overloaded error

* test: cleanup test

* test: update tests

* test: fix test

* test: handle gemini overloaded model error

* test: handle internal server error

* test: handle anthropic overloaded error

* test: handle claude instability

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com>
Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>
2024-11-15 11:18:31 +05:30
Krish Dholakia
9160d80fa5
LiteLLM Minor Fixes & Improvements (11/12/2024) (#6705)
* fix(caching): convert arg to equivalent kwargs in llm caching handler

prevent unexpected errors

* fix(caching_handler.py): don't pass args to caching

* fix(caching): remove all *args from caching.py

* fix(caching): consistent function signatures + abc method

* test(caching_unit_tests.py): add unit tests for llm caching

ensures coverage for common caching scenarios across different implementations

* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one

* fix(router.py): drop redis password requirement

* fix(proxy_server.py): fix faulty slack alerting check

* fix(langfuse.py): avoid copying functions/thread lock objects in metadata

fixes metadata copy error when parent otel span in metadata

* test: update test
2024-11-12 22:50:51 +05:30
Ishaan Jaff
c3bc9e6b12
(Feat) Add langsmith key based logging (#6682)
* add langsmith_api_key to StandardCallbackDynamicParams

* create a file for langsmith types

* langsmith add key / team based logging

* add key based logging for langsmith

* fix langsmith key based logging

* fix linting langsmith

* remove NOQA violation

* add unit test coverage for all helpers in test langsmith

* test_langsmith_key_based_logging

* docs langsmith key based logging

* run langsmith tests in logging callback tests

* fix logging testing

* test_langsmith_key_based_logging

* test_add_callback_via_key_litellm_pre_call_utils_langsmith

* add debug statement langsmith key based logging

* test_langsmith_key_based_logging
2024-11-11 13:58:06 -08:00
Ishaan Jaff
d889bce0c4 add clear doc string for GCS bucket logging 2024-11-11 11:49:44 -08:00
Krish Dholakia
f59cb46e71
Litellm dev 11 11 2024 (#6693)
* fix(__init__.py): add 'watsonx_text' as mapped llm api route

Fixes https://github.com/BerriAI/litellm/issues/6663

* fix(opentelemetry.py): fix passing parallel tool calls to otel

Fixes https://github.com/BerriAI/litellm/issues/6677

* refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling

reduces bugs in repo

* fix(__init__.py): update provider-model mapping to include all known provider-model mappings

Fixes https://github.com/BerriAI/litellm/issues/6669

* feat(anthropic): support passing document in llm api call

* docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function

* fix(factory.py): fix linting error
2024-11-12 00:16:35 +05:30
Ishaan Jaff
eb92ed4156
(Feat) 273% improvement GCS Bucket Logger - use Batched Logging (#6679)
* use CustomBatchLogger for GCS

* add GCS bucket logging type

* use batch logging for GCs bucket

* add gcs_bucket

* allow setting flush_interval on CustomBatchLogger

* set GCS_FLUSH_INTERVAL to 1s

* fix test_key_logging

* fix test_key_logging

* add docs on new env vars
2024-11-11 11:35:34 +05:30
Ishaan Jaff
61026e189d
(feat) Add support for logging to GCS Buckets with folder paths (#6675)
* use helper to log

* gcs _handle_folders_in_bucket_name

* add test_basic_gcs_logger_with_folder_in_bucket_name

* run gcs testing in logging callback tests

* include correct deps

* fix gcs bucket logging test

* fix test_basic_gcs_logger_with_folder_in_bucket_name

* fix test_get_gcs_logging_config_without_service_account

* fix test gcs bucket

* remove unused file
2024-11-08 19:24:18 -08:00
Krish Dholakia
73531f4815
Litellm dev 11 08 2024 (#6658)
* fix(deepseek/chat): convert content list to str

Fixes https://github.com/BerriAI/litellm/issues/6642

* test(test_deepseek_completion.py): implement base llm unit tests

increase robustness across providers

* fix(router.py): support content policy violation fallbacks with default fallbacks

* fix(opentelemetry.py): refactor to move otel imports behing flag

Fixes https://github.com/BerriAI/litellm/issues/6636

* fix(opentelemtry.py): close span on success completion

* fix(user_api_key_auth.py): allow user_role to default to none

* fix: mark flaky test

* fix(opentelemetry.py): move otelconfig.from_env to inside the init

prevent otel errors raised just by importing the litellm class

* fix(user_api_key_auth.py): fix auth error
2024-11-08 22:07:17 +05:30
Ishaan Jaff
eb47117800
(feat) log error class, function_name on prometheus service failure hook + only log DB related failures on DB service hook (#6650)
* log error on prometheus service failure hook

* use a more accurate function name for wrapper that handles logging db metrics

* fix log_db_metrics

* test_log_db_metrics_failure_error_types

* fix linting

* fix auth checks
2024-11-07 17:01:18 -08:00
Krish Dholakia
27e18358ab
fix(pattern_match_deployments.py): default to user input if unable to… (#6632)
* fix(pattern_match_deployments.py): default to user input if unable to map based on wildcards

* test: fix test

* test: reset test name

* test: update conftest to reload proxy server module between tests

* ci(config.yml): move langfuse out of local_testing

reduce ci/cd time

* ci(config.yml): cleanup langfuse ci/cd tests

* fix: update test to not use global proxy_server app module

* ci: move caching to a separate test pipeline

speed up ci pipeline

* test: update conftest to check if proxy_server attr exists before reloading

* build(conftest.py): don't block on inability to reload proxy_server

* ci(config.yml): update caching unit test filter to work on 'cache' keyword as well

* fix(encrypt_decrypt_utils.py): use function to get salt key

* test: mark flaky test

* test: handle anthropic overloaded errors

* refactor: create separate ci/cd pipeline for proxy unit tests

make ci/cd faster

* ci(config.yml): add litellm_proxy_unit_testing to build_and_test jobs

* ci(config.yml): generate prisma binaries for proxy unit tests

* test: readd vertex_key.json

* ci(config.yml): remove `-s` from proxy_unit_test cmd

speed up test

* ci: remove any 'debug' logging flag

speed up ci pipeline

* test: fix test

* test(test_braintrust.py): rerun

* test: add delay for braintrust test
2024-11-08 00:55:57 +05:30