* fix(gpt_transformation.py): fix response_format translation check for 4o models
Fixes https://github.com/BerriAI/litellm/issues/7616
* feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields
Allow proxy admin to grant temporary budget increases to keys
* fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together
* feat(user_api_key_auth.py): initial working temp budget increase logic
ensures the key budget-exceeded error check accounts for a temp budget set in key metadata
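A minimal sketch of that check, assuming the increase and expiry live in key metadata under the field names above (the exact storage format is an assumption):

```python
# Hedged sketch of the temp-budget check; field names and the tz-aware
# ISO-8601 expiry format are assumptions based on the commits above.
from datetime import datetime, timezone

def effective_max_budget(key_max_budget: float, key_metadata: dict) -> float:
    """Return the key's max budget, honoring an unexpired temporary increase."""
    increase = key_metadata.get("temp_budget_increase")
    expiry = key_metadata.get("temp_budget_expiry")  # e.g. "2025-01-31T00:00:00+00:00"
    if increase is None or expiry is None:
        return key_max_budget
    if datetime.now(timezone.utc) < datetime.fromisoformat(expiry):
        return key_max_budget + increase
    return key_max_budget  # increase has expired
```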
* feat(proxy_server.py): return the key max budget and key spend in the response headers
Allows the client-side user to know their remaining limits
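A quick way to read those headers from the client side; the header names are assumptions, check the proxy response for the exact keys:

```python
# Sketch: read the budget headers off a proxy completion response.
import requests

resp = requests.post(
    "http://localhost:4000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-1234"},
    json={"model": "gpt-4o", "messages": [{"role": "user", "content": "hi"}]},
)
# Header names below are assumptions, not confirmed from the source.
print(resp.headers.get("x-litellm-key-max-budget"))
print(resp.headers.get("x-litellm-key-spend"))
```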
* test: add unit testing for new proxy utils
Ensures new key budget is correctly handled
* docs(temporary_budget_increase.md): add doc on temporary budget increase
* fix(utils.py): remove 3.5 from response_format check for now
not all azure 3.5 models support response_format
* fix(user_api_key_auth.py): return valid user api key auth object on all paths
* build(model_prices_and_context_window.json): add azure o1 pricing
Closes https://github.com/BerriAI/litellm/issues/7712
* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)
* Allows overriding keep_alive time in ollama (#7079)
* Allows overriding keep_alive time in ollama
* Also adds to ollama_chat
* Adds some info on the docs about this parameter
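A sketch of the new parameter in use, assuming it is accepted as a plain kwarg on `completion` for ollama routes:

```python
# Hedged sketch: override Ollama's keep_alive from litellm. Assumes a local
# Ollama server and that keep_alive is forwarded to Ollama's options.
import litellm

response = litellm.completion(
    model="ollama/llama2",
    messages=[{"role": "user", "content": "hello"}],
    keep_alive="10m",  # keep the model loaded for 10 minutes after this call
)
print(response.choices[0].message.content)
```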
* fix: together ai warning (#7688)
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
* fix(proxy_server.py): handle config containing thread locked objects when using get_config_state
* fix(proxy_server.py): add exception to debug
* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1
---------
Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
* fix(vertex_ai/gemini/transformation.py): handle 'http://' in gemini process url
* refactor(router.py): refactor '_prompt_management_factory' to use logging obj get_chat_completion logic
deduplicates code
* fix(litellm_logging.py): update 'get_chat_completion_prompt' to update logging object messages
* docs(prompt_management.md): update prompt management to be in beta
given feedback - this still needs to be revised (e.g. passing in user message, not ignoring)
* refactor(prompt_management_base.py): introduce base class for prompt management
allows consistent behaviour across prompt management integrations
* feat(prompt_management_base.py): support adding client message to template message + refactor langfuse prompt management to use prompt management base
* fix(litellm_logging.py): log prompt id + prompt variables to langfuse if set
allows tracking what prompt was used for what purpose
* feat(litellm_logging.py): log prompt management metadata in standard logging payload + use in langfuse
allows logging prompt id / prompt variables to langfuse
* test: fix test
* fix(router.py): cleanup unused imports
* fix: fix linting error
* fix: fix trace param typing
* fix: fix linting errors
* fix: fix code qa check
- Ensured that `before` and `after` parameters are only passed when provided to avoid AttributeError.
- Implemented safe access using default values for `before` and `after` to prevent missing attribute issues.
- Added consistent handling of `order` and `limit` to improve flexibility and robustness in API calls.
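An illustrative version of that pattern (generic names, not the actual code):

```python
# Only pass optional cursor params when the caller provided them, so
# downstream code never touches an attribute that was never set.
def build_list_params(order="desc", limit=20, before=None, after=None) -> dict:
    params = {"order": order, "limit": limit}
    if before is not None:
        params["before"] = before
    if after is not None:
        params["after"] = after
    return params
```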
* fix(redact_messages.py): fix message redaction so non-model-response input is returned as a dictionary
fixes issue with otel logging when message redaction is enabled
* fix(proxy_server.py): fix langfuse key leak in exception string
* test: fix test
* test: fix test
* test: fix tests
* fix(internal_user_endpoints.py): fix team list sort - handle team_alias being either set or None
* fix(key_management_endpoints.py): allow team admin to create key for member via admin ui
Fixes https://github.com/BerriAI/litellm/issues/7482
* fix(proxy_server.py): allow querying info on specific model group via `/model_group/info`
allows client-side user to get model info from proxy
* fix(proxy_server.py): add docstring on `/model_group/info` showing how to filter by model name
* test(test_proxy_utils.py): add unit test for returning model group info filtered
* fix(proxy_server.py): fix query param
* fix(test_Get_model_info.py): handle no whitelisted bedrock models
* fix(langfuse_prompt_management.py): migrate dynamic logging to langfuse custom logger compatible class
* fix(langfuse_prompt_management.py): support failure callback logging to langfuse as well
* feat(proxy_server.py): support setting custom tokenizer on config.yaml
Allows customizing value for `/utils/token_counter`
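The custom tokenizer applies to the proxy's token counter; a sketch of calling it (the request body shape is an assumption):

```python
# Sketch: count tokens via the proxy; the configured custom tokenizer is
# applied server-side. Payload shape is an assumption.
import requests

resp = requests.post(
    "http://localhost:4000/utils/token_counter",
    headers={"Authorization": "Bearer sk-1234"},
    json={
        "model": "my-model",
        "messages": [{"role": "user", "content": "count my tokens"}],
    },
)
print(resp.json())
```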
* fix(proxy_server.py): fix linting errors
* test: skip if file not found
* style: cleanup unused import
* docs(configs.md): add docs on setting custom tokenizer
* feat(proxy/utils.py): get associated litellm budget from db in combined_view for key
allows user to create rate limit tiers and associate those to keys
* feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set
allows rate limit tiers to be easily applied to keys
* docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers
make feature discoverable
* feat(key_management_endpoints.py): return litellm_budget_table value in key generate
make it easy for user to know associated budget on key creation
* fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate`
* docs(key_management_endpoints.py): document budget_id usage
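Putting the tier flow together, a hedged sketch (payloads follow the commits above but are otherwise assumptions):

```python
# Create a budget "tier", then attach it to a new key via budget_id.
import requests

BASE = "http://localhost:4000"
HDRS = {"Authorization": "Bearer sk-master"}

requests.post(
    f"{BASE}/budget/new",
    headers=HDRS,
    json={"budget_id": "free-tier", "tpm_limit": 1000, "rpm_limit": 10},
)
key = requests.post(
    f"{BASE}/key/generate",
    headers=HDRS,
    json={"budget_id": "free-tier"},  # key inherits the tier's limits
).json()
print(key)
```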
* refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it
* docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs
* fix(customer_endpoints.py): use new pydantic obj name
* docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm
* Litellm dev 12 26 2024 p2 (#7432)
* (Feat) Add logging for `POST v1/fine_tuning/jobs` (#7426)
* init commit ft jobs logging
* add ft logging
* add logging for FineTuningJob
* simple FT Job create test
* (docs) - show all supported Azure OpenAI endpoints in overview (#7428)
* azure batches
* update doc
* docs azure endpoints
* docs endpoints on azure
* docs azure batches api
* docs azure batches api
* fix(key_management_endpoints.py): fix key update to actually work
* test(test_key_management.py): add e2e test asserting ui key update call works
* fix: proxy/_types - fix linting errors
* test: update test
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix: test
* fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers
* fix: fix linting errors
* test: fix test
* fix: remove unused import
* test: update test
* docs(customer_endpoints.py): document new model_max_budget param
* test: specify unique key alias
* docs(budget_management_endpoints.py): document new model_max_budget param
* test: fix test
* test: fix tests
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
* fix(utils.py): default custom_llm_provider=None for 'supports_response_schema'
Closes https://github.com/BerriAI/litellm/issues/7397
* refactor(langfuse/): call langfuse logger inside customlogger compatible langfuse class, refactor langfuse logger to use verbose_logger.debug instead of print_verbose
* refactor(litellm_pre_call_utils.py): move config based team callbacks inside dynamic team callback logic
enables simpler unit testing for config-based team callbacks
* fix(proxy/_types.py): handle teamcallbackmetadata - none values
drop None values if present; if all are None, use a default dict to avoid downstream errors
* test(test_proxy_utils.py): add unit test preventing future issues - asserts team_id in config state not popped off across calls
Fixes https://github.com/BerriAI/litellm/issues/6787
* fix(langfuse_prompt_management.py): add success + failure logging event support
* fix: fix linting error
* test: fix test
* test: fix test
* test: override o1 prompt caching - openai currently not working
* test: fix test
* feat(guardrails_endpoint.py): new `/guardrails/list` endpoint
Allow users to view what the available guardrails are
* docs: document new `/guardrails/list` endpoint
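Sketch of querying the new endpoint:

```python
# List the guardrails configured on the proxy.
import requests

resp = requests.get(
    "http://localhost:4000/guardrails/list",
    headers={"Authorization": "Bearer sk-1234"},
)
print(resp.json())  # response shape is an assumption
```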
* docs(enterprise.md): update docs
* fix(openai/transcription/handler.py): support cost tracking on vtt + srt formats
* fix(openai/transcriptions/handler.py): default to 'verbose_json' response format if 'text' or 'json' response_format received. ensures 'duration' param is received for all audio transcription requests
* fix: fix linting errors
* fix: remove unused import
* fix(proxy_server.py): enforce team id based model add only works if enterprise user
* fix(auth_checks.py): enforce common_checks can only be imported by user_api_key_auth.py
* fix(auth_checks.py): insert 'not premium user' error message when a common-checks run fails
* ui fix - allow searching model list + fix bug on filtering
* qa fix - use correct provider name for azure_text
* ui wrap content onto next line
* ui fix - allow selecting current UI session when logging in
* ui session budgets
* fix(proxy_server.py): only update k,v pair if v is not empty/null
Fixes https://github.com/BerriAI/litellm/issues/6787
* test(test_router.py): cleanup duplicate calls
* test: add new test stream options drop params test
* test: update optional params / stream options test to test for vertex ai mistral route specifically
Addresses https://github.com/BerriAI/litellm/issues/7309
* fix(proxy_server.py): fix linting errors
* fix: fix linting errors
* fix(proxy_server.py): pass model access groups to get_key/get_team models
allows end user to see actual models they have access to, instead of default models
* fix(auth_checks.py): fix linting errors
* fix: fix linting errors
* fix(main.py): fix retries being multiplied when using openai sdk
Closes https://github.com/BerriAI/litellm/pull/7130
* docs(prompt_management.md): add langfuse prompt management doc
* feat(team_endpoints.py): allow teams to add their own models
Enables teams to call their own finetuned models via the proxy
* test: add better enforcement check testing for `/model/new` now that teams can add their own models
* docs(team_model_add.md): tutorial for allowing teams to add their own models
* test: fix test
* fix(azure/): support passing headers to azure openai endpoints
Fixes https://github.com/BerriAI/litellm/issues/6217
* fix(utils.py): move default tokenizer to just openai
the HF tokenizer makes network calls when fetching the tokenizer - this slows down execution
* fix(router.py): fix pattern matching router - add generic "*" to it as well
Fixes issue where generic "*" model access group wouldn't show up
* fix(pattern_match_deployments.py): match to more specific pattern
allows setting generic wildcard model access group and excluding specific models more easily
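An illustrative (not the actual router code) take on "more specific pattern wins":

```python
# Prefer the pattern with the most literal characters, so "openai/gpt-4*"
# beats a catch-all "*" when both match.
import fnmatch
from typing import List, Optional

def most_specific_match(model: str, patterns: List[str]) -> Optional[str]:
    matches = [p for p in patterns if fnmatch.fnmatch(model, p)]
    return max(matches, key=lambda p: len(p.replace("*", "")), default=None)

print(most_specific_match("openai/gpt-4o", ["*", "openai/gpt-4*"]))
# -> "openai/gpt-4*"
```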
* fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty
don't delete all router models because of an empty list
Fixes https://github.com/BerriAI/litellm/issues/7196
* fix(anthropic/): fix handling response_format for anthropic messages with anthropic api
* fix(fireworks_ai/): support passing response_format + tool call in same message
Addresses https://github.com/BerriAI/litellm/issues/7135
* Revert "fix(fireworks_ai/): support passing response_format + tool call in same message"
This reverts commit 6a30dc6929.
* test: fix test
* fix(replicate/): fix replicate default retry/polling logic
* test: add unit testing for router pattern matching
* test: update test to use default oai tokenizer
* test: mark flaky test
* test: skip flaky test
* fix(edit_budget_modal.tsx): call `/budget/update` endpoint instead of `/budget/new`
allows updating existing budget on ui
* fix(user_api_key_auth.py): support cost tracking for end user via jwt field
* fix(presidio.py): support pii masking on sync logging callbacks
enables masking before logging to langfuse
* feat(utils.py): support retry policy logic inside '.completion()'
Fixes https://github.com/BerriAI/litellm/issues/6623
* fix(utils.py): support retry by retry policy on async logic as well
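A hedged sketch of a retry policy on `.completion()`; RetryPolicy is litellm's router type, but treat the import path, field names, and kwarg support as assumptions:

```python
# Per-exception-type retry counts on a direct completion call.
import litellm
from litellm.types.router import RetryPolicy  # path is an assumption

resp = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    retry_policy=RetryPolicy(
        RateLimitErrorRetries=3,               # retry rate limits up to 3x
        TimeoutErrorRetries=2,
        ContentPolicyViolationErrorRetries=0,  # fail fast on policy blocks
    ),
)
```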
* fix(handle_jwt.py): set default leeway value
* test: fix test to handle jwt audience claim
* fix(cost_calculator.py): move to using `.get_model_info()` for cost per token calculations
ensures cost tracking is reliable - handles edge cases of parsing model cost map
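`get_model_info()` already parses the model cost map, so cost math can lean on it; a short sketch:

```python
# Derive request cost from the model cost map via get_model_info().
import litellm

info = litellm.get_model_info("gpt-4o")
cost = (
    1000 * info["input_cost_per_token"]    # 1k prompt tokens
    + 500 * info["output_cost_per_token"]  # 500 completion tokens
)
print(f"${cost:.6f}")
```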
* build(model_prices_and_context_window.json): add 'supports_response_schema' for select tgai models
Fixes https://github.com/BerriAI/litellm/pull/7037#discussion_r1872157329
* build(model_prices_and_context_window.json): remove 'pdf input' and 'vision' support from nova micro in model map
Bedrock docs indicate no support for micro - https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference-supported-models-features.html
* fix(converse_transformation.py): support amazon nova tool use
* fix(opentelemetry): Add missing LLM request type attribute to spans (#7041)
* feat(opentelemetry): add LLM request type attribute to spans
* lint
* fix: curl usage (#7038)
curl -d, --data <data> is lowercase d
curl -D, --dump-header <filename> is uppercase D
references:
https://curl.se/docs/manpage.html#-d
https://curl.se/docs/manpage.html#-D
* fix(spend_tracking.py): handle empty 'id' in model response - when creating spend log
Fixes https://github.com/BerriAI/litellm/issues/7023
* fix(streaming_chunk_builder.py): handle initial id being empty string
Fixes https://github.com/BerriAI/litellm/issues/7023
* fix(anthropic_passthrough_logging_handler.py): add end user cost tracking for anthropic pass through endpoint
* docs(pass_through/): refactor docs location + add table on supported features for pass through endpoints
* feat(anthropic_passthrough_logging_handler.py): support end user cost tracking via anthropic sdk
* docs(anthropic_completion.md): add docs on passing end user param for cost tracking on anthropic sdk
* fix(litellm_logging.py): use standard logging payload if present in kwargs
prevent datadog logging error for pass through endpoints
* docs(bedrock.md): add rerank api usage example to docs
* bugfix/change dummy tool name format (#7053)
* fix viewing keys (#7042)
* ui new build
* build(model_prices_and_context_window.json): add bedrock region models to model cost map (#7044)
* bye (#6982)
* (fix) litellm router.aspeech (#6962)
* doc Migrating Databases
* fix aspeech on router
* test_audio_speech_router
* test_audio_speech_router
* docs show supported providers on batches api doc
* change dummy tool name format
---------
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: Krish Dholakia <krrishdholakia@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
* fix: fix linting errors
* test: update test
* fix(litellm_logging.py): fix pass through check
* fix(test_otel_logging.py): fix test
* fix(cost_calculator.py): update handling for cost per second
* fix(cost_calculator.py): fix cost check
* test: fix test
* (fix) adding public routes when using custom header (#7045)
* get_api_key_from_custom_header
* add test_get_api_key_from_custom_header
* fix testing use 1 file for test user api key auth
* fix test user api key auth
* test_custom_api_key_header_name
* build: update ui build
---------
Co-authored-by: Doron Kopit <83537683+doronkopit5@users.noreply.github.com>
Co-authored-by: lloydchang <lloydchang@gmail.com>
Co-authored-by: hgulersen <haymigulersen@gmail.com>
Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>
Co-authored-by: yujonglee <yujonglee.dev@gmail.com>
* fix(key_management_endpoints.py): fix user-membership check when creating team key
* docs: add deprecation notice on original `/v1/messages` endpoint + add better swagger tags on pass-through endpoints
* fix(gemini/): fix image_url handling for gemini
Fixes https://github.com/BerriAI/litellm/issues/6897
* fix(teams.tsx): fix member add when role is 'user'
* fix(team_endpoints.py): /team/member_add
fix adding several new members to team
* test(test_vertex.py): remove redundant test
* test(test_proxy_server.py): fix team member add tests
* feat(pass_through_endpoints/): support logging anthropic/gemini pass through calls to langfuse/s3/etc.
* fix(utils.py): allow disabling end user cost tracking with new param
Allows proxy admin to disable cost tracking for end user - keeps prometheus metrics small
* docs(configs.md): add disable_end_user_cost_tracking reference to docs
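Assuming the flag is exposed as a module-level setting (on the proxy it would go under `litellm_settings` in config.yaml), a sketch:

```python
# Disable per-end-user cost tracking to keep Prometheus label cardinality low.
# That this is a module-level attribute is an assumption.
import litellm

litellm.disable_end_user_cost_tracking = True
```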
* feat(key_management_endpoints.py): add support for restricting access to `/key/generate` by team/proxy level role
Enables admin to restrict key creation, and assign team admins to handle distributing keys
* test(test_key_management.py): add unit testing for personal / team key restriction checks
* docs: add docs on restricting key creation
* docs(finetuned_models.md): add new guide on calling finetuned models
* docs(input.md): cleanup anthropic supported params
Closes https://github.com/BerriAI/litellm/issues/6856
* test(test_embedding.py): add test for passing extra headers via embedding
* feat(cohere/embed): pass client to async embedding
* feat(rerank.py): add `/v1/rerank` if missing for cohere base url
Closes https://github.com/BerriAI/litellm/issues/6844
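An illustrative version of that base-url fix:

```python
# Append the rerank route only when the caller's base URL lacks it.
def ensure_rerank_route(base_url: str) -> str:
    base = base_url.rstrip("/")
    if not base.endswith("/v1/rerank"):
        base = f"{base}/v1/rerank"
    return base

print(ensure_rerank_route("https://api.cohere.ai"))            # route appended
print(ensure_rerank_route("https://api.cohere.ai/v1/rerank"))  # unchanged
```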
* fix(main.py): pass extra_headers param to openai
Fixes https://github.com/BerriAI/litellm/issues/6836
* fix(litellm_logging.py): don't disable global callbacks when dynamic callbacks are set
Fixes issue where global callbacks - e.g. prometheus - were overridden when langfuse was set dynamically
* fix(handler.py): fix linting error
* fix: fix typing
* build: add conftest to proxy_admin_ui_tests/
* test: fix test
* fix: fix linting errors
* test: fix test
* fix: fix pass through testing
* add SecretManager to httpxSpecialProvider
* fix importing AWSSecretsManagerV2
* add unit testing for writing keys to AWS secret manager
* use KeyManagementEventHooks for key/generated events
* use event hooks for key management endpoints
* working AWSSecretsManagerV2
* fix write secret to AWS secret manager on /key/generate
* fix KeyManagementSettings
* use tasks for key management hooks
* add async_delete_secret
* add test for async_delete_secret
* use _delete_virtual_keys_from_secret_manager
* fix test secret manager
* test_key_generate_with_secret_manager_call
* fix check for key_management_settings
* sync_read_secret
* test_aws_secret_manager
* fix sync_read_secret
* use helper to check when _should_read_secret_from_secret_manager
* test_get_secret_with_access_mode
* test - handle EOL model claude-2, use claude-2.1 instead
* docs AWS secret manager
* fix test_read_nonexistent_secret
* fix test_supports_response_schema
* ci/cd run again
* fix(caching): convert arg to equivalent kwargs in llm caching handler
prevent unexpected errors
* fix(caching_handler.py): don't pass args to caching
* fix(caching): remove all *args from caching.py
* fix(caching): consistent function signatures + abc method
* test(caching_unit_tests.py): add unit tests for llm caching
ensures coverage for common caching scenarios across different implementations
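An illustrative (not the actual handler) version of the args to kwargs normalization:

```python
# Bind positional args to parameter names so every caching path sees one
# canonical kwargs form.
import inspect

def normalize_to_kwargs(func, args, kwargs) -> dict:
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)

def completion(model, messages, temperature=0.0):
    ...

print(normalize_to_kwargs(completion, ("gpt-4o", []), {"temperature": 0.2}))
# {'model': 'gpt-4o', 'messages': [], 'temperature': 0.2}
```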
* refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one
* fix(router.py): drop redis password requirement
* fix(proxy_server.py): fix faulty slack alerting check
* fix(langfuse.py): avoid copying functions/thread lock objects in metadata
fixes metadata copy error when a parent OTEL span is in metadata
* test: update test
* log error on prometheus service failure hook
* use a more accurate name for the wrapper that handles logging db metrics
* fix log_db_metrics
* test_log_db_metrics_failure_error_types
* fix linting
* fix auth checks