litellm-mirror

mirror of https://github.com/BerriAI/litellm.git synced 2025-04-27 03:34:10 +00:00

Author	SHA1	Message	Date
Krish Dholakia	fbdd88d79c	test: initial test to enforce all functions in user_api_key_auth.py h… (#7797 ) * test: initial test to enforce all functions in user_api_key_auth.py have direct testing * test(test_user_api_key_auth.py): add is_allowed_route unit test * test(test_user_api_key_auth.py): add more tests * test(test_user_api_key_auth.py): add complete testing coverage for all functions in `user_api_key_auth.py` * test(test_db_schema_changes.py): add a unit test to ensure all db schema changes are backwards compatible gives user an easy rollback path * test: fix schema compatibility test filepath * test: fix test	2025-01-15 21:52:45 -08:00
Krish Dholakia	543655adc7	Litellm dev 01 14 2025 p2 (#7772 ) * feat(pass_through_endpoints.py): fix anthropic end user cost tracking * fix(anthropic/chat/transformation.py): use returned provider model for anthropic handles anthropic `-latest` tag in request body throwing cost calculation errors ensures we can be accurate in our model cost tracking * feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing * test: update test to use assumption that user_api_key_dict can get anthropic user id * test: fix test * fix: fix test * fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block can't guarantee user api key dict always has end user id - too many code paths * fix(user_api_key_auth.py): this allows end user id from request body to always be read and set in auth object * fix(auth_check.py): fix linting error * test: fix auth check * fix(auth_utils.py): fix get end user id to handle metadata = None	2025-01-15 21:34:50 -08:00
Krish Dholakia	d7a13ad561	Support temporary budget increases on keys (#7754 ) * fix(gpt_transformation.py): fix response_format translation check for 4o models Fixes https://github.com/BerriAI/litellm/issues/7616 * feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields Allow proxy admin to grant temporary budget increases to keys * fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together * feat(user_api_key_auth.py): initial working temp budget increase logic ensures key budget exceeded error checks for temp budget in key metadata * feat(proxy_server.py): return the key max budget and key spend in the response headers Allows clientside user to know their remaining limits * test: add unit testing for new proxy utils Ensures new key budget is correctly handled * docs(temporary_budget_increase.md): add doc on temporary budget increase * fix(utils.py): remove 3.5 from response_format check for now not all azure 3.5 models support response_format * fix(user_api_key_auth.py): return valid user api key auth object on all paths	2025-01-14 17:03:11 -08:00
Ishaan Jaff	76586f175d	latency fix _cache_key_object (#7676 )	2025-01-10 13:59:26 -08:00
Krish Dholakia	b77832a793	Litellm dev 01 08 2025 p1 (#7640 ) * feat(ui_sso.py): support reading team ids from sso token * feat(ui_sso.py): working upsert sso user teams membership in litellm - if team exists Adds user to relevant teams, if user is part of teams and team exists on litellm * fix(ui_sso.py): safely handle add team member task * build(ui/): support setting team id when creating team on UI * build(ui/): teams.tsx allow setting team id on ui * build(circle_ci/requirements.txt): add fastapi-sso to ci/cd testing * fix: fix linting errors	2025-01-08 22:08:20 -08:00
Ishaan Jaff	818f5b0113	fix is llm api route check (#7631 )	2025-01-08 18:45:59 -08:00
Ishaan Jaff	a4007e3294	(Feat) soft budget alerts on keys (#7623 ) * class WebhookEvent(CallInfo): Add * handle soft budget alerts * handle soft budget * fix budget alerts * fix CallInfo * fix _get_user_info_str * test_soft_budget_alerts * test_soft_budget_alert	2025-01-07 21:36:34 -08:00
Ishaan Jaff	1bea935889	fix _return_user_api_key_auth_obj (#7591 )	2025-01-06 16:43:14 -08:00
Krish Dholakia	b52beffeb0	LiteLLM Minor Fixes & Improvements (12/27/2024) - p1 (#7448 ) * feat(main.py): mock_response() - support 'litellm.ContextWindowExceededError' in mock response enabled quicker router/fallback/proxy debug on context window errors * feat(exception_mapping_utils.py): extract special litellm errors from error str if calling `litellm_proxy/` as provider Closes https://github.com/BerriAI/litellm/issues/7259 * fix(user_api_key_auth.py): specify 'Received Proxy Server Request' is span kind server Closes https://github.com/BerriAI/litellm/issues/7298	2024-12-27 19:04:39 -08:00
Krish Dholakia	d6a2beb342	Support budget/rate limit tiers for keys (#7429 ) * feat(proxy/utils.py): get associated litellm budget from db in combined_view for key allows user to create rate limit tiers and associate those to keys * feat(proxy/_types.py): update the value of key-level tpm/rpm/model max budget metrics with the associated budget table values if set allows rate limit tiers to be easily applied to keys * docs(rate_limit_tiers.md): add doc on setting rate limit / budget tiers make feature discoverable * feat(key_management_endpoints.py): return litellm_budget_table value in key generate make it easy for user to know associated budget on key creation * fix(key_management_endpoints.py): document 'budget_id' param in `/key/generate` * docs(key_management_endpoints.py): document budget_id usage * refactor(budget_management_endpoints.py): refactor budget endpoints into separate file - makes it easier to run documentation testing against it * docs(test_api_docs.py): add budget endpoints to ci/cd doc test + add missing param info to docs * fix(customer_endpoints.py): use new pydantic obj name * docs(user_management_heirarchy.md): add simple doc explaining teams/keys/org/users on litellm * Litellm dev 12 26 2024 p2 (#7432) * (Feat) Add logging for `POST v1/fine_tuning/jobs` (#7426) * init commit ft jobs logging * add ft logging * add logging for FineTuningJob * simple FT Job create test * (docs) - show all supported Azure OpenAI endpoints in overview (#7428) * azure batches * update doc * docs azure endpoints * docs endpoints on azure * docs azure batches api * docs azure batches api * fix(key_management_endpoints.py): fix key update to actually work * test(test_key_management.py): add e2e test asserting ui key update call works * fix: proxy/_types - fix linting erros * test: update test --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> * fix: test * fix(parallel_request_limiter.py): enforce tpm/rpm limits on key from tiers * fix: fix linting 
errors * test: fix test * fix: remove unused import * test: update test * docs(customer_endpoints.py): document new model_max_budget param * test: specify unique key alias * docs(budget_management_endpoints.py): document new model_max_budget param * test: fix test * test: fix tests --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com>	2024-12-26 19:05:27 -08:00
Ishaan Jaff	0d2fac7182	fix if "/openai/" in route:	2024-12-25 21:11:08 -08:00
Krish Dholakia	ae7f54498f	Litellm enforce enterprise features (#7357 ) * fix(proxy_server.py): enforce team id based model add only works if enterprise user * fix(auth_checks.py): enforce common_checks can only be imported by user_api_key_auth.py * fix(auth_checks.py): insert not premium user error message on failed common checks run	2024-12-21 19:14:13 -08:00
Ishaan Jaff	49ea75830a	(Admin UI) correctly render provider name in /models with wildcard routing (#7349 ) * ui fix - allow searching model list + fix bug on filtering * qa fix - use correct provider name for azure_text * ui wrap content onto next line * ui fix - allow selecting current UI session when logging in * ui session budgets * ui show provider models on wildcard models * test provider name appears in model list * ui fix auto scroll on chat ui tab	2024-12-21 14:19:12 -08:00
Krish Dholakia	61b4c41c3c	Litellm dev 12 20 2024 p3 (#7339 ) * fix(proxy_track_cost_callback.py): log to db if only end user param given * fix: allows for jwt-auth based end user id spend tracking to work * fix(utils.py): fix 'get_end_user_id_for_cost_tracking' to use 'user_api_key_end_user_id' more stable - works with jwt-auth based end user tracking as well * test(test_jwt.py): add e2e unit test to confirm end user cost tracking works for spend logs * test: update test to use end_user api key hash param * fix(langfuse.py): support end user cost tracking via jwt auth + langfuse logs end user to langfuse if decoded from jwt token * fix: fix linting errors * test: fix test * test: fix test * fix: fix end user id extraction * fix: run test earlier	2024-12-20 21:13:32 -08:00
Ishaan Jaff	62a1cdec47	(code quality) run ruff rule to ban unused imports (#7313 ) * remove unused imports * fix AmazonConverseConfig * fix test * fix import * ruff check fixes * test fixes * fix testing * fix imports	2024-12-19 12:33:42 -08:00
Ishaan Jaff	6220e17ebf	(feat proxy) v2 - model max budgets (#7302 ) * clean up unused code * add _PROXY_VirtualKeyModelMaxBudgetLimiter * adjust type imports * working _PROXY_VirtualKeyModelMaxBudgetLimiter * fix user_api_key_model_max_budget * fix user_api_key_model_max_budget * update naming * update naming * fix changes to RouterBudgetLimiting * test_call_with_key_over_model_budget * test_call_with_key_over_model_budget * handle _get_request_model_budget_config * e2e test for test_call_with_key_over_model_budget * clean up test * run ci/cd again * add validate_model_max_budget * docs fix * update doc * add e2e testing for _PROXY_VirtualKeyModelMaxBudgetLimiter * test_unit_test_max_model_budget_limiter.py	2024-12-18 19:42:46 -08:00
Krish Dholakia	e7918f097b	fix(proxy_server.py): pass model access groups to get_key/get_team mo… (#7281 ) * fix(proxy_server.py): pass model access groups to get_key/get_team models allows end user to see actual models they have access to, instead of default models * fix(auth_checks.py): fix linting errors * fix: fix linting errors	2024-12-18 09:33:33 -08:00
Krish Dholakia	edbf5eeeb3	Litellm remove circular imports (#7232 ) * fix(utils.py): initial commit to remove circular imports - moves llmproviders to utils.py * fix(router.py): fix 'litellm.EmbeddingResponse' import from router.py * refactor: fix litellm.ModelResponse import on pass through endpoints * refactor(litellm_logging.py): fix circular import for custom callbacks literal * fix(factory.py): fix circular imports inside prompt factory * fix(cost_calculator.py): fix circular import for 'litellm.Usage' * fix(proxy_server.py): fix potential circular import with `litellm.Router` * fix(proxy/utils.py): fix potential circular import in `litellm.Router` * fix: remove circular imports in 'auth_checks' and 'guardrails/' * fix(prompt_injection_detection.py): fix router import * fix(vertex_passthrough_logging_handler.py): fix potential circular imports in vertex pass through * fix(anthropic_pass_through_logging_handler.py): fix potential circular imports * fix(slack_alerting.py-+-ollama_chat.py): fix modelresponse import * fix(base.py): fix potential circular import * fix(handler.py): fix potential circular ref in codestral + cohere handler's * fix(azure.py): fix potential circular imports * fix(gpt_transformation.py): fix modelresponse import * fix(litellm_logging.py): add logging base class - simplify typing makes it easy for other files to type check the logging obj without introducing circular imports * fix(azure_ai/embed): fix potential circular import on handler.py * fix(databricks/): fix potential circular imports in databricks/ * fix(vertex_ai/): fix potential circular imports on vertex ai embeddings * fix(vertex_ai/image_gen): fix import * fix(watsonx-+-bedrock): cleanup imports * refactor(anthropic-pass-through-+-petals): cleanup imports * refactor(huggingface/): cleanup imports * fix(ollama-+-clarifai): cleanup circular imports * fix(openai_like/): fix import * fix(openai_like/): fix embedding handler cleanup imports * refactor(openai.py): cleanup imports * fix(sagemaker/transformation.py): fix import * ci(config.yml): add circular import test to ci/cd	2024-12-14 16:28:34 -08:00
Ishaan Jaff	73dcbf8d4e	(proxy) - Auth fix, ensure re-using safe request body for checking `model` field (#7222 ) * litellm fix auth check * fix _read_request_body * test_auth_with_form_data_and_model * fix auth check * fix _read_request_body * fix _safe_get_request_headers	2024-12-14 12:01:25 -08:00
Krish Dholakia	a42f008cd0	Litellm dev 12 12 2024 (#7203 ) * fix(azure/): support passing headers to azure openai endpoints Fixes https://github.com/BerriAI/litellm/issues/6217 * fix(utils.py): move default tokenizer to just openai hf tokenizer makes network calls when trying to get the tokenizer - this slows down execution time calls * fix(router.py): fix pattern matching router - add generic "*" to it as well Fixes issue where generic "*" model access group wouldn't show up * fix(pattern_match_deployments.py): match to more specific pattern match to more specific pattern allows setting generic wildcard model access group and excluding specific models more easily * fix(proxy_server.py): fix _delete_deployment to handle base case where db_model list is empty don't delete all router models b/c of empty list Fixes https://github.com/BerriAI/litellm/issues/7196 * fix(anthropic/): fix handling response_format for anthropic messages with anthropic api * fix(fireworks_ai/): support passing response_format + tool call in same message Addresses https://github.com/BerriAI/litellm/issues/7135 * Revert "fix(fireworks_ai/): support passing response_format + tool call in same message" This reverts commit `6a30dc6929`. * test: fix test * fix(replicate/): fix replicate default retry/polling logic * test: add unit testing for router pattern matching * test: update test to use default oai tokenizer * test: mark flaky test * test: skip flaky test	2024-12-13 08:54:03 -08:00
Ishaan Jaff	b1c3e2d4ef	(feat) UI - Disable Usage Tab once SpendLogs is 1M+ Rows (#7208 ) * use utils to set proxy spend logs row count * store proxy state variables * fix check for _has_user_setup_sso * fix proxyStateVariables * fix dup code * rename getProxyUISettings * add fixes * ui emit num spend logs rows * test_proxy_server_prisma_setup * use MAX_SPENDLOG_ROWS_TO_QUERY to constants * test_get_ui_settings_spend_logs_threshold	2024-12-12 18:43:17 -08:00
Ishaan Jaff	b78eb6654d	(bug fix) SpendLogs update DB catch all possible DB errors for retrying (#7082 ) * catch DB_CONNECTION_ERROR_TYPES * fix DB retry mechanism for SpendLog updates * use DB_CONNECTION_ERROR_TYPES in auth checks * fix exp back off for writing SpendLogs * use _raise_failed_update_spend_exception to ensure errors print as NON blocking * test_update_spend_logs_multiple_batches_with_failure	2024-12-07 15:59:53 -08:00
Krish Dholakia	df3da2e5d2	Litellm dev 12 06 2024 (#7067 ) * fix(edit_budget_modal.tsx): call `/budget/update` endpoint instead of `/budget/new` allows updating existing budget on ui * fix(user_api_key_auth.py): support cost tracking for end user via jwt field * fix(presidio.py): support pii masking on sync logging callbacks enables masking before logging to langfuse * feat(utils.py): support retry policy logic inside '.completion()' Fixes https://github.com/BerriAI/litellm/issues/6623 * fix(utils.py): support retry by retry policy on async logic as well * fix(handle_jwt.py): set leeway default leeway value * test: fix test to handle jwt audience claim	2024-12-06 22:44:18 -08:00
Ishaan Jaff	56956fd6e7	(fix) adding public routes when using custom header (#7045 ) * get_api_key_from_custom_header * add test_get_api_key_from_custom_header * fix testing use 1 file for test user api key auth * fix test user api key auth * test_custom_api_key_header_name	2024-12-06 14:17:10 -08:00
Krish Dholakia	a392bd9772	fix(key_management_endpoints.py): override metadata field value on up… (#7008 ) * fix(key_management_endpoints.py): override metadata field value on update allow user to override tags * feat(__init__.py): expose new disable_end_user_cost_tracking_prometheus_only metric allow disabling end user cost tracking on prometheus - fixes cardinality issue * fix(litellm_pre_call_utils.py): add key/team level enforced params Fixes https://github.com/BerriAI/litellm/issues/6652 * fix(key_management_endpoints.py): allow user to pass in `enforced_params` as a top level param on /key/generate and /key/update * docs(enterprise.md): add docs on enforcing required params for llm requests * Add support of Galadriel API (#7005) * fix(router.py): robust retry after handling set retry after time to 0 if >0 healthy deployments. handle base case = 1 deployment * test(test_router.py): fix test * feat(bedrock/): add support for 'nova' models also adds explicit 'converse/' route for simpler routing * fix: fix 'supports_pdf_input' return if model supports pdf input on get_model_info * feat(converse_transformation.py): support bedrock pdf input * docs(document_understanding.md): add document understanding to docs * fix(litellm_pre_call_utils.py): fix linting error * fix(init.py): fix passing of bedrock converse models * feat(bedrock/converse): support 'response_format={"type": "json_object"}' * fix(converse_handler.py): fix linting error * fix(base_llm_unit_tests.py): fix test * fix: fix test * test: fix test * test: fix test * test: remove duplicate test --------- Co-authored-by: h4n0 <4738254+h4n0@users.noreply.github.com>	2024-12-03 23:03:50 -08:00
Ishaan Jaff	93c419868e	(fix) allow gracefully handling DB connection errors on proxy (#7017 ) * fix _handle_failed_db_connection_for_get_key_object * _handle_failed_db_connection_for_get_key_object * test_auth_not_connected_to_db	2024-12-03 19:48:51 -08:00
Ishaan Jaff	204d83b3d1	(fix) logging Auth errors on datadog (#6995 ) * fix get_standard_logging_object_payload * fix async_post_call_failure_hook * fix post_call_failure_hook * fix change * fix _is_proxy_only_error * fix async_post_call_failure_hook * fix getting request body * remove redundant code * use a well named original function name for auth errors * fix logging auth fails on DD * fix using request body * use helper for _handle_logging_proxy_only_error	2024-12-02 23:01:21 -08:00
Krish Dholakia	3766d5dc6f	LiteLLM Minor Fixes & Improvements (11/29/2024) (#6965 ) * fix(factory.py): ensure tool call converts image url Fixes https://github.com/BerriAI/litellm/issues/6953 * fix(transformation.py): support mp4 + pdf url's for vertex ai Fixes https://github.com/BerriAI/litellm/issues/6936 * fix(http_handler.py): mask gemini api key in error logs Fixes https://github.com/BerriAI/litellm/issues/6963 * docs(prometheus.md): update prometheus FAQs * feat(auth_checks.py): ensure specific model access > wildcard model access if wildcard model is in access group, but specific model is not - deny access * fix(auth_checks.py): handle auth checks for team based model access groups handles scenario where model access group used for wildcard models * fix(internal_user_endpoints.py): support adding guardrails on `/user/update` Fixes https://github.com/BerriAI/litellm/issues/6942 * fix(key_management_endpoints.py): fix prepare_metadata_fields helper * fix: fix tests * build(requirements.txt): bump openai dep version fixes proxies argument * test: fix tests * fix(http_handler.py): fix error message masking * fix(bedrock_guardrails.py): pass in prepped data * test: fix test * test: fix nvidia nim test * fix(http_handler.py): return original response headers * fix: revert maskedhttpstatuserror * test: update tests * test: cleanup test * fix(key_management_endpoints.py): fix metadata field update logic * fix(key_management_endpoints.py): maintain initial order of guardrails in key update * fix(key_management_endpoints.py): handle prepare metadata * fix: fix linting errors * fix: fix linting errors * fix: fix linting errors * fix: fix key management errors * fix(key_management_endpoints.py): update metadata * test: update test * refactor: add more debug statements * test: skip flaky test * test: fix test * fix: fix test * fix: fix update metadata logic * fix: fix test * ci(config.yml): change db url for e2e ui testing	2024-12-01 05:24:11 -08:00
Krish Dholakia	436d75260f	LiteLLM Minor Fixes & Improvements (11/27/2024) (#6943 ) * fix(http_parsing_utils.py): remove `ast.literal_eval()` from http utils Security fix - https://huntr.com/bounties/96a32812-213c-4819-ba4e-36143d35e95b?token=bf414bbd77f8b346556e64ab2dd9301ea44339910877ea50401c76f977e36cdd78272f5fb4ca852a88a7e832828aae1192df98680544ee24aa98f3cf6980d8bab641a66b7ccbc02c0e7d4ddba2db4dbe7318889dc0098d8db2d639f345f574159814627bb084563bad472e2f990f825bff0878a9e281e72c88b4bc5884d637d186c0d67c9987c57c3f0caf395aff07b89ad2b7220d1dd7d1b427fd2260b5f01090efce5250f8b56ea2c0ec19916c24b23825d85ce119911275944c840a1340d69e23ca6a462da610 * fix(converse/transformation.py): support bedrock apac cross region inference Fixes https://github.com/BerriAI/litellm/issues/6905 * fix(user_api_key_auth.py): add auth check for websocket endpoint Fixes https://github.com/BerriAI/litellm/issues/6926 * fix(user_api_key_auth.py): use `model` from query param * fix: fix linting error * test: run flaky tests first	2024-11-28 00:32:46 +05:30
Ishaan Jaff	c3ac98f992	(feat) log proxy auth errors on datadog (#6931 ) * add new dd type for auth errors * add async_log_proxy_authentication_errors * fix comment * use async_log_proxy_authentication_errors * test_datadog_post_call_failure_hook * test_async_log_proxy_authentication_errors	2024-11-26 20:26:57 -08:00
Krish Dholakia	4019ca38b4	fix(key_management_endpoints.py): fix user-membership check when creating team key (#6890 ) * fix(key_management_endpoints.py): fix user-membership check when creating team key * docs: add deprecation notice on original `/v1/messages` endpoint + add better swagger tags on pass-through endpoints * fix(gemini/): fix image_url handling for gemini Fixes https://github.com/BerriAI/litellm/issues/6897 * fix(teams.tsx): fix member add when role is 'user' * fix(team_endpoints.py): /team/member_add fix adding several new members to team * test(test_vertex.py): remove redundant test * test(test_proxy_server.py): fix team member add tests	2024-11-26 14:19:24 +05:30
Ishaan Jaff	2bb2f7b1e1	(feat) Add support for using @google/generative-ai JS with LiteLLM Proxy (#6899 ) * feat - allow using gemini js SDK with LiteLLM * add auth for gemini_proxy_route * basic local test for js * test cost tagging gemini js requests * add js sdk test for gemini with litellm * add docs on gemini JS SDK * run node.js tests * fix google ai studio tests * fix vertex js spend test	2024-11-25 13:13:03 -08:00
Ishaan Jaff	9ef254ff35	(fix) passthrough - allow internal users to access /anthropic (#6843 ) * fix /anthropic/ * test llm_passthrough_router * fix test_gemini_pass_through_endpoint	2024-11-21 11:46:50 -08:00
Krish Dholakia	fe1da228f4	Litellm lm studio embedding params (#6746 ) * fix(ollama.py): fix get model info request Fixes https://github.com/BerriAI/litellm/issues/6703 * feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param * docs(anthropic.md): document all supported openai params for anthropic * test: fix tests * fix: fix tests * feat(jina_ai/): add rerank support Closes https://github.com/BerriAI/litellm/issues/6691 * test: handle service unavailable error * fix(handler.py): refactor together ai rerank call * test: update test to handle overloaded error * test: fix test * Litellm router trace (#6742) * feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks * feat(router.py): log trace id across retry/fallback logic allows grouping llm logs for the same request * test: fix tests * fix: fix test * fix(transformation.py): only set non-none stop_sequences * Litellm router disable fallbacks (#6743) * bump: version 1.52.6 → 1.52.7 * feat(router.py): enable dynamically disabling fallbacks Allows for enabling/disabling fallbacks per key * feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key * test: fix test * fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error * fix(lm_studio/embed): support translating lm studio optional params ' * feat(auth_checks.py): fix auth check inside route - `/team/list` Fixes regression where non-admin w/ user_id=None able to query all teams * docs proxy_budget_rescheduler_min_time * helm run DISABLE_SCHEMA_UPDATE * docs helm pre sync hook * fix migration job.yaml * fix DATABASE_URL * use existing spec for migrations job * fix yaml on migrations job * fix migration job * update doc on pre sync hook * fix migrations-job.yaml * fix migration job * fix prisma migration * test - handle eol model claude-2, use claude-2.1 instead * (docs) add instructions on how to contribute to docker image * Update code blocks 
huggingface.md (#6737) * Update prefix.md (#6734) * fix test_supports_response_schema * mark Helm PreSyn as BETA * (Feat) Add support for storing virtual keys in AWS SecretManager (#6728) * add SecretManager to httpxSpecialProvider * fix importing AWSSecretsManagerV2 * add unit testing for writing keys to AWS secret manager * use KeyManagementEventHooks for key/generated events * us event hooks for key management endpoints * working AWSSecretsManagerV2 * fix write secret to AWS secret manager on /key/generate * fix KeyManagementSettings * use tasks for key management hooks * add async_delete_secret * add test for async_delete_secret * use _delete_virtual_keys_from_secret_manager * fix test secret manager * test_key_generate_with_secret_manager_call * fix check for key_management_settings * sync_read_secret * test_aws_secret_manager * fix sync_read_secret * use helper to check when _should_read_secret_from_secret_manager * test_get_secret_with_access_mode * test - handle eol model claude-2, use claude-2.1 instead * docs AWS secret manager * fix test_read_nonexistent_secret * fix test_supports_response_schema * ci/cd run again * LiteLLM Minor Fixes & Improvement (11/14/2024) (#6730) * fix(ollama.py): fix get model info request Fixes https://github.com/BerriAI/litellm/issues/6703 * feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param * docs(anthropic.md): document all supported openai params for anthropic * test: fix tests * fix: fix tests * feat(jina_ai/): add rerank support Closes https://github.com/BerriAI/litellm/issues/6691 * test: handle service unavailable error * fix(handler.py): refactor together ai rerank call * test: update test to handle overloaded error * test: fix test * Litellm router trace (#6742) * feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks * feat(router.py): log trace id across retry/fallback logic allows grouping llm logs for the same request * test: fix tests * 
fix: fix test * fix(transformation.py): only set non-none stop_sequences * Litellm router disable fallbacks (#6743) * bump: version 1.52.6 → 1.52.7 * feat(router.py): enable dynamically disabling fallbacks Allows for enabling/disabling fallbacks per key * feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key * test: fix test * fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error * test: handle gemini error * test: fix test * fix: new run * bump: version 1.52.7 → 1.52.8 * docs: add docs on jina ai rerank support * docs(reliability.md): add tutorial on disabling fallbacks per key * docs(logging.md): add 'trace_id' param to standard logging payload * (feat) add bedrock/stability.stable-image-ultra-v1:0 (#6723) * add stability.stable-image-ultra-v1:0 * add pricing for stability.stable-image-ultra-v1:0 * fix test_supports_response_schema * ci/cd run again * [Feature]: Stop swallowing up AzureOpenAi exception responses in litellm's implementation for a BadRequestError (#6745) * fix azure exceptions * test_bad_request_error_contains_httpx_response * test_bad_request_error_contains_httpx_response * use safe access to get exception response * fix get attr * [Feature]: json_schema in response support for Anthropic (#6748) * _convert_tool_response_to_message * fix ModelResponseIterator * fix test_json_response_format * test_json_response_format_stream * fix _convert_tool_response_to_message * use helper _handle_json_mode_chunk * fix _process_response * unit testing for test_convert_tool_response_to_message_no_arguments * update doc for JSON mode * fix: import audio check (#6740) * fix imagegeneration output_cost_per_image on model cost map (#6752) * (feat) Vertex AI - add support for fine tuned embedding models (#6749) * fix use fine tuned vertex embedding models * test_vertex_embedding_url * add _transform_openai_request_to_fine_tuned_embedding_request * add _transform_openai_request_to_fine_tuned_embedding_request 
* add transform_openai_request_to_vertex_embedding_request * add _transform_vertex_response_to_openai_for_fine_tuned_models * test_vertexai_embedding for ft models * fix test_vertexai_embedding_finetuned * doc fine tuned / custom embedding models * fix test test_partner_models_httpx * bump: version 1.52.8 → 1.52.9 * LiteLLM Minor Fixes & Improvements (11/13/2024) (#6729) * fix(utils.py): add logprobs support for together ai Fixes https://github.com/BerriAI/litellm/issues/6724 * feat(pass_through_endpoints/): add anthropic/ pass-through endpoint adds new `anthropic/` pass-through endpoint + refactors docs * feat(spend_management_endpoints.py): allow /global/spend/report to query team + customer id enables seeing spend for a customer in a team * Add integration with MLflow Tracing (#6147) * Add MLflow logger Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Streaming handling Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * lint Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * address comments and fix issues Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * address comments and fix issues Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Move logger construction code Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * Add docs Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * async handlers Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * new picture Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> * fix(mlflow.py): fix ruff linting errors * ci(config.yml): add mlflow to ci testing * fix: fix test * test: fix test * Litellm key update fix (#6710) * fix(caching): convert arg to equivalent kwargs in llm caching handler prevent unexpected errors * fix(caching_handler.py): don't pass args to caching * fix(caching): remove all args from caching.py fix(caching): consistent function signatures + abc method * test(caching_unit_tests.py): add unit 
tests for llm caching ensures coverage for common caching scenarios across different implementations * refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one * fix(router.py): drop redis password requirement * fix(proxy_server.py): fix faulty slack alerting check * fix(langfuse.py): avoid copying functions/thread lock objects in metadata fixes metadata copy error when parent otel span in metadata * test: update test * fix(key_management_endpoints.py): fix /key/update with metadata update * fix(key_management_endpoints.py): fix key_prepare_update helper * fix(key_management_endpoints.py): reset value to none if set in key update * fix: update test ' * Litellm dev 11 11 2024 (#6693) * fix(__init__.py): add 'watsonx_text' as mapped llm api route Fixes https://github.com/BerriAI/litellm/issues/6663 * fix(opentelemetry.py): fix passing parallel tool calls to otel Fixes https://github.com/BerriAI/litellm/issues/6677 * refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling reduces bugs in repo * fix(__init__.py): update provider-model mapping to include all known provider-model mappings Fixes https://github.com/BerriAI/litellm/issues/6669 * feat(anthropic): support passing document in llm api call * docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function * fix(factory.py): fix linting error * add clear doc string for GCS bucket logging * Add docs to export logs to Laminar (#6674) * Add docs to export logs to Laminar * minor fix: newline at end of file * place laminar after http and grpc * (Feat) Add langsmith key based logging (#6682) * add langsmith_api_key to StandardCallbackDynamicParams * create a file for langsmith types * langsmith add key / team based logging * add key based logging for langsmith * fix langsmith key based logging * fix linting langsmith * remove NOQA violation * add unit test 
coverage for all helpers in test langsmith * test_langsmith_key_based_logging * docs langsmith key based logging * run langsmith tests in logging callback tests * fix logging testing * test_langsmith_key_based_logging * test_add_callback_via_key_litellm_pre_call_utils_langsmith * add debug statement langsmith key based logging * test_langsmith_key_based_logging * (fix) OpenAI's optional messages[].name does not work with Mistral API (#6701) * use helper for _transform_messages mistral * add test_message_with_name to base LLMChat test * fix linting * add xAI on Admin UI (#6680) * (docs) add benchmarks on 1K RPS (#6704) * docs litellm proxy benchmarks * docs GCS bucket * doc fix - reduce clutter on logging doc title * (feat) add cost tracking stable diffusion 3 on Bedrock (#6676) * add cost tracking for sd3 * test_image_generation_bedrock * fix get model info for image cost * add cost_calculator for stability 1 models * add unit testing for bedrock image cost calc * test_cost_calculator_with_no_optional_params * add test_cost_calculator_basic * correctly allow size Optional * fix cost_calculator * sd3 unit tests cost calc * fix raise correct error 404 when /key/info is called on non-existent key (#6653) * fix raise correct error on /key/info * add not_found_error error * fix key not found in DB error * use 1 helper for checking token hash * fix error code on key info * fix test key gen prisma * test_generate_and_call_key_info * test fix test_call_with_valid_model_using_all_models * fix key info tests * bump: version 1.52.4 → 1.52.5 * add defaults used for GCS logging * LiteLLM Minor Fixes & Improvements (11/12/2024) (#6705) * fix(caching): convert arg to equivalent kwargs in llm caching handler prevent unexpected errors * fix(caching_handler.py): don't pass args to caching * fix(caching): remove all args from caching.py fix(caching): consistent function signatures + abc method * test(caching_unit_tests.py): add unit tests for llm caching ensures coverage for common 
caching scenarios across different implementations * refactor(litellm_logging.py): move to using cache key from hidden params instead of regenerating one * fix(router.py): drop redis password requirement * fix(proxy_server.py): fix faulty slack alerting check * fix(langfuse.py): avoid copying functions/thread lock objects in metadata fixes metadata copy error when parent otel span in metadata * test: update test * bump: version 1.52.5 → 1.52.6 * (feat) helm hook to sync db schema (#6715) * v0 migration job * fix job * fix migrations job.yml * handle standalone DB on helm hook * fix argo cd annotations * fix db migration helm hook * fix migration job * doc fix Using Http/2 with Hypercorn * (fix proxy redis) Add redis sentinel support (#6154) * add sentinel_password support * add doc for setting redis sentinel password * fix redis sentinel - use sentinel password * Fix: Update gpt-4o costs to that of gpt-4o-2024-08-06 (#6714) Fixes #6713 * (fix) using Anthropic `response_format={"type": "json_object"}` (#6721) * add support for response_format=json anthropic * add test_json_response_format to baseLLM ChatTest * fix test_litellm_anthropic_prompt_caching_tools * fix test_anthropic_function_call_with_no_schema * test test_create_json_tool_call_for_response_format * (feat) Add cost tracking for Azure Dall-e-3 Image Generation + use base class to ensure basic image generation tests pass (#6716) * add BaseImageGenTest * use 1 class for unit testing * add debugging to BaseImageGenTest * TestAzureOpenAIDalle3 * fix response_cost_calculator * test_basic_image_generation * fix img gen basic test * fix _select_model_name_for_cost_calc * fix test_aimage_generation_bedrock_with_optional_params * fix undo changes cost tracking * fix response_cost_calculator * fix test_cost_azure_gpt_35 * fix remove dup test (#6718) * (build) update db helm hook * (build) helm db pre sync hook * (build) helm db sync hook * test: run test_team_logging firdst --------- Co-authored-by: Ishaan Jaff 
<ishaanjaffer0324@gmail.com> Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com> Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de> * test: update test * test: skip anthropic overloaded error * test: cleanup test * test: update tests * test: fix test * test: handle gemini overloaded model error * test: handle internal server error * test: handle anthropic overloaded error * test: handle claude instability --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com> Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de> --------- Signed-off-by: B-Step62 <yuki.watanabe@databricks.com> Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Jongseob Jeon <aiden.jongseob@gmail.com> Co-authored-by: Camden Clark <camdenaws@gmail.com> Co-authored-by: Rasswanth <61219215+IamRash-7@users.noreply.github.com> Co-authored-by: Yuki Watanabe <31463517+B-Step62@users.noreply.github.com> Co-authored-by: Dinmukhamed Mailibay <47117969+dinmukhamedm@users.noreply.github.com> Co-authored-by: Kilian Lieret <kilian.lieret@posteo.de>	2024-11-19 09:54:50 +05:30
Ishaan Jaff	9d8c9f6e5e	(fix) Fix - don't allow `viewer` roles to create virtual keys (#6764 ) * fix ui route permissions * fix test_is_ui_route_allowed * fix test_is_ui_route_allowed * test_user_role_permissions	2024-11-15 18:02:13 -08:00
Krish Dholakia	2bf23b0c7d	LiteLLM Minor Fixes & Improvement (11/14/2024) (#6730 ) * fix(ollama.py): fix get model info request Fixes https://github.com/BerriAI/litellm/issues/6703 * feat(anthropic/chat/transformation.py): support passing user id to anthropic via openai 'user' param * docs(anthropic.md): document all supported openai params for anthropic * test: fix tests * fix: fix tests * feat(jina_ai/): add rerank support Closes https://github.com/BerriAI/litellm/issues/6691 * test: handle service unavailable error * fix(handler.py): refactor together ai rerank call * test: update test to handle overloaded error * test: fix test * Litellm router trace (#6742) * feat(router.py): add trace_id to parent functions - allows tracking retry/fallbacks * feat(router.py): log trace id across retry/fallback logic allows grouping llm logs for the same request * test: fix tests * fix: fix test * fix(transformation.py): only set non-none stop_sequences * Litellm router disable fallbacks (#6743) * bump: version 1.52.6 → 1.52.7 * feat(router.py): enable dynamically disabling fallbacks Allows for enabling/disabling fallbacks per key * feat(litellm_pre_call_utils.py): support setting 'disable_fallbacks' on litellm key * test: fix test * fix(exception_mapping_utils.py): map 'model is overloaded' to internal server error * test: handle gemini error * test: fix test * fix: new run	2024-11-15 01:02:54 +05:30
Ishaan Jaff	e47efd4e7d	fix raise correct error 404 when /key/info is called on non-existent key (#6653 ) * fix raise correct error on /key/info * add not_found_error error * fix key not found in DB error * use 1 helper for checking token hash * fix error code on key info * fix test key gen prisma * test_generate_and_call_key_info * test fix test_call_with_valid_model_using_all_models * fix key info tests	2024-11-11 21:00:39 -08:00
Krish Dholakia	1b553d36e5	Litellm dev 11 11 2024 (#6693 ) * fix(__init__.py): add 'watsonx_text' as mapped llm api route Fixes https://github.com/BerriAI/litellm/issues/6663 * fix(opentelemetry.py): fix passing parallel tool calls to otel Fixes https://github.com/BerriAI/litellm/issues/6677 * refactor(test_opentelemetry_unit_tests.py): create a base set of unit tests for all logging integrations - test for parallel tool call handling reduces bugs in repo * fix(__init__.py): update provider-model mapping to include all known provider-model mappings Fixes https://github.com/BerriAI/litellm/issues/6669 * feat(anthropic): support passing document in llm api call * docs(anthropic.md): add pdf anthropic call to docs + expose new 'supports_pdf_input' function * fix(factory.py): fix linting error	2024-11-12 00:16:35 +05:30
Krish Dholakia	7e4dfaa13f	Litellm dev 11 08 2024 (#6658 ) * fix(deepseek/chat): convert content list to str Fixes https://github.com/BerriAI/litellm/issues/6642 * test(test_deepseek_completion.py): implement base llm unit tests increase robustness across providers * fix(router.py): support content policy violation fallbacks with default fallbacks * fix(opentelemetry.py): refactor to move otel imports behind flag Fixes https://github.com/BerriAI/litellm/issues/6636 * fix(opentelemtry.py): close span on success completion * fix(user_api_key_auth.py): allow user_role to default to none * fix: mark flaky test * fix(opentelemetry.py): move otelconfig.from_env to inside the init prevent otel errors raised just by importing the litellm class * fix(user_api_key_auth.py): fix auth error	2024-11-08 22:07:17 +05:30
Ishaan Jaff	9d40609f90	(feat) log error class, function_name on prometheus service failure hook + only log DB related failures on DB service hook (#6650 ) * log error on prometheus service failure hook * use a more accurate function name for wrapper that handles logging db metrics * fix log_db_metrics * test_log_db_metrics_failure_error_types * fix linting * fix auth checks	2024-11-07 17:01:18 -08:00
Ishaan Jaff	c24a7c91c3	fix code quality check	2024-11-06 20:50:52 -08:00
Ishaan Jaff	2410f53f9f	(feat) Allow failed DB connection requests to allow virtual keys with `allow_failed_db_requests` (#6605 ) * fix use helper for _handle_failed_db_connection_for_get_key_object * track ALLOW_FAILED_DB_REQUESTS on prometheus * fix allow_failed_db_requests check * fix allow_requests_on_db_unavailable * fix allow_requests_on_db_unavailable * docs allow_requests_on_db_unavailable * identify user_id as litellm_proxy_admin_name when DB is failing * test_handle_failed_db_connection * fix test_user_api_key_auth_db_unavailable * update best practices for prod doc * update best practices for prod * fix handle db failure	2024-11-06 20:04:41 -08:00
Krish Dholakia	5557ae7dad	LiteLLM Minor Fixes & Improvements (11/06/2024) (#6624 ) * refactor(proxy_server.py): add debug logging around license check event (refactor position in startup_event logic) * fix(proxy/_types.py): allow admin_allowed_routes to be any str * fix(router.py): raise 400-status code error for no 'model_name' error on router Fixes issue with status code when unknown model name passed with pattern matching enabled * fix(converse_handler.py): add claude 3-5 haiku to bedrock converse models * test: update testing to replace claude-instant-1.2 * fix(router.py): fix router.moderation calls * test: update test to remove claude-instant-1 * fix(router.py): support model_list values in router.moderation * test: fix test * test: fix test	2024-11-07 04:37:32 +05:30
Ishaan Jaff	8d27efd9f2	fix allow using 15 seconds for premium license check	2024-11-04 16:09:15 -08:00
Krish Dholakia	e7ce45236a	Litellm perf improvements 3 (#6573 ) * perf: move writing key to cache, to background task * perf(litellm_pre_call_utils.py): add otel tracing for pre-call utils adds 200ms on calls with pgdb connected * fix(litellm_pre_call_utils.py'): rename call_type to actual call used * perf(proxy_server.py): remove db logic from _get_config_from_file was causing db calls to occur on every llm request, if team_id was set on key * fix(auth_checks.py): add check for reducing db calls if user/team id does not exist in db reduces latency/call by ~100ms * fix(proxy_server.py): minor fix on existing_settings not incl alerting * fix(exception_mapping_utils.py): map databricks exception string * fix(auth_checks.py): fix auth check logic * test: correctly mark flaky test * fix(utils.py): handle auth token error for tokenizers.from_pretrained	2024-11-05 03:51:26 +05:30
Krish Dholakia	cc19a9f6a1	Litellm dev 11 02 2024 (#6561 ) * fix(dual_cache.py): update in-memory check for redis batch get cache Fixes latency delay for async_batch_redis_cache * fix(service_logger.py): fix race condition causing otel service logging to be overwritten if service_callbacks set * feat(user_api_key_auth.py): add parent otel component for auth allows us to isolate how much latency is added by auth checks * perf(parallel_request_limiter.py): move async_set_cache_pipeline (from max parallel request limiter) out of execution path (background task) reduces latency by 200ms * feat(user_api_key_auth.py): have user api key auth object return user tpm/rpm limits - reduces redis calls in downstream task (parallel_request_limiter) Reduces latency by 400-800ms * fix(parallel_request_limiter.py): use batch get cache to reduce user/key/team usage object calls reduces latency by 50-100ms * fix: fix linting error * fix(_service_logger.py): fix import * fix(user_api_key_auth.py): fix service logging * fix(dual_cache.py): don't pass 'self' * fix: fix python3.8 error * fix: fix init]	2024-11-04 07:48:20 +05:30
Krish Dholakia	10e8d7aa45	Litellm router max depth (#6501 ) * feat(router.py): add check for max fallback depth Prevent infinite loop for fallbacks Closes https://github.com/BerriAI/litellm/issues/6498 * test: update test * (fix) Prometheus - Log Postgres DB latency, status on prometheus (#6484) * fix logging DB fails on prometheus * unit testing log to otel wrapper * unit testing for service logger + prometheus * use LATENCY buckets for service logging * fix service logging * docs clarify vertex vs gemini * (router_strategy/) ensure all async functions use async cache methods (#6489) * fix router strat * use async set / get cache in router_strategy * add coverage for router strategy * fix imports * fix batch_get_cache * use async methods for least busy * fix least busy use async methods * fix test_dual_cache_increment * test async_get_available_deployment when routing_strategy="least-busy" * (fix) proxy - fix when `STORE_MODEL_IN_DB` should be set (#6492) * set store_model_in_db at the top * correctly use store_model_in_db global * (fix) `PrometheusServicesLogger` `_get_metric` should return metric in Registry (#6486) * fix logging DB fails on prometheus * unit testing log to otel wrapper * unit testing for service logger + prometheus * use LATENCY buckets for service logging * fix service logging * fix _get_metric in prom services logger * add clear doc string * unit testing for prom service logger * bump: version 1.51.0 → 1.51.1 * Add `azure/gpt-4o-mini-2024-07-18` to model_prices_and_context_window.json (#6477) * Update utils.py (#6468) Fixed missing keys * (perf) Litellm redis router fix - ~100ms improvement (#6483) * docs(exception_mapping.md): add missing exception types Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183 * fix(main.py): register custom model pricing with specific key Ensure custom model pricing is registered to the specific model+provider key combination * test: make testing more robust for custom pricing * 
fix(redis_cache.py): instrument otel logging for sync redis calls ensures complete coverage for all redis cache calls * refactor: pass parent_otel_span for redis caching calls in router allows for more observability into what calls are causing latency issues * test: update tests with new params * refactor: ensure e2e otel tracing for router * refactor(router.py): add more otel tracing acrosss router catch all latency issues for router requests * fix: fix linting error * fix(router.py): fix linting error * fix: fix test * test: fix tests * fix(dual_cache.py): pass ttl to redis cache * fix: fix param * perf(cooldown_cache.py): improve cooldown cache, to store cache results in memory for 5s, prevents redis call from being made on each request reduces 100ms latency per call with caching enabled on router * fix: fix test * fix(cooldown_cache.py): handle if a result is None * fix(cooldown_cache.py): add debug statements * refactor(dual_cache.py): move to using an in-memory check for batch get cache, to prevent redis from being hit for every call * fix(cooldown_cache.py): fix linting erropr * build: merge main --------- Co-authored-by: Ishaan Jaff <ishaanjaffer0324@gmail.com> Co-authored-by: Xingyao Wang <xingyao@all-hands.dev> Co-authored-by: vibhanshu-ob <115142120+vibhanshu-ob@users.noreply.github.com>	2024-10-29 22:05:41 -07:00
Krish Dholakia	e712a2090b	redis otel tracing + async support for latency routing (#6452 ) * docs(exception_mapping.md): add missing exception types Fixes https://github.com/Aider-AI/aider/issues/2120#issuecomment-2438971183 * fix(main.py): register custom model pricing with specific key Ensure custom model pricing is registered to the specific model+provider key combination * test: make testing more robust for custom pricing * fix(redis_cache.py): instrument otel logging for sync redis calls ensures complete coverage for all redis cache calls * refactor: pass parent_otel_span for redis caching calls in router allows for more observability into what calls are causing latency issues * test: update tests with new params * refactor: ensure e2e otel tracing for router * refactor(router.py): add more otel tracing acrosss router catch all latency issues for router requests * fix: fix linting error * fix(router.py): fix linting error * fix: fix test * test: fix tests * fix(dual_cache.py): pass ttl to redis cache * fix: fix param	2024-10-28 21:52:12 -07:00
Ishaan Jaff	68a31bdfc1	test_is_ui_route_allowed	2024-10-25 10:37:11 +04:00
Ishaan Jaff	71559986d0	use helper for _route_matches_pattern	2024-10-25 10:31:21 +04:00

252 commits