* fix(proxy_server.py): fix get model info when litellm_model_id is set
Fixes https://github.com/BerriAI/litellm/issues/7873
* test(test_models.py): add test to ensure get model info on a specific deployment returns the same values as the all-model info
Fixes https://github.com/BerriAI/litellm/issues/7873
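As context, a minimal sketch of the two views the test compares, assuming a local proxy at `http://0.0.0.0:4000`, a placeholder admin key, and the `litellm_model_id` query param named in the fix:

```python
import requests

# Placeholders: local proxy URL and admin key - replace with your own.
BASE = "http://0.0.0.0:4000"
HEADERS = {"Authorization": "Bearer sk-1234"}

# All deployments.
all_info = requests.get(f"{BASE}/model/info", headers=HEADERS).json()

# One deployment, selected by its litellm_model_id (the id under model_info).
model_id = all_info["data"][0]["model_info"]["id"]
single = requests.get(
    f"{BASE}/model/info", headers=HEADERS, params={"litellm_model_id": model_id}
).json()

# Per the test above, the single-deployment view should match the all-model view.
assert single["data"][0]["model_info"] == all_info["data"][0]["model_info"]
```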
* fix(usage.tsx): make model analytics free
Addresses @iqballx's feedback
* fix(invoke_handler.py): return correct bedrock status code and error message if error chunk in stream
Improves bedrock stream error handling
* fix(proxy_server.py): fix linting errors
* test(test_auth_checks.py): remove redundant test
* fix(proxy_server.py): fix linting errors
* test: fix flaky test
* test: fix test
* fix(router.py): pass stream timeout correctly for non-OpenAI/Azure models
Fixes https://github.com/BerriAI/litellm/issues/7870
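A rough sketch of setting a per-deployment stream timeout through the router; the model alias, key handling, and values below are placeholders:

```python
from litellm import Router

# Minimal sketch: stream_timeout in litellm_params should now be honored
# for non-OpenAI/Azure providers too.
router = Router(
    model_list=[
        {
            "model_name": "my-bedrock-claude",  # hypothetical alias
            "litellm_params": {
                "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
                "timeout": 60,         # non-streaming timeout (seconds)
                "stream_timeout": 10,  # streaming timeout (seconds)
            },
        }
    ]
)

response = router.completion(
    model="my-bedrock-claude",
    messages=[{"role": "user", "content": "hi"}],
    stream=True,
)
for chunk in response:
    print(chunk)
```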
* test(test_router_timeout.py): add test for streaming
* test(test_router_timeout.py): add unit testing for new router functions
* docs(ollama.md): link to section on calling ollama within docker container
* test: remove redundant test
* test: fix test to include timeout value
* docs(config_settings.md): document new router settings param
* fix: return api timeout value in openai timeout exception
Makes it easier for users to debug why a request timed out
* feat(openai.py): return timeout value + time taken on openai timeout errors
helps debug timeout errors
* fix(utils.py): fix num retries extraction logic when num_retries = 0
* fix(litellm_logging.py): support printing the standard logging payload to console if 'LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD' is true (documented in config_settings.md)
Enables easier debugging
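A minimal usage sketch, assuming the env var is read at log time; the model and `mock_response` below are placeholders to avoid a real API call:

```python
import os

import litellm

# Opt in before making calls; litellm reads the env var when logging.
os.environ["LITELLM_PRINT_STANDARD_LOGGING_PAYLOAD"] = "True"

# Any subsequent call should now print the standard logging payload
# to the console. mock_response is litellm's built-in mock.
litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "hi"}],
    mock_response="hello",
)
```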
* test(test_auth_checks.py): remove common-checks UserAPIKeyAuth enforcement check
* fix(litellm_logging.py): fix linting error
* fix(user_dashboard.tsx): fix spend calculation when team selected
sum spend across all team keys, not the user's keys
* docs(admin_ui_sso.md): fix docs tabbing
* feat(user_api_key_auth.py): introduce new 'enforce_rbac' param on jwt auth
allows the proxy admin to block JWT tokens that authenticate but don't map to a known role from calling the proxy (sketch below)
Fixes https://github.com/BerriAI/litellm/issues/6793
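The enforcement idea, as an illustrative sketch rather than litellm's actual code; the function and argument names are hypothetical:

```python
from typing import Optional

from fastapi import HTTPException

# Illustrative only: with 'enforce_rbac' on, a JWT that authenticates
# but maps to no known litellm role is rejected.
def enforce_rbac_check(mapped_role: Optional[str], enforce_rbac: bool) -> None:
    if enforce_rbac and mapped_role is None:
        raise HTTPException(
            status_code=403,
            detail="JWT authenticated but has no mapped role - enforce_rbac is on",
        )
```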
* test: more unit testing + refactoring
* fix: fix returning id when obj not found in db
* fix(user_api_key_auth.py): add end user id tracking from jwt auth
* docs(token_auth.md): add doc on rbac with JWTs
* fix: fix unused params
* test: remove old test
* fix(user_api_key_auth.py): handle client-side fallback models when an item in the list is a dictionary (example below)
* fix(auth_checks.py): help user find invalid model names during dev
Ensure fallbacks work in prod
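A minimal sketch of the dictionary case, assuming client-side fallbacks are passed via the `fallbacks` param; model names and values are placeholders:

```python
import litellm

# Fallback list items may be plain model names or dictionaries carrying
# per-fallback params - the dict case is what the fix above handles.
response = litellm.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hi"}],
    fallbacks=[
        "gpt-4o-mini",                                          # string item
        {"model": "claude-3-5-sonnet-latest", "timeout": 30},   # dict item
    ],
)
```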
* fix(user_api_key_auth.py): fix linting check
* fix: cleanup unused variables
* fix: fix import
* fix(auth_checks.py): fix auth check
* feat(health_check.py): set an upper bound on the api call when making a health check
prevents a bad model's health check from hanging and causing pod restarts (see the sketch after this group of commits)
* fix(health_check.py): cleanup task once completed
* fix(constants.py): bump default health check timeout to 1min
* docs(health.md): add 'health_check_timeout' to health docs on litellm
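The gist of the bounded health check pattern, as an illustrative sketch independent of litellm's internals:

```python
import asyncio

HEALTH_CHECK_TIMEOUT = 60  # seconds, matching the new default above

# Illustrative only: bound the call with a timeout and always clean the
# task up, so one bad model can't hang the whole health check.
async def bounded_health_check(check_coro) -> bool:
    task = asyncio.create_task(check_coro)
    try:
        await asyncio.wait_for(task, timeout=HEALTH_CHECK_TIMEOUT)
        return True
    except Exception:
        return False
    finally:
        if not task.done():
            task.cancel()  # cleanup once completed or timed out
```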
* build(proxy_server_config.yaml): add bad model to health check
* fix(lm_studio/chat/transformation.py): Fix https://github.com/BerriAI/litellm/issues/7811
* fix(router.py): fix mock timeout check
* fix: drop model name from fallback args since it causes a conflict with the model=model that is provided later on. (#7806)
This error happens if you provide multiple fallback models to the completion function with a model name defined in each one.
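An illustrative sketch of the conflict and the fix; names are hypothetical:

```python
# Each fallback dict carried its own "model" key, which collided with the
# explicit model= argument passed later - dropping it from the kwargs
# avoids "got multiple values for keyword argument 'model'".
def call_fallback(completion_fn, fallback: dict, messages: list):
    kwargs = {k: v for k, v in fallback.items() if k != "model"}
    return completion_fn(model=fallback["model"], messages=messages, **kwargs)
```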
* fix(router.py): remove mock_timeout before sending to request
prevents reuse in fallbacks
* test: update test
* test: revert test change - wrong pr
---------
Co-authored-by: Dudu Lasry <david1542@users.noreply.github.com>
* refactor: initial commit for using separate sync vs. async transformation routes for bedrock
ensures no blocking calls, e.g. when converting an image URL to base64
* perf(converse_transformation.py): make bedrock converse transformation async
asyncifies the bedrock message transformation - useful for handling image URLs for bedrock
* fix(converse_handler.py): fix logging for async streaming
* style: cleanup unused imports
* build: ensure all regional bedrock models have same supported values as base bedrock model
prevents drift
* test(base_llm_unit_tests.py): add testing for nested pydantic objects
* fix(test_utils.py): add test_get_potential_model_names
* fix(anthropic/chat/transformation.py): support nested pydantic objects
Fixes https://github.com/BerriAI/litellm/issues/7755
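A minimal sketch of the nested case this enables; the model name and schema are placeholders:

```python
from typing import List

from pydantic import BaseModel

import litellm

# The nested Step objects inside MathResponse are the case the fix above
# adds support for.
class Step(BaseModel):
    explanation: str
    output: str

class MathResponse(BaseModel):
    steps: List[Step]  # nested pydantic objects
    final_answer: str

response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-latest",  # placeholder model name
    messages=[{"role": "user", "content": "solve 8x + 7 = -23"}],
    response_format=MathResponse,
)
```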
* test: initial commit enforcing testing on all anthropic pass through functions
prevents future regressions
* test(test_unit_test_anthropic_pass_through.py): add unit test for '_get_user_from_metadata' function
* test(test_unit_test_anthropic_passthrough.py): add unit test for handle_logging_anthropic_collected_chunks
* test(test_unit_test_anthropic_pass_through): add coverage for all anthropic pass through functions
* test: initial test to enforce all functions in user_api_key_auth.py have direct testing
* test(test_user_api_key_auth.py): add is_allowed_route unit test
* test(test_user_api_key_auth.py): add more tests
* test(test_user_api_key_auth.py): add complete testing coverage for all functions in `user_api_key_auth.py`
* test(test_db_schema_changes.py): add a unit test to ensure all db schema changes are backwards compatible
gives user an easy rollback path
* test: fix schema compatibility test filepath
* test: fix test
* feat(pass_through_endpoints.py): fix anthropic end user cost tracking
* fix(anthropic/chat/transformation.py): use returned provider model for anthropic
handles the anthropic `-latest` tag in the request body, which was throwing cost calculation errors
ensures our model cost tracking stays accurate
* feat(model_prices_and_context_window.json): add gemini-2.0-flash-thinking-exp pricing
* test: update test to use assumption that user_api_key_dict can get anthropic user id
* test: fix test
* fix: fix test
* fix(anthropic_pass_through.py): uncomment previous anthropic end-user cost tracking code block
can't guarantee user api key dict always has end user id - too many code paths
* fix(user_api_key_auth.py): allow the end user id from the request body to always be read and set on the auth object
* fix(auth_check.py): fix linting error
* test: fix auth check
* fix(auth_utils.py): fix get end user id to handle metadata = None
* fix(gemini/): support gemini 'frequency_penalty' and 'presence_penalty'
Closes https://github.com/BerriAI/litellm/issues/7748
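A minimal usage sketch; the model name is a placeholder, and the two params are the OpenAI-style ones the change maps through to Gemini:

```python
import litellm

response = litellm.completion(
    model="gemini/gemini-1.5-flash",  # placeholder model name
    messages=[{"role": "user", "content": "write a haiku"}],
    frequency_penalty=0.5,
    presence_penalty=0.5,
)
```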
* feat(proxy_server.py): new env var to disable prisma health check on startup
* test: fix test
* fix base aws llm
* fix auth with aws role
* test aws base llm
* fix base aws llm init
* run ci/cd again
* fix get_credentials
* ci/cd run again
* _auth_with_aws_role
* fix(factory.py): fix bedrock document url check
Make the check more generic: if the content type starts with 'text' or 'application', assume it's a document and let it through (see below)
Fixes https://github.com/BerriAI/litellm/issues/7746
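The loosened check, restated as an illustrative one-liner:

```python
# Anything whose content type starts with 'text' or 'application'
# is treated as a document.
def is_bedrock_document(content_type: str) -> bool:
    return content_type.startswith(("text", "application"))
```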
* feat(key_management_endpoints.py): support writing new key alias to aws secret manager - on key rotation
adds a rotation endpoint to the AWS key management hook - allows rotated litellm virtual keys with a new key alias to be written to the secret manager
* feat(key_management_event_hooks.py): support rotating keys and updating secret manager
* refactor(base_secret_manager.py): support rotate secret at the base level
since it's just an abstraction function, it's easy to implement at the base manager level
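An illustrative sketch of why rotation can live at the base level; class and method names here are hypothetical, not litellm's actual API:

```python
# If the base secret manager already abstracts write and delete,
# rotation is just their composition - hence it fits the base class.
class BaseSecretManager:
    async def async_write_secret(self, secret_name: str, secret_value: str):
        raise NotImplementedError

    async def async_delete_secret(self, secret_name: str):
        raise NotImplementedError

    async def async_rotate_secret(
        self, current_secret_name: str, new_secret_name: str, new_secret_value: str
    ):
        await self.async_write_secret(new_secret_name, new_secret_value)
        await self.async_delete_secret(current_secret_name)
```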
* style: cleanup unused imports
* fix(gpt_transformation.py): fix response_format translation check for 4o models
Fixes https://github.com/BerriAI/litellm/issues/7616
* feat(key_management_endpoints.py): support 'temp_budget_increase' and 'temp_budget_expiry' fields
Allow proxy admin to grant temporary budget increases to keys
* fix(proxy/_types.py): enforce temp_budget_increase and temp_budget_expiry are always passed together
* feat(user_api_key_auth.py): initial working temp budget increase logic
ensures the key-budget-exceeded error check accounts for a temp budget stored in key metadata (sketch below)
* feat(proxy_server.py): return the key max budget and key spend in the response headers
Allows the client-side user to know their remaining limits
* test: add unit testing for new proxy utils
Ensures new key budget is correctly handled
* docs(temporary_budget_increase.md): add doc on temporary budget increase
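A rough sketch of granting a temporary budget increase via the proxy's key-update route, assuming the two fields are accepted as top-level fields on the update payload; the URL, keys, and expiry format are placeholders:

```python
import requests

requests.post(
    "http://0.0.0.0:4000/key/update",
    headers={"Authorization": "Bearer sk-1234"},  # admin key placeholder
    json={
        "key": "sk-target-key",                         # virtual key to bump
        "temp_budget_increase": 100.0,                  # extra budget, USD
        "temp_budget_expiry": "2025-01-31T00:00:00Z",   # format assumed
    },
)
```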
* fix(utils.py): remove 3.5 from response_format check for now
not all azure 3.5 models support response_format
* fix(user_api_key_auth.py): return valid user api key auth object on all paths
* build(model_prices_and_context_window.json): add azure o1 pricing
Closes https://github.com/BerriAI/litellm/issues/7712
* refactor: replace regex with string method for whitespace check in stop-sequences handling (#7713)
* Allows overriding keep_alive time in ollama (#7079)
* Allows overriding keep_alive time in ollama
* Also adds to ollama_chat
* Adds some info on the docs about this parameter
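A minimal usage sketch; the model name is a placeholder, and keep_alive takes Ollama-style durations:

```python
import litellm

# keep_alive is forwarded to Ollama and controls how long the model
# stays loaded after the request (e.g. "10m", or -1 for indefinitely).
response = litellm.completion(
    model="ollama_chat/llama3",
    messages=[{"role": "user", "content": "hi"}],
    keep_alive="10m",
)
```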
* fix: together ai warning (#7688)
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
* fix(proxy_server.py): handle config containing thread-locked objects when using get_config_state
* fix(proxy_server.py): add exception to debug
* build(model_prices_and_context_window.json): update 'supports_vision' for azure o1
---------
Co-authored-by: Wolfram Ravenwolf <52386626+WolframRavenwolf@users.noreply.github.com>
Co-authored-by: Regis David Souza Mesquita <github@rdsm.dev>
Co-authored-by: Carl <45709281+capsenz@users.noreply.github.com>
Co-authored-by: Carl Senze <carl.senze@aleph-alpha.com>
* feat(main.py): initial commit for `/image/variations` endpoint support
* refactor(base_llm/): introduce new base llm base config for image variation endpoints
* refactor(openai/image_variations/transformation.py): implement openai image variation transformation handler
* fix: test
* feat(openai/): working openai `/image/variation` endpoint calls via sdk
* feat(topaz/): topaz sync image variation call support
Addresses https://github.com/BerriAI/litellm/issues/7593
* fix(topaz/transformation.py): fix linting errors
* fix(openai/image_variations/handler.py): fix passing json data
* fix(main.py): support the async image variation route - `aimage_variation`
* fix(test_get_model_info.py): fix test
* fix: cleanup unused imports
* feat(openai/): add async `/image/variations` endpoint support
* feat(topaz/): support async `/image/variations` calls
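A rough usage sketch, assuming the sync and async entrypoints are `litellm.image_variation` / `litellm.aimage_variation` and mirror OpenAI's image-variation params; the file path and model name are placeholders:

```python
import litellm

with open("otter.png", "rb") as f:  # placeholder image file
    image_bytes = f.read()

# Sync call
response = litellm.image_variation(image=image_bytes, model="dall-e-2")

# Async call (inside an async function):
# response = await litellm.aimage_variation(image=image_bytes, model="dall-e-2")
```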
* fix: test
* fix(utils.py): fix get_model_info_helper for no model info w/ provider config
handles situation where model info is not known but provider config exists
* test(test_router_fallbacks.py): mark flaky test
* fix: fix unused imports
* test: bump otel load test perf threshold - accounts for current load tests hitting the same server
* fix(__init__.py): fix init to exclude pricing-only model cost values from real model names
prevents bad health checks on wildcard routes
* fix(get_llm_provider.py): fix to handle calling bedrock_converse models
* feat(langfuse.py): log the used prompt when prompt management is used
* test: fix test
* docs(self_serve.md): add doc on restricting personal key creation on ui
* feat(s3.py): support s3 logging with team alias prefixes (if available)
New preview feature
* fix(main.py): remove old if block - simplify to just awaiting the result when a coroutine is returned
fixes lm_studio async embedding error
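The simplified pattern, as an illustrative sketch:

```python
import asyncio

# Instead of branching on provider-specific conditions, just await
# whatever turns out to be a coroutine and use everything else as-is.
async def maybe_await(result):
    if asyncio.iscoroutine(result):
        result = await result
    return result
```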
* fix(langfuse.py): handle get prompt check