This is a note to let you know that I've just added the patch titled drm/nouveau: Fix deadlock on runtime suspend to the 4.4-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: drm-nouveau-fix-deadlock-on-runtime-suspend.patch and it can be found in the queue-4.4 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From d61a5c1063515e855bedb1b81e20e50b0ac3541e Mon Sep 17 00:00:00 2001 From: Lukas Wunner <lukas@xxxxxxxxx> Date: Sun, 11 Feb 2018 10:38:28 +0100 Subject: drm/nouveau: Fix deadlock on runtime suspend From: Lukas Wunner <lukas@xxxxxxxxx> commit d61a5c1063515e855bedb1b81e20e50b0ac3541e upstream. nouveau's ->runtime_suspend hook calls drm_kms_helper_poll_disable(), which waits for the output poll worker to finish if it's running. The output poll worker meanwhile calls pm_runtime_get_sync() in nouveau_connector_detect() which waits for the ongoing suspend to finish, causing a deadlock. Fix by not acquiring a runtime PM ref if nouveau_connector_detect() is called in the output poll worker's context. This is safe because the poll worker is only enabled while runtime active and we know that ->runtime_suspend waits for it to finish. Other contexts calling nouveau_connector_detect() do require a runtime PM ref, these comprise: status_store() drm sysfs interface ->fill_modes drm callback drm_fb_helper_probe_connector_modes() drm_mode_getconnector() nouveau_connector_hotplug() nouveau_display_hpd_work() nv17_tv_set_property() Stack trace for posterity: INFO: task kworker/0:1:58 blocked for more than 120 seconds. Workqueue: events output_poll_execute [drm_kms_helper] Call Trace: schedule+0x28/0x80 rpm_resume+0x107/0x6e0 __pm_runtime_resume+0x47/0x70 nouveau_connector_detect+0x7e/0x4a0 [nouveau] nouveau_connector_detect_lvds+0x132/0x180 [nouveau] drm_helper_probe_detect_ctx+0x85/0xd0 [drm_kms_helper] output_poll_execute+0x11e/0x1c0 [drm_kms_helper] process_one_work+0x184/0x380 worker_thread+0x2e/0x390 INFO: task kworker/0:2:252 blocked for more than 120 seconds. Workqueue: pm pm_runtime_work Call Trace: schedule+0x28/0x80 schedule_timeout+0x1e3/0x370 wait_for_completion+0x123/0x190 flush_work+0x142/0x1c0 nouveau_pmops_runtime_suspend+0x7e/0xd0 [nouveau] pci_pm_runtime_suspend+0x5c/0x180 vga_switcheroo_runtime_suspend+0x1e/0xa0 __rpm_callback+0xc1/0x200 rpm_callback+0x1f/0x70 rpm_suspend+0x13c/0x640 pm_runtime_work+0x6e/0x90 process_one_work+0x184/0x380 worker_thread+0x2e/0x390 Bugzilla: https://bugs.archlinux.org/task/53497 Bugzilla: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=870523 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388#c33 Fixes: 5addcf0a5f0f ("nouveau: add runtime PM support (v0.9)") Cc: stable@xxxxxxxxxxxxxxx # v3.12+: 27d4ee03078a: workqueue: Allow retrieval of current task's work struct Cc: stable@xxxxxxxxxxxxxxx # v3.12+: 25c058ccaf2e: drm: Allow determining if current task is output poll worker Cc: Ben Skeggs <bskeggs@xxxxxxxxxx> Cc: Dave Airlie <airlied@xxxxxxxxxx> Reviewed-by: Lyude Paul <lyude@xxxxxxxxxx> Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx> Link: https://patchwork.freedesktop.org/patch/msgid/b7d2cbb609a80f59ccabfdf479b9d5907c603ea1.1518338789.git.lukas@xxxxxxxxx Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/gpu/drm/nouveau/nouveau_connector.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-) --- a/drivers/gpu/drm/nouveau/nouveau_connector.c +++ b/drivers/gpu/drm/nouveau/nouveau_connector.c @@ -253,9 +253,15 @@ nouveau_connector_detect(struct drm_conn nv_connector->edid = NULL; } - ret = pm_runtime_get_sync(connector->dev->dev); - if (ret < 0 && ret != -EACCES) - return conn_status; + /* Outputs are only polled while runtime active, so acquiring a + * runtime PM ref here is unnecessary (and would deadlock upon + * runtime suspend because it waits for polling to finish). + */ + if (!drm_kms_helper_is_poll_worker()) { + ret = pm_runtime_get_sync(connector->dev->dev); + if (ret < 0 && ret != -EACCES) + return conn_status; + } nv_encoder = nouveau_connector_ddc_detect(connector); if (nv_encoder && (i2c = nv_encoder->i2c) != NULL) { @@ -323,8 +329,10 @@ detect_analog: out: - pm_runtime_mark_last_busy(connector->dev->dev); - pm_runtime_put_autosuspend(connector->dev->dev); + if (!drm_kms_helper_is_poll_worker()) { + pm_runtime_mark_last_busy(connector->dev->dev); + pm_runtime_put_autosuspend(connector->dev->dev); + } return conn_status; } Patches currently in stable-queue which might be from lukas@xxxxxxxxx are queue-4.4/workqueue-allow-retrieval-of-current-task-s-work-struct.patch queue-4.4/drm-radeon-fix-deadlock-on-runtime-suspend.patch queue-4.4/drm-amdgpu-fix-deadlock-on-runtime-suspend.patch queue-4.4/drm-allow-determining-if-current-task-is-output-poll-worker.patch queue-4.4/drm-nouveau-fix-deadlock-on-runtime-suspend.patch