Hi Greg,

As discussed with Alan, this is the backported patch for 4.9-stable.
Please apply it to your queue.

--
Regards
Sudip
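In case it helps with testing, this is roughly how the backport can be
exercised; it is only a sketch, not exactly what was run. The branch
name, the patch file name, and the disk device below are placeholders
for whatever is available locally, and the reproducer is the one from
the commit message (it needs a machine that can actually suspend to
RAM):

  # Apply the backport on top of a 4.9-stable tree (the .patch file
  # name is a placeholder for this mail saved locally).
  git checkout linux-4.9.y
  git am 0001-block-do-not-use-interruptible-wait-anywhere.patch

  # Start a dd doing direct IO against a real disk (adjust /dev/sda),
  # then keep hitting it with a harmless signal across a suspend/resume
  # cycle. Without the fix, dd intermittently dies with an IO error
  # during resume; with it, dd should only stop when killed at the end.
  dd if=/dev/sda of=/dev/null iflag=direct & \
  while killall -SIGUSR1 dd; do sleep 0.1; done & \
  echo mem > /sys/power/state ; \
  sleep 5; killall dd  # stop after 5 seconds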
From cd79fa8410e3002e11d71eb785ff11ceb9737b17 Mon Sep 17 00:00:00 2001
From: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx>
Date: Thu, 12 Apr 2018 19:11:58 +0100
Subject: [PATCH] block: do not use interruptible wait anywhere

commit 1dc3039bc87ae7d19a990c3ee71cfd8a9068f428 upstream

When blk_queue_enter() waits for a queue to unfreeze, or unset the
PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.

The PREEMPT_ONLY flag was introduced later in commit 3a0a529971ec
("block, scsi: Make SCSI quiesce and resume work reliably"). Note the
SCSI device is resumed asynchronously, i.e. after un-freezing userspace
tasks.

So that commit exposed the bug as a regression in v4.15. A mysterious
SIGBUS (or -EIO) sometimes happened during the time the device was
being resumed. Most frequently, there was no kernel log message, and we
saw Xorg or Xwayland killed by SIGBUS.[1]

[1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979

Without this fix, I get an IO error in this test:

  while killall -SIGUSR1 dd; do sleep 0.1; done & \
  echo mem > /sys/power/state ; \
  sleep 5; killall dd  # stop after 5 seconds

The interruptible wait was added to blk_queue_enter in commit
3ef28e83ab15 ("block: generic request_queue reference counting").
Before then, the interruptible wait was only in blk-mq, but I don't
think it could ever have been correct.

Reviewed-by: Bart Van Assche <bart.vanassche@xxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@xxxxxxxxx>
---
 block/blk-core.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 23daf40be371..77b99bf16c83 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -652,7 +652,6 @@ EXPORT_SYMBOL(blk_alloc_queue);
 int blk_queue_enter(struct request_queue *q, bool nowait)
 {
 	while (true) {
-		int ret;
 
 		if (percpu_ref_tryget_live(&q->q_usage_counter))
 			return 0;
@@ -660,13 +659,11 @@ int blk_queue_enter(struct request_queue *q, bool nowait)
 		if (nowait)
 			return -EBUSY;
 
-		ret = wait_event_interruptible(q->mq_freeze_wq,
-				!atomic_read(&q->mq_freeze_depth) ||
-				blk_queue_dying(q));
+		wait_event(q->mq_freeze_wq,
+			   !atomic_read(&q->mq_freeze_depth) ||
+			   blk_queue_dying(q));
 		if (blk_queue_dying(q))
 			return -ENODEV;
-		if (ret)
-			return ret;
 	}
 }
 
--
2.11.0