Sent a v2 with correct subject tags.

On 2018-06-14 10:53:18, Khalid Elmously wrote:
> From: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx>
>
> BugLink: http://bugs.launchpad.net/bugs/1776887
>
> When blk_queue_enter() waits for a queue to unfreeze, or unset the
> PREEMPT_ONLY flag, do not allow it to be interrupted by a signal.
>
> The PREEMPT_ONLY flag was introduced later in commit 3a0a529971ec
> ("block, scsi: Make SCSI quiesce and resume work reliably"). Note the SCSI
> device is resumed asynchronously, i.e. after un-freezing userspace tasks.
>
> So that commit exposed the bug as a regression in v4.15. A mysterious
> SIGBUS (or -EIO) sometimes happened during the time the device was being
> resumed. Most frequently, there was no kernel log message, and we saw Xorg
> or Xwayland killed by SIGBUS.[1]
>
> [1] E.g. https://bugzilla.redhat.com/show_bug.cgi?id=1553979
>
> Without this fix, I get an IO error in this test:
>
>   # dd if=/dev/sda of=/dev/null iflag=direct & \
>     while killall -SIGUSR1 dd; do sleep 0.1; done & \
>     echo mem > /sys/power/state ; \
>     sleep 5; killall dd  # stop after 5 seconds
>
> The interruptible wait was added to blk_queue_enter in
> commit 3ef28e83ab15 ("block: generic request_queue reference counting").
> Before then, the interruptible wait was only in blk-mq, but I don't think
> it could ever have been correct.
>
> Reviewed-by: Bart Van Assche <bart.vanassche@xxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Alan Jenkins <alan.christopher.jenkins@xxxxxxxxx>
> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
> (cherry-picked from 1dc3039bc87ae7d19a990c3ee71cfd8a9068f428)
> Signed-off-by: Khalid Elmously <khalid.elmously@xxxxxxxxxxxxx>
>
> ---
>  block/blk-core.c | 11 ++++-------
>  1 file changed, 4 insertions(+), 7 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index fc0666354af3..59c91e345eea 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -821,7 +821,6 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
>
>  	while (true) {
>  		bool success = false;
> -		int ret;
>
>  		rcu_read_lock();
>  		if (percpu_ref_tryget_live(&q->q_usage_counter)) {
> @@ -853,14 +852,12 @@ int blk_queue_enter(struct request_queue *q, blk_mq_req_flags_t flags)
>  		 */
>  		smp_rmb();
>
> -		ret = wait_event_interruptible(q->mq_freeze_wq,
> -				(atomic_read(&q->mq_freeze_depth) == 0 &&
> -				 (preempt || !blk_queue_preempt_only(q))) ||
> -				blk_queue_dying(q));
> +		wait_event(q->mq_freeze_wq,
> +			   (atomic_read(&q->mq_freeze_depth) == 0 &&
> +			    (preempt || !blk_queue_preempt_only(q))) ||
> +			   blk_queue_dying(q));
>  		if (blk_queue_dying(q))
>  			return -ENODEV;
> -		if (ret)
> -			return ret;
>  	}
>  }
>
> --
> 2.17.1
>
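
For anyone reading along, here is a rough userspace analogy of the bug class the quoted patch addresses. It is not kernel code and does not touch blk_queue_enter(); it only assumes that nanosleep() stands in for the wait on mq_freeze_wq and that SIGALRM stands in for the SIGUSR1 from the test in the commit message. All function names (arm_alarm_ms, wait_interruptible_ms, wait_uninterruptible_ms, signal_wait_demo) are made up for this sketch. The first helper hands the interruption back to its caller as an error, much as the old code returned the result of wait_event_interruptible(); the second keeps waiting until the wait really completes, which is roughly what switching to wait_event() achieves.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Empty handler: its only job is to make blocking calls fail with EINTR. */
static void on_alarm(int signo) { (void)signo; }

/* Hypothetical helper: deliver one SIGALRM after `ms` milliseconds. */
int arm_alarm_ms(long ms)
{
	struct sigaction sa;
	struct itimerval it;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -errno;

	memset(&it, 0, sizeof(it));	/* it_interval == 0: one-shot timer */
	it.it_value.tv_sec = ms / 1000;
	it.it_value.tv_usec = (ms % 1000) * 1000;
	if (setitimer(ITIMER_REAL, &it, NULL) != 0)
		return -errno;
	return 0;
}

/* Analogue of the old behaviour: give up as soon as a signal arrives. */
int wait_interruptible_ms(long ms)
{
	struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };

	if (nanosleep(&req, NULL) != 0)
		return -errno;		/* -EINTR leaks out to the caller */
	return 0;
}

/* Analogue of the fixed behaviour: a signal never ends the wait early. */
int wait_uninterruptible_ms(long ms)
{
	struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
	struct timespec rem;

	while (nanosleep(&req, &rem) != 0) {
		if (errno != EINTR)
			return -errno;
		req = rem;		/* resume with the time still remaining */
	}
	return 0;
}

/* Usage: a signal lands 10 ms into a 200 ms wait in both cases. */
int signal_wait_demo(void)
{
	assert(arm_alarm_ms(10) == 0);
	assert(wait_interruptible_ms(200) == -EINTR);	/* wait aborted */
	assert(arm_alarm_ms(10) == 0);
	assert(wait_uninterruptible_ms(200) == 0);	/* wait completed */
	return 0;
}
```

The analogy is loose: in the kernel the interrupted wait surfaced to userspace as -EIO or SIGBUS rather than as EINTR, and wait_event() sleeps uninterruptibly instead of retrying in a loop. The shared point is that a caller with no sensible way to handle "interrupted" should not be handed that result.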