Re: [PATCH V2] SCSI: fix queue cleanup race before queue initialization is done

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/14/18 1:25 AM, Ming Lei wrote:
> c2856ae2f315d ("blk-mq: quiesce queue before freeing queue") has
> already fixed this race, however the implied synchronize_rcu()
> in blk_mq_quiesce_queue() can slow down LUN probe a lot, so caused
> performance regression.
> 
> Then 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
> tried to quiesce queue for avoiding unnecessary synchronize_rcu()
> only when queue initialization is done, because it is usual to see
> lots of inexistent LUNs which need to be probed.
> 
> However, turns out it isn't safe to quiesce queue only when queue
> initialization is done. Because when one SCSI command is completed,
> the user of sending command can be waken up immediately, then the
> scsi device may be removed, meantime the run queue in scsi_end_request()
> is still in-progress, so kernel panic can be caused.
> 
> In Red Hat QE lab, there are several reports about this kind of kernel
> panic triggered during kernel booting.
> 
> This patch tries to address the issue by grabing one queue usage
> counter during freeing one request and the following run queue.

Thanks applied, this bug was elusive but ever present in recent
testing that we did internally, it's been a huge pain in the butt.
The symptoms were usually a crash in blk_mq_get_driver_tag() with
hctx->tags == NULL, or a crash inside deadline request insert off
requeue.

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux