On Wed, Jun 01, 2022 at 08:43:29AM +0200, Christoph Hellwig wrote:
> On Wed, Jun 01, 2022 at 08:54:30AM +0800, Ming Lei wrote:
> > This way can't be safe, who can guarantee that all sync submission
> > activities are gone after queue is frozen? We had lots of reports on
> > blk_mq_sched_has_work() which triggers UAF.
>
> Yes, we probably need a blk_mq_quiesce_queue call like in the incremental
> patch below.  Do you have any good reproducer, though?

blktests block/027 should cover this.

>
> diff --git a/block/genhd.c b/block/genhd.c
> index 9914d0f24fecd..155b64ff991f6 100644
> --- a/block/genhd.c
> +++ b/block/genhd.c
> @@ -652,9 +652,13 @@ void del_gendisk(struct gendisk *disk)
>  	blk_mq_cancel_work_sync(q);
>
>  	if (q->elevator) {
> +		blk_mq_quiesce_queue(q);
> +
>  		mutex_lock(&q->sysfs_lock);
>  		elevator_exit(q);
>  		mutex_unlock(&q->sysfs_lock);
> +
> +		blk_mq_unquiesce_queue(q);
>  	}
>

I am afraid the above approach may slow down disk shutdown a lot; see the
following commit. That is also the reason why I moved it into the disk
release handler, by which point all sync I/O submissions are done.

commit 1311326cf4755c7ffefd20f576144ecf46d9906b
Author: Ming Lei <ming.lei@xxxxxxxxxx>
Date:   Mon Jun 25 19:31:49 2018 +0800

    blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()

    SCSI probing may synchronously create and destroy a lot of request_queues
    for non-existent devices. Any synchronize_rcu() in queue creation or
    destroy path may introduce long latency during booting, see detailed
    description in comment of blk_register_queue().

    This patch removes one synchronize_rcu() inside blk_cleanup_queue()
    for this case, commit c2856ae2f315d75(blk-mq: quiesce queue before freeing queue)
    needs synchronize_rcu() for implementing blk_mq_quiesce_queue(), but
    when queue isn't initialized, it isn't necessary to do that since
    only pass-through requests are involved, no original issue in
    scsi_execute() at all.

    Without this patch and previous one, it may take more 20+ seconds for
    virtio-scsi to complete disk probe. With the two patches, the time becomes
    less than 100ms.

Thanks,
Ming