Re: [PATCH] scsi: core: cleanup request queue before releasing gendisk

Christoph Hellwig <hch@xxxxxx> · Fri, 17 Sep 2021 09:56:50 +0200

On Fri, Sep 17, 2021 at 03:41:05PM +0800, Ming Lei wrote:
> >  
> > -	return ret;
> > +	if (unlikely(!disk_live(disk))) {
> > +		blk_queue_exit(disk->queue);
> > +		bio_io_error(bio);
> > +		return -ENODEV;
> > +	}
> 
> Is it safe to fail IO in this way? There is still opened bdev, and
> usually IO can be done successfully even when disk is being deleted.

"normal" I/O should not really happen by the time it is deleted.  That
being said we should do this only after the fsync is done. While no
one should rely on that I'm pretty sure some file systems do.
So we'll actually need a deleted flag.

> Not mention it adds one extra check in real fast path.

I'm not really woried about the check itself.  That being
sais this inode cache line is not hot right now, so moving it to
disk->state will help as we need to check the read-only flag in
the the I/O submission path anyway.

> > +	/*
> > +	 * Prevent new I/O from crossing bio_queue_enter().
> > +	 */
> > +	blk_freeze_queue_start(q);
> > +	if (queue_is_mq(q))
> > +		blk_mq_wake_waiters(q);
> > +	/* Make blk_queue_enter() reexamine the DYING flag. */
> > +	wake_up_all(&q->mq_freeze_wq);
> > +
> > +	rq_qos_exit(q);
> 
> rq_qos_exit() requires the queue to be frozen, otherwise kernel oops
> can be triggered. There may be old submitted bios not completed yet,
> and rq_qos_done() can be referring to q->rq_qos.

Yes.  I actually misread the old code - it atually does two
blk_freeze_queue_start, but it also includes the wait.

> But if waiting for queue frozen is added, one extra freeze is added,
> which will slow done remove disk/queue.

Is it?  For the typical case the second free in blk_cleanp_queue will
be bsically free because there is no pending I/O.  The only case
where that matters if if there is pending passthrough I/O again,
which can only happen with SCSI, and even there is highly unusual.