Re: [PATCH 10/12] block: move blk_exit_queue into disk_release

Ming Lei <ming.lei@xxxxxxxxxx> · Wed, 23 Feb 2022 14:56:37 +0800

On Tue, Feb 22, 2022 at 10:29:47AM -0800, Bart Van Assche wrote:
> On 2/22/22 06:14, Christoph Hellwig wrote:
> > From: Ming Lei <ming.lei@xxxxxxxxxx>
> > 
> > There can't be FS IO in disk_release(), so move blk_exit_queue() there.
> > 
> > We still need to freeze queue here since the request is freed after the
> > bio is completed and passthrough request rely on scheduler tags as well.
> > 
> > The disk can be released before or after queue is cleaned up, and we have
> > to free the scheduler request pool before blk_cleanup_queue returns,
> > while the static request pool has to be freed before exiting the
> > I/O scheduler.
> 
> This patch looks dubious to me because:
> - The blk_freeze_queue() call in blk_cleanup_queue() waits for pending
>   requests to finish, so why to move blk_exit_queue() from
>   blk_cleanup_queue() into disk_release()?

scsi disk may be released before calling blk_cleanup_queue(), and we
want to tear down all FS related stuff(cgroup, rqos, elevator) in disk_release().

And FS bios have been drained already when releasing disk.

> - I'm concerned that this patch will break user space, e.g. scripts that
>   try to unload an I/O scheduler kernel module immediately after having
>   removed a request queue.

When removing a request queue, the associated disk has been removed
already, and queue's kobject has been deleted too, so how can userspace
unload I/O scheduler at that time?

> 
> > +static void blk_mq_release_queue(struct request_queue *q)
> > +{
> > +	blk_mq_cancel_work_sync(q);
> > +
> > +	/*
> > +	 * There can't be any non non-passthrough bios in flight here, but
> > +	 * requests stay around longer, including passthrough ones so we
> > +	 * still need to freeze the queue here.
> > +	 */
> > +	blk_mq_freeze_queue(q);
> 
> The above comment should be elaborated since what matters in this context is
> not whether or not any bios are still in flight but what happens with the
> request structures.

Yeah, bios have been done, but request is done after bio is ended, see
blk_update_request(), that is why we added blk_mq_freeze_queue() here.

> As you know blk_queue_enter() fails after the DYING flag
> has been set, a flag that is set by blk_cleanup_queue().blk_cleanup_queue()
> already freezes the queue. So why is it necessary to call
> blk_mq_freeze_queue() from blk_mq_release_queue()?

disk may be released before calling blk_cleanup_queue().

But I admit here the name of blk_mq_release_queue() is very misleading,
maybe blk_mq_release_io_queue() is better?

Thanks,
Ming