On Tue, Feb 22, 2022 at 10:29:47AM -0800, Bart Van Assche wrote: > On 2/22/22 06:14, Christoph Hellwig wrote: > > From: Ming Lei <ming.lei@xxxxxxxxxx> > > > > There can't be FS IO in disk_release(), so move blk_exit_queue() there. > > > > We still need to freeze queue here since the request is freed after the > > bio is completed and passthrough request rely on scheduler tags as well. > > > > The disk can be released before or after queue is cleaned up, and we have > > to free the scheduler request pool before blk_cleanup_queue returns, > > while the static request pool has to be freed before exiting the > > I/O scheduler. > > This patch looks dubious to me because: > - The blk_freeze_queue() call in blk_cleanup_queue() waits for pending > requests to finish, so why to move blk_exit_queue() from > blk_cleanup_queue() into disk_release()? scsi disk may be released before calling blk_cleanup_queue(), and we want to tear down all FS related stuff(cgroup, rqos, elevator) in disk_release(). And FS bios have been drained already when releasing disk. > - I'm concerned that this patch will break user space, e.g. scripts that > try to unload an I/O scheduler kernel module immediately after having > removed a request queue. When removing a request queue, the associated disk has been removed already, and queue's kobject has been deleted too, so how can userspace unload I/O scheduler at that time? > > > +static void blk_mq_release_queue(struct request_queue *q) > > +{ > > + blk_mq_cancel_work_sync(q); > > + > > + /* > > + * There can't be any non non-passthrough bios in flight here, but > > + * requests stay around longer, including passthrough ones so we > > + * still need to freeze the queue here. > > + */ > > + blk_mq_freeze_queue(q); > > The above comment should be elaborated since what matters in this context is > not whether or not any bios are still in flight but what happens with the > request structures. Yeah, bios have been done, but request is done after bio is ended, see blk_update_request(), that is why we added blk_mq_freeze_queue() here. > As you know blk_queue_enter() fails after the DYING flag > has been set, a flag that is set by blk_cleanup_queue().blk_cleanup_queue() > already freezes the queue. So why is it necessary to call > blk_mq_freeze_queue() from blk_mq_release_queue()? disk may be released before calling blk_cleanup_queue(). But I admit here the name of blk_mq_release_queue() is very misleading, maybe blk_mq_release_io_queue() is better? Thanks, Ming