On Thu, Sep 16, 2021 at 04:20:09PM +0200, Christoph Hellwig wrote: > On Thu, Sep 16, 2021 at 08:38:16PM +0800, Ming Lei wrote: > > > > and it may cause other trouble at least for scsi disk since sd_shutdown() > > > > follows del_gendisk() and has to be called before blk_cleanup_queue(). > > > > > > Yes. So we need to move the bits of blk_cleanup_queue that deal with > > > the file system I/O state to del_gendisk, and keep blk_cleanup_queue > > > for anything actually needed for the low-level queue. > > > > Can you explain what the bits are in blk_cleanup_queue() for dealing with FS > > I/O state? blk_cleanup_queue() drains and shutdown the queue basically, > > all shouldn't be related with gendisk, and it is fine to implement one > > queue without gendisk involved, such as nvme admin, connect queue or > > sort of stuff. > > > > Wrt. this reported issue, rq_qos_exit() needs to run before releasing > > gendisk, but queue has to put into freezing before calling > > rq_qos_exit(), > > I was curious what you hit, but yes rq_qos_exit is obvious. > blk_flush_integrity also is very much about fs I/O state. > > > > > so looks you suggest to move the following code into > > del_gendisk()? > > something like that. I think we need to split the dying flag into > one for the gendisk and one for the queue first, and make sure the > queue freeze in del_gendisk is released again so that passthrough > still works after. If we do that, q->disk is really unnecessary, so looks the fix of 'd152c682f03c block: add an explicit ->disk backpointer to the request_queue' isn't good. The original issue added in 'edb0872f44ec block: move the bdi from the request_queue to the gendisk' can be fixed simply by moving the two lines code in blk_unregister_queue() to blk_cleanup_queue(): kobject_uevent(&q->kobj, KOBJ_REMOVE); kobject_del(&q->kobj); Thanks, Ming