On Fri, Jun 16, 2023 at 07:48:00AM +0200, Christoph Hellwig wrote:
> On Thu, Jun 15, 2023 at 11:43:51PM +0800, Ming Lei wrote:
> > On Thu, Jun 15, 2023 at 09:16:27AM -0600, Keith Busch wrote:
> > > On Thu, Jun 15, 2023 at 10:32:33PM +0800, Ming Lei wrote:
> > > > NVMe calls freeze/unfreeze in different contexts, and controller removal
> > > > may break in-progress error recovery, then leave queues in frozen state.
> > > > So cause IO hang in del_gendisk() because pending writeback IOs are
> > > > still waited in bio_queue_enter().
> > >
> > > Shouldn't those writebacks be unblocked by the existing check in
> > > bio_queue_enter, test_bit(GD_DEAD, &disk->state))? Or are we missing a
> > > disk state update or wakeup on this condition?
> >
> > GD_DEAD is only set if the device is really dead, then all pending IO
> > will be failed.
>
> del_gendisk also sets GD_DEAD early on.

No.

The hang happens in fsync_bdev() of del_gendisk(), and there are IOs pending on
bio_queue_enter().

Thanks,
Ming