On Mon, Feb 14, 2022 at 10:51:07AM +0100, Markus Blöchl wrote:
> After the surprise removal of a mounted NVMe disk the pciehp task
> reliably hangs forever with a trace similar to this one:
>
> INFO: task irq/43-pciehp:64 blocked for more than 120 seconds.
> Call Trace:
>  <TASK>
>  __bio_queue_enter
>  blk_mq_submit_bio
>  submit_bio_noacct
>  submit_bio_wait
>  blkdev_issue_flush
>  ext4_sync_fs
>  sync_filesystem
>  fsync_bdev
>  delete_partition
>  blk_drop_partitions
>  del_gendisk
>  nvme_ns_remove
>  nvme_remove_namespaces
>  nvme_remove
>  pci_device_remove
>  __device_release_driver
>  device_release_driver
>  pci_stop_bus_device
>  pci_stop_and_remove_bus_device
>  pciehp_unconfigure_device
>  pciehp_disable_slot
>  pciehp_handle_presence_or_link_change
>  pciehp_ist
>  </TASK>
>
> I observed this with 5.15.5 from debian bullseye-backports and confirmed
> with 5.17.0-rc3 but previous kernels may be affected as well.

Thanks for the patch.

Entering the queue used to fail if blk_queue_dying() was true. That
condition was changed in commit 8e141f9eb803 ("block: drain file system
I/O on del_gendisk").

I can't actually tell if dropping the DYING flag check was intentional
or not, since the comments in blk_queue_start_drain() say otherwise.
Christoph, do you know the intention here? Should __bio_queue_enter()
check the queue DYING flag, or do you prefer drivers explicitly set the
disk state like this? It looks to me like the queue flag should be
checked, since it's already tied to the freeze wait_queue_head_t.

> @@ -4573,6 +4573,8 @@ static void nvme_set_queue_dying(struct nvme_ns *ns)
>  	if (test_and_set_bit(NVME_NS_DEAD, &ns->flags))
>  		return;
>
> +	set_bit(GD_DEAD, &ns->disk->state);
> +
>  	blk_set_queue_dying(ns->queue);
>  	nvme_start_ns_queue(ns);
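
For reference, the kind of check I have in mind would look roughly like
the below, as a completely untested sketch against the current
__bio_queue_enter() in block/blk-core.c (the GD_DEAD tests are what is
there today; the blk_queue_dying() tests are the addition):

--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ ... @@ static int __bio_queue_enter(struct request_queue *q, struct bio *bio)
 		smp_rmb();
 		wait_event(q->mq_freeze_wq,
 			   (!q->mq_freeze_depth &&
 			    blk_pm_resume_queue(false, q)) ||
+			   blk_queue_dying(q) ||
 			   test_bit(GD_DEAD, &disk->state));
-		if (test_bit(GD_DEAD, &disk->state))
+		if (blk_queue_dying(q) ||
+		    test_bit(GD_DEAD, &disk->state))
 			goto dead;

blk_queue_start_drain() already does a wake_up_all() on mq_freeze_wq
with a comment saying blk_queue_enter() should reexamine the DYING
flag, so the waiter side would only need to actually test it; the
existing smp_rmb() before the wait should cover the ordering.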