On (24/10/08 12:02), YangYang wrote:
> On 2024/10/3 16:56, Sergey Senozhatsky wrote:
> > Hello,
> >
> > I'm looking at a report from the fleet (don't have a reproducer)
> > and wondering what you and block folks might think / suggest.
> >
> > The problem is basically as follows
> >
> > CPU0
> >
> > do_syscall
> > sys_close
> > __fput
> > blkdev_release
> > blkdev_put                    grabs ->open_mutex
> > sr_block_release
> > scsi_set_medium_removal
> > ioctl_internal_command
> > scsi_execute_cmd
> > scsi_alloc_request
> > blk_mq_alloc_request
> > blk_queue_enter
> > schedule
> >
> > at the same time:
> >
> > CPU1
> >
> > usb_disconnect
> > usb_disable_device
> > device_del
> > usb_unbind_interface
> > usb_stor_disconnect
> > scsi_remove_host
> > scsi_forget_host
> > __scsi_remove_device
> > device_del
> > bus_remove_device
> > device_release_driver_internal
> > sr_remove
> > del_gendisk
> > mutex_lock                    attempts to grab ->open_mutex
> > schedule
> >
>
> I'm a little confused here. How is the queue getting frozen in this
> scenario?

I don't know. Could it be that it's PM not frozen queue that falsifies
wait_event() condition? (if that's what you are pointing at).

I have several reports (various devices, various use-cases) and the ones
that I looked at so far have the same pattern: usb_disconnect() vs
blk_queue_enter()

E.g. one of the reports:

...
sd 1:0:0:0: [sdb] Attached SCSI removable disk
usb 3-4: USB disconnect, device number 29
sd 1:0:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=15s
sd 1:0:0:0: [sdb] tag#0 CDB: Read(10) 28 00 07 47 af fd 00 00 01 00
I/O error, dev sdb, sector 122138621 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
device offline error, dev sdb, sector 122138616 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Buffer I/O error on dev sdb, logical block 15267327, async page read
...

schedule+0x4f4/0x1540
del_gendisk+0x136/0x370
sd_remove+0x30/0x60
device_release_driver_internal+0x1a2/0x2a0
bus_remove_device+0x154/0x180
device_del+0x207/0x370
__scsi_remove_device+0xc0/0x170
scsi_forget_host+0x45/0x60
scsi_remove_host+0x87/0x170
usb_stor_disconnect+0x63/0xb0
usb_unbind_interface+0xbe/0x250
device_release_driver_internal+0x1a2/0x2a0
bus_remove_device+0x154/0x180
device_del+0x207/0x370
? kobject_release+0x56/0xb0
usb_disable_device+0x72/0x170
usb_disconnect+0xeb/0x280

schedule+0x4f4/0x1540
blk_queue_enter+0x172/0x250
blk_mq_alloc_request+0x167/0x210
scsi_execute_cmd+0x65/0x240
ioctl_internal_command+0x6c/0x150
scsi_set_medium_removal+0x63/0xc0
sd_release+0x42/0x50
blkdev_put+0x13b/0x1f0
blkdev_release+0x2b/0x40
__fput_sync+0x9b/0x2c0
__se_sys_close+0x69/0xc0
do_syscall_64+0x60/0x90

Or another report:

sr 1:0:0:0: Power-on or device reset occurred
sr 1:0:0:0: [sr0] scsi3-mmc drive: 8x/24x writer dvd-ram cd/rw xa/form2 cdda tray
usb 1-1.3.1: USB disconnect, device number 27

schedule+0x554/0x1218
schedule_preempt_disabled+0x30/0x50
mutex_lock+0x3c/0x70
del_gendisk+0xe8/0x370
sr_remove+0x30/0x58 [sr_mod (HASH:d5f2 4)]
device_release_driver_internal+0x1a0/0x278
device_release_driver+0x24/0x38
bus_remove_device+0x150/0x170
device_del+0x1d0/0x348
__scsi_remove_device+0xb4/0x198
scsi_forget_host+0x5c/0x80
scsi_remove_host+0x98/0x1c8
usb_stor_disconnect+0x74/0x110
usb_unbind_interface+0xcc/0x250
device_release_driver_internal+0x1a0/0x278
device_release_driver+0x24/0x38
bus_remove_device+0x150/0x170
device_del+0x1d0/0x348
usb_disable_device+0x88/0x190
usb_disconnect+0xf8/0x318

schedule+0x554/0x1218
blk_queue_enter+0xd0/0x170
blk_mq_alloc_request+0x138/0x1e8
scsi_execute_cmd+0x88/0x258
scsi_test_unit_ready+0x88/0x118
sr_drive_status+0x5c/0x160 [sr_mod (HASH:d5f2 4)]
cdrom_ioctl+0x7d4/0x2730 [cdrom (HASH:37c3 5)]
sr_block_ioctl+0xa8/0x110 [sr_mod (HASH:d5f2 4)]
blkdev_ioctl+0x468/0xbf0
__arm64_sys_ioctl+0x254/0x6d0
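
For illustration only, here is a rough user-space sketch of the dependency in the first (sd/sr release) scenario: one path holds ->open_mutex while sleeping in blk_queue_enter(), the other needs ->open_mutex in del_gendisk(). This is not kernel code. The names `simulate`, `queue_usable` and `mutex_held` are made up for the sketch, and it assumes that whatever would wake the blk_queue_enter() waiter only happens after del_gendisk() gets past the mutex, which the reports above do not confirm. Timeouts are used so the demo terminates instead of hanging.

```python
import threading


def simulate(timeout=0.2):
    """Replay the two paths with timeouts so the demo ends instead of hanging."""
    open_mutex = threading.Lock()      # stands in for disk->open_mutex
    queue_usable = threading.Event()   # hypothetical: what blk_queue_enter() waits for
    mutex_held = threading.Event()     # scaffolding: makes the close path go first
    result = {}

    def close_path():
        # CPU0: blkdev_put() takes ->open_mutex, then sleeps in blk_queue_enter().
        with open_mutex:
            mutex_held.set()
            result["close_entered_queue"] = queue_usable.wait(timeout)

    def disconnect_path():
        # CPU1: del_gendisk() needs ->open_mutex before it can go any further.
        mutex_held.wait()
        got = open_mutex.acquire(timeout=timeout / 2)
        result["disconnect_got_mutex"] = got
        if got:
            # Assumption: the waiter could only be released past this point.
            queue_usable.set()
            open_mutex.release()

    threads = [
        threading.Thread(target=close_path),
        threading.Thread(target=disconnect_path),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print(simulate())
```

In this model neither side makes progress: the close path times out waiting for the queue, and the disconnect path times out waiting for the mutex, so both values in the returned dict are False. Without the timeouts both threads would sleep forever, which matches the two schedule() frames in each report.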