On Sat, Apr 9, 2022 at 7:25 AM Bart Van Assche <bvanassche@xxxxxxx> wrote:
>
> On 4/8/22 15:50, Bob Pearson wrote:
> > Actually it doesn't hang forever but I get the following
> >
> > ......
> > [ 107.579576] sd 4:0:0:0: [sdb] Synchronizing SCSI cache
> >
> > [ 291.970133] sd 4:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_TIME_OUT driverbyte=DRIVER_OK
> >
> > [ 292.247547] rdma_rxe: unloaded
> >
> > So it waits for about 3 minutes for something and then gives up.
>
> (+Christoph)
>
> Hi Bob,
>
> I can reproduce this behavior with the Soft-iWARP driver by running the
> following command:
>
> cd blktests && use_siw=1 ./check -q srp/001
>
> Christoph, the call stack involved in this issue is as follows:
>
> __schedule+0x4c3/0xd20
> schedule+0x82/0x110
> schedule_timeout+0x122/0x200
> io_schedule_timeout+0x7b/0xc0
> __wait_for_common+0x2bc/0x380
> wait_for_completion_io_timeout+0x1d/0x20
> blk_execute_rq+0x1db/0x200
> __scsi_execute+0x1fb/0x310
> sd_sync_cache+0x155/0x2c0 [sd_mod]
> sd_shutdown+0xbb/0x190 [sd_mod]
> sd_remove+0x5b/0x80 [sd_mod]
> device_remove+0x9a/0xb0
> device_release_driver_internal+0x2c5/0x360
> device_release_driver+0x12/0x20
> bus_remove_device+0x1aa/0x270
> device_del+0x2d4/0x640
> __scsi_remove_device+0x168/0x1a0
> scsi_forget_host+0xa8/0xb0
> scsi_remove_host+0x9b/0x150
> sdebug_driver_remove+0x3d/0x140 [scsi_debug]
> device_remove+0x6f/0xb0
> device_release_driver_internal+0x2c5/0x360
> device_release_driver+0x12/0x20
> bus_remove_device+0x1aa/0x270
> device_del+0x2d4/0x640
> device_unregister+0x18/0x70
> sdebug_do_remove_host+0x138/0x180 [scsi_debug]
> scsi_debug_exit+0x45/0xd5 [scsi_debug]
> __do_sys_delete_module.constprop.0+0x210/0x320
> __x64_sys_delete_module+0x1f/0x30
> do_syscall_64+0x35/0x80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> One of the functions in the above call stack is sd_remove(). sd_remove()
> calls del_gendisk() just before calling sd_shutdown(). sd_shutdown()
> submits the SYNCHRONIZE CACHE command. In del_gendisk() I found the
> following comment: "Fail any new I/O". Do you agree that failing new I/O
> before sd_shutdown() is called is wrong? Is there any other way to fix
> this than moving the blk_queue_start_drain() etc. calls out of
> del_gendisk() and into a new function?
>
> Thanks,
>
> Bart.
>

I reported/bisected for this issue last week, not sure if it helped.
https://lore.kernel.org/linux-block/CAHj4cs9OTm9sb_5fmzgz+W9OSLeVPKix3Yri856kqQVccwd_Mw@xxxxxxxxxxxxxx/T/#t

--
Best Regards,
Yi Zhang