OK, thanks.

-----Original Message-----
From: chenxiang (M)
Sent: November 2, 2017 20:34
To: Zouming (IT) <zouming.zouming@xxxxxxxxxx>; linux-block@xxxxxxxxxxxxxxx; axboe@xxxxxx
Cc: wangzhoumengjian <wangzhoumengjian@xxxxxxxxxx>
Subject: Re: [bug report after v4.5-rc1]block: When the scsi device has a timeout IO, the scsi device is stuck when it is deleted

On 2017/11/2 20:16, Zouming (IT) wrote:
> 1. Steps to reproduce:
> (1) Send IO to the device /dev/sdx.
> (2) Simulate a lost IO.
> (3) Use the command below to delete the scsi device before the IO times out:
> echo 1 > /sys/class/sdx/device/delete
>
> 2. The stack of the delete thread is as follows:
> [<ffffffff810999ef>] msleep+0x2f/0x40
> [<ffffffff812f78b4>] __blk_drain_queue+0xa4/0x170
> [<ffffffff812f7bfd>] blk_cleanup_queue+0x13d/0x150
> [<ffffffff81473d2a>] __scsi_remove_device+0x4a/0xd0
> [<ffffffff81473dd6>] scsi_remove_device+0x26/0x40
> [<ffffffff81473e05>] sdev_store_delete_callback+0x15/0x20
> [<ffffffff8127fdc4>] sysfs_schedule_callback_work+0x14/0x60
> [<ffffffff810a881a>] process_one_work+0x17a/0x440
> [<ffffffff810a94e6>] worker_thread+0x126/0x3c0
> [<ffffffff810b098f>] kthread+0xcf/0xe0
> [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
>
> 3. The reason is as follows:
> (1) When the scsi device is deleted, blk_cleanup_queue() is invoked to
> mark the request_queue as dying and wait for all IO to return.
>
> (2) When the IO times out, the timeout workqueue invokes blk_timeout_work() to abort the IO,
> but the IO is not aborted: blk_timeout_work() calls blk_queue_enter(), which
> sees that the request_queue is dying and returns directly without doing anything.

Hi Zouming,

You can test Bart's patch "[PATCH] block: Fix a race between blk_cleanup_queue() and timeout handling" for this issue. I think this patch can solve your issue.
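
For anyone following the thread, the hang described in point 3 of the report can be modelled in a few lines of Python. This is only a toy sketch under stated assumptions, not kernel code: the RequestQueue class, its methods, the skip_if_dying switch and the max_polls bound are all hypothetical names made up for illustration. The real drain loop sleeps and retries without a bound, which is why the delete thread sits in msleep().

```python
class RequestQueue:
    """Toy stand-in for a block-layer request queue (hypothetical, not kernel code)."""

    def __init__(self):
        self.dying = False
        self.in_flight = set()

    def submit(self, tag):
        # Track an outstanding request by an arbitrary tag.
        self.in_flight.add(tag)

    def queue_enter(self):
        # Models the check from the report: entering a dying queue fails.
        return not self.dying

    def timeout_work(self, skip_if_dying=True):
        # Models the timeout handler. With skip_if_dying=True it returns
        # early on a dying queue, so the timed-out IO is never aborted.
        if skip_if_dying and not self.queue_enter():
            return
        self.in_flight.clear()  # "abort" every timed-out request

    def cleanup(self, max_polls, skip_if_dying=True):
        # Models the delete path: mark the queue dying, then poll until
        # it drains. max_polls bounds the loop so the sketch terminates.
        self.dying = True
        for _ in range(max_polls):
            if not self.in_flight:
                return True
            self.timeout_work(skip_if_dying)
        return not self.in_flight


def drains(skip_if_dying):
    # One lost IO is outstanding when the device is deleted.
    q = RequestQueue()
    q.submit("lost-io")
    return q.cleanup(max_polls=5, skip_if_dying=skip_if_dying)


print(drains(skip_if_dying=True))   # handler bails out on a dying queue
print(drains(skip_if_dying=False))  # handler is allowed to abort the IO
```

The first call never drains because the only code that could complete the lost IO refuses to run once the queue is marked dying. The second call only shows that the drain finishes when the timeout handler is allowed to run; it is not meant to describe how Bart's patch actually resolves the race.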