On 2021/8/9 22:13, Li Jinlin wrote:
> From: Li Jinlin <lijinlin3@xxxxxxxxxx>
>
> We found a hang issue, the test steps are as follows:
>   1. blocking device via scsi_device_set_state()
>   2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
>   3. echo none > /sys/block/sda/queue/scheduler
>   4. echo "running" >/sys/block/sda/device/state
>
> Step 3 and 4 should finish this work after step 4, but they hang.
>
>   CPU#0               CPU#1                CPU#2
>   ---------------     ----------------     ----------------
>                                            Step 1: blocking device
>
>                                            Step 2: dd xxxx
>                                                   ^^^^^^ get request
>                                                          q_usage_counter++
>
>                       Step 3: switching scheduler
>                       elv_iosched_store
>                         elevator_switch
>                           blk_mq_freeze_queue
>                             blk_freeze_queue
>                               > blk_freeze_queue_start
>                                 ^^^^^^ mq_freeze_depth++
>
>                               > blk_mq_run_hw_queues
>                                 ^^^^^^ can't run queue when dev blocked
>
>                               > blk_mq_freeze_queue_wait
>                                 ^^^^^^ Hang here!!!
>                                        wait q_usage_counter==0
>
>   Step 4: running device
>   store_state_field
>     scsi_rescan_device
>       scsi_attach_vpd
>         scsi_vpd_inquiry
>           __scsi_execute
>             blk_get_request
>               blk_mq_alloc_request
>                 blk_queue_enter
>                 ^^^^^^ Hang here!!!
>                        wait mq_freeze_depth==0
>
>     blk_mq_run_hw_queues
>     ^^^^^^ dispatch IO, q_usage_counter will reduce to zero
>
>                             blk_mq_unfreeze_queue
>                             ^^^^^ mq_freeze_depth--
>
>   Step 3 and 4 wait for each other.
>
> To fix this, we need to run queue before rescanning device when the device
> state changes to SDEV_RUNNING.
>
> Fixes: f0f82e2476f6 ("scsi: core: Fix capacity set to zero after offlinining device")
> Signed-off-by: Li Jinlin <lijinlin3@xxxxxxxxxx>
> Signed-off-by: Qiu Laibin <qiulaibin@xxxxxxxxxx>
> ---
> changes since v1 send with Message-ID:
> 20210805143231.1713299-1-lijinlin3@xxxxxxxxxx
>
> - Modify the subject to make it distinct
> - Modify the message to fix typo and make it distinct
> - Reduce the number of SOB
>
>  drivers/scsi/scsi_sysfs.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index c3a710bceba0..aa701582c950 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -809,12 +809,12 @@ store_state_field(struct device *dev, struct device_attribute *attr,
>  	ret = scsi_device_set_state(sdev, state);
>  	/*
>  	 * If the device state changes to SDEV_RUNNING, we need to
> -	 * rescan the device to revalidate it, and run the queue to
> -	 * avoid I/O hang.
> +	 * run the queue to avoid I/O hang, and rescan the device
> +	 * to revalidate it.
>  	 */
>  	if (ret == 0 && state == SDEV_RUNNING) {
> -		scsi_rescan_device(dev);
>  		blk_mq_run_hw_queues(sdev->request_queue, true);
> +		scsi_rescan_device(dev);
>  	}
>  	mutex_unlock(&sdev->state_mutex);
>
>

Ping.

Thanks,
Li Jinlin