Re: [PATCH 5/7] scsi: ufs: core: Simplify ufshcd_err_handling_prepare()

Bart Van Assche <bvanassche@xxxxxxx> · Mon, 21 Oct 2024 13:41:21 -0700

On 10/21/24 2:43 AM, Peter Wang (王信友) wrote:
> Using blk_mq_quiesce_tagset instead of ufshcd_scsi_block_requests
> could cause issues. After the patch below was merged, Mediatek
> received three cases of IO hang.
> 77691af484e2 ("scsi: ufs: core: Quiesce request queues before checking
> pending cmds")
> I think this patch might need to be reverted first.
>
> Here is backtrace of IO hang.
> ppid=3952 pid=3952 D cpu=6 prio=120 wait=188s kworker/u16:0
> 	vmlinux __synchronize_srcu() + 216
> </proc/self/cwd/common/kernel/rcu/srcutree.c:1386>
> 	vmlinux synchronize_srcu() + 276
> </proc/self/cwd/common/kernel/rcu/srcutree.c:0>
> 	vmlinux blk_mq_wait_quiesce_done() + 20
> </proc/self/cwd/common/block/blk-mq.c:226>
> 	vmlinux blk_mq_quiesce_tagset() + 156
> </proc/self/cwd/common/block/blk-mq.c:286>
> 	vmlinux ufshcd_clock_scaling_prepare(timeout_us=1000000) + 16
> </proc/self/cwd/common/drivers/ufs/core/ufshcd.c:1276>
> 	vmlinux ufshcd_devfreq_scale() + 52
> </proc/self/cwd/common/drivers/ufs/core/ufshcd.c:1322>
> 	vmlinux ufshcd_devfreq_target() + 384
> </proc/self/cwd/common/drivers/ufs/core/ufshcd.c:1440>
> 	vmlinux devfreq_set_target(flags=0) + 184
> </proc/self/cwd/common/drivers/devfreq/devfreq.c:363>
> 	vmlinux devfreq_update_target(freq=0) + 296
> </proc/self/cwd/common/drivers/devfreq/devfreq.c:429>
> 	vmlinux update_devfreq() + 8
> </proc/self/cwd/common/drivers/devfreq/devfreq.c:444>
> 	vmlinux devfreq_monitor() + 48
> </proc/self/cwd/common/drivers/devfreq/devfreq.c:460>
> 	vmlinux process_one_work() + 476
> </proc/self/cwd/common/kernel/workqueue.c:2643>
> 	vmlinux process_scheduled_works() + 580
> </proc/self/cwd/common/kernel/workqueue.c:2717>
> 	vmlinux worker_thread() + 576
> </proc/self/cwd/common/kernel/workqueue.c:2798>
> 	vmlinux kthread() + 272
> </proc/self/cwd/common/kernel/kthread.c:388>
> 	vmlinux 0xFFFFFFE239A164EC()
> </proc/self/cwd/common/arch/arm64/kernel/entry.S:846>

Hi Peter,

Thank you very much for having reported this hang early. Would it be
possible for you to test the patch below on top of this patch series?
I think the root cause of the hang that you reported is in the block
layer.

Thanks,

Bart.

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 7b02188feed5..7482e682deca 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -283,8 +283,9 @@ void blk_mq_quiesce_tagset(struct blk_mq_tag_set *set)
 		if (!blk_queue_skip_tagset_quiesce(q))
 			blk_mq_quiesce_queue_nowait(q);
 	}
-	blk_mq_wait_quiesce_done(set);
 	mutex_unlock(&set->tag_list_lock);
+
+	blk_mq_wait_quiesce_done(set);
 }
 EXPORT_SYMBOL_GPL(blk_mq_quiesce_tagset);
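
The reordering in the diff above can be illustrated with a minimal user-space sketch. It is not kernel code: the demo_* names are hypothetical, a pthread mutex stands in for set->tag_list_lock, and demo_wait_quiesce_done() only records whether the lock was free instead of waiting for a real SRCU grace period. It is meant to show the shape of the change (flag every queue under the lock, drop the lock, then wait), not to reproduce the reported hang.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define DEMO_NQUEUES 3

/* Hypothetical stand-in for struct blk_mq_tag_set. */
struct demo_tag_set {
	pthread_mutex_t tag_list_lock;
	bool quiesced[DEMO_NQUEUES];
	bool lock_free_during_wait;
};

/*
 * Stand-in for blk_mq_wait_quiesce_done(). A real implementation would
 * block here; this one only probes whether tag_list_lock is available
 * at the moment the wait would start.
 */
static void demo_wait_quiesce_done(struct demo_tag_set *set)
{
	if (pthread_mutex_trylock(&set->tag_list_lock) == 0) {
		set->lock_free_during_wait = true;
		pthread_mutex_unlock(&set->tag_list_lock);
	} else {
		set->lock_free_during_wait = false;
	}
}

/* Same ordering as the patched blk_mq_quiesce_tagset(). */
static void demo_quiesce_tagset(struct demo_tag_set *set)
{
	/* Mark every queue as quiesced while holding the list lock. */
	pthread_mutex_lock(&set->tag_list_lock);
	for (int i = 0; i < DEMO_NQUEUES; i++)
		set->quiesced[i] = true;
	pthread_mutex_unlock(&set->tag_list_lock);

	/* Wait only after the lock has been dropped. */
	demo_wait_quiesce_done(set);
}

/* Small self-check; returns 0 if the ordering holds. */
int demo_selfcheck(void)
{
	struct demo_tag_set set = {
		.tag_list_lock = PTHREAD_MUTEX_INITIALIZER,
	};

	demo_quiesce_tagset(&set);
	for (int i = 0; i < DEMO_NQUEUES; i++)
		assert(set.quiesced[i]);
	assert(set.lock_free_during_wait);
	return 0;
}
```

With this ordering, another thread that needs the mutex does not have to sit behind the wait. Whether that is enough to resolve the reported hang is what the requested test should show.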