[PATCH] scsi: ufs: core: quiesce request queues before check pending cmds

Ziqi Chen <quic_ziqichen@xxxxxxxxxxx> · Fri, 7 Jun 2024 18:06:23 +0800

In ufshcd_clock_scaling_prepare(), after scsi layer is blocked,
ufshcd_pending_cmds() is called to tell whether there are pending
transactions or not. And only if there is no pending transaction,
can we proceed to kick start clock scaling sequence.

ufshcd_pending_cmds() traverses over all scsi devices and calls
sbitmap_weight() on their budget_map. The sbitmap_weight() can break
down to three steps -
1. Calculates the nr outstanding bits set in the 'word' bitmap.
2. Calculates the nr outstanding bits set in the 'cleared' bitmap.
3. Minus the result from step 1 by the result from step 2.

There can be a race condition in below scenario -

Assume there is one pending transaction in the request queue of one scsi
device, say sda, and the budget token of this request is 0, the 'word'
is 0x1 and the 'cleared' is 0x0.

1. When step 1 executes, it gets the result as 1.
2. Before step 2 executes, block layer tries to dispatch a new request
   to sda. Since scsi layer is blocked, the request cannot pass through
   scsi layer, but the block layer would anyways do budget_get() and
   budget_put() to sda's budget map, so the 'word' has become 0x3 and
   'cleared' has become 0x2 (assume the new request got budget token 1).
3. When step 2 executes, it gets the result as 1.
4. When step 3 executes, it gets the result as 0, meaning there is no
   pending transactions, which is wrong.

Thread A                        Thread B
ufshcd_pending_cmds()           __blk_mq_sched_dispatch_requests()
|                               |
sbitmap_weight(word)            |
|                               scsi_mq_get_budget()
|                               |
|                               scsi_mq_put_budget()
|                               |
sbitmap_weight(cleared)
...

When this race condition happens, clock scaling sequence is kicked start
with transactions still in flight, leading to subsequent hibernate enter
failure, broken link, task abort and back to back error recovery.

Fix this race condition by quiescing the request queues before calling
ufshcd_pending_cmds() so that block layer won't touch the budget map
when ufshcd_pending_cmds() is working on it. In addition, remove the
scsi layer blocking/unblocking to reduce redundancies and latencies.

Fixes: 8d077ede48c1 ("scsi: ufs: Optimize the command queueing code")
Co-developed-by: Can Guo <quic_cang@xxxxxxxxxxx>
Signed-off-by: Can Guo <quic_cang@xxxxxxxxxxx>
Signed-off-by: Ziqi Chen <quic_ziqichen@xxxxxxxxxxx>
---
 drivers/ufs/core/ufshcd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index 21429ee..1afa862 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1392,7 +1392,7 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
 	 * make sure that there are no outstanding requests when
 	 * clock scaling is in progress
 	 */
-	ufshcd_scsi_block_requests(hba);
+	blk_mq_quiesce_tagset(&hba->host->tag_set);
 	mutex_lock(&hba->wb_mutex);
 	down_write(&hba->clk_scaling_lock);
 
@@ -1401,7 +1401,7 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba, u64 timeout_us)
 		ret = -EBUSY;
 		up_write(&hba->clk_scaling_lock);
 		mutex_unlock(&hba->wb_mutex);
-		ufshcd_scsi_unblock_requests(hba);
+		blk_mq_unquiesce_tagset(&hba->host->tag_set);
 		goto out;
 	}
 
@@ -1422,7 +1422,7 @@ static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err, bool sc
 
 	mutex_unlock(&hba->wb_mutex);
 
-	ufshcd_scsi_unblock_requests(hba);
+	blk_mq_unquiesce_tagset(&hba->host->tag_set);
 	ufshcd_release(hba);
 }
 
-- 
2.7.4