Does the following fix the observation? The rationale being that there's no reason to spin on the current wait state, which is already being handled; let subsequent clearings proceed to the next inevitable wait state immediately.

---
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 624fa7f118d1..47bf7882210b 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -634,6 +634,13 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
 
 	*nr -= sub;
 
+	/*
+	 * Increase wake_index before updating wait_cnt, otherwise concurrent
+	 * callers can see valid wait_cnt in old waitqueue, which can cause
+	 * invalid wakeup on the old waitqueue.
+	 */
+	sbq_index_atomic_inc(&sbq->wake_index);
+
 	/*
 	 * When wait_cnt == 0, we have to be particularly careful as we are
 	 * responsible to reset wait_cnt regardless whether we've actually
@@ -660,13 +667,6 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
 	 * of atomic_set().
 	 */
 	smp_mb__before_atomic();
-
-	/*
-	 * Increase wake_index before updating wait_cnt, otherwise concurrent
-	 * callers can see valid wait_cnt in old waitqueue, which can cause
-	 * invalid wakeup on the old waitqueue.
-	 */
-	sbq_index_atomic_inc(&sbq->wake_index);
 	atomic_set(&ws->wait_cnt, wake_batch);
 
 	return ret || *nr;
--
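
For illustration only, a minimal userspace sketch of the ordering the patch is after. This is not the kernel code: NR_WAITQUEUES, WAKE_BATCH, wake_ptr() and wake_up_one() are made-up stand-ins for the sbitmap internals, with the cmpxchg loop and memory barriers stripped out. The point is just the order of the two stores in the rollover path: bumping wake_index before restoring wait_cnt steers any concurrent caller that re-reads wake_index to the next waitqueue, instead of letting it see a freshly reset wait_cnt on the old one.

#include <stdatomic.h>
#include <stdbool.h>

#define NR_WAITQUEUES	8
#define WAKE_BATCH	4

struct waitqueue {
	atomic_int wait_cnt;		/* wakeups left before rolling over */
};

static struct waitqueue wqs[NR_WAITQUEUES];
static atomic_uint wake_index;

static void init_waitqueues(void)
{
	for (int i = 0; i < NR_WAITQUEUES; i++)
		atomic_store(&wqs[i].wait_cnt, WAKE_BATCH);
}

/* Which waitqueue should the current wakeup be charged against? */
static struct waitqueue *wake_ptr(void)
{
	return &wqs[atomic_load(&wake_index) % NR_WAITQUEUES];
}

/* One completion: consume one wakeup credit, roll over when exhausted. */
static bool wake_up_one(void)
{
	struct waitqueue *ws = wake_ptr();
	int cnt = atomic_fetch_sub(&ws->wait_cnt, 1) - 1;

	if (cnt > 0)
		return true;		/* batch not exhausted yet */
	if (cnt < 0)
		return false;		/* another caller owns the rollover */

	/*
	 * cnt == 0: we own the rollover. Bump wake_index *first*, so a
	 * concurrent wake_up_one() that calls wake_ptr() from here on is
	 * pointed at the next waitqueue, then restore the batch for the
	 * next time wake_index wraps back around to this one.
	 */
	atomic_fetch_add(&wake_index, 1);
	atomic_store(&ws->wait_cnt, WAKE_BATCH);
	return true;
}

int main(void)
{
	init_waitqueues();
	for (int i = 0; i < 3 * WAKE_BATCH; i++)
		wake_up_one();
	/* wake_index has advanced three times; no rollover re-used an old ws. */
	return 0;
}

With the reversed order (restore wait_cnt first, then bump wake_index), a second caller racing in between the two stores would still compute wake_ptr() on the old waitqueue and consume the freshly restored batch there, which is the stale-waitqueue wakeup the comment in the patch describes.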