[RFC v2 15/18] rcu: Clean up timeouts for forcing the quiescent state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This patch does some code refactoring that will help us to split the rcu
kthread into more kthread works. It fixes rather theoretical and innocent
race. Otherwise, the changes should not be visible to the user.

First, it moves the code that limits the maximal timeout into separate
functions, see normalize_jiffies*().

The commit 88d6df612cc3c99f5 ("rcu: Prevent spurious-wakeup DoS attack
on rcu_gp_kthread()") suggests that a spurious wakeup is possible.
In this case, the thread continue waiting and
wait_event_interruptible_timeout() should be called with the remaining
timeout. It is newly computed in the new variable "timeout".

wait_event_interruptible_timeout() returns "1" when the condition is true
after the timeout elapsed. This might happen when there is a race between
fulfilling the condition and the wakeup. Therefore, it is cleaner to
update "rsp->jiffies_force_qs" when QS is forced and do not rely on
the "ret" value.

Finally, this the patch moves cond_resched_rcu_qs() to a single place.
It changes the order of the check for the pending signal. But there never
should be a pending signal. If there was we would have bigger problems
because wait_event() would never sleep again until someone flushed
the signal.

Signed-off-by: Petr Mladek <pmladek@xxxxxxxx>
---
 kernel/rcu/tree.c | 77 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 53 insertions(+), 24 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5a3e70a21df8..286e300794f0 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2035,13 +2035,45 @@ static void rcu_gp_cleanup(struct rcu_state *rsp)
 }
 
 /*
+ * Normalize, update, and return the first timeout.
+ */
+static unsigned long normalize_jiffies_till_first_fqs(void)
+{
+	unsigned long j = jiffies_till_first_fqs;
+
+	if (unlikely(j > HZ)) {
+		j = HZ;
+		jiffies_till_first_fqs = HZ;
+	}
+
+	return j;
+}
+
+/*
+ * Normalize, update, and return the next timeout.
+ */
+static unsigned long normalize_jiffies_till_next_fqs(void)
+{
+	unsigned long j = jiffies_till_next_fqs;
+
+	if (unlikely(j > HZ)) {
+		j = HZ;
+		jiffies_till_next_fqs = HZ;
+	} else if (unlikely(j < 1)) {
+		j = 1;
+		jiffies_till_next_fqs = 1;
+	}
+
+	return j;
+}
+
+/*
  * Body of kthread that handles grace periods.
  */
 static int __noreturn rcu_gp_kthread(void *arg)
 {
 	int gf;
-	unsigned long j;
-	int ret;
+	unsigned long timeout, j;
 	struct rcu_state *rsp = arg;
 	struct rcu_node *rnp = rcu_get_root(rsp);
 
@@ -2071,22 +2103,18 @@ static int __noreturn rcu_gp_kthread(void *arg)
 
 		/* Handle quiescent-state forcing. */
 		rsp->first_gp_fqs = true;
-		j = jiffies_till_first_fqs;
-		if (j > HZ) {
-			j = HZ;
-			jiffies_till_first_fqs = HZ;
-		}
-		ret = 0;
+		timeout = normalize_jiffies_till_first_fqs();
+		rsp->jiffies_force_qs = jiffies + timeout;
 		for (;;) {
-			if (!ret)
-				rsp->jiffies_force_qs = jiffies + j;
 			trace_rcu_grace_period(rsp->name,
 					       READ_ONCE(rsp->gpnum),
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
-			ret = wait_event_interruptible_timeout(rsp->gp_wq,
-					rcu_gp_fqs_check_wake(rsp, &gf), j);
+			wait_event_interruptible_timeout(rsp->gp_wq,
+					rcu_gp_fqs_check_wake(rsp, &gf),
+					timeout);
 			rsp->gp_state = RCU_GP_DOING_FQS;
+try_again:
 			/* Locking provides needed memory barriers. */
 			/* If grace period done, leave loop. */
 			if (!READ_ONCE(rnp->qsmask) &&
@@ -2099,28 +2127,29 @@ static int __noreturn rcu_gp_kthread(void *arg)
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsstart"));
 				rcu_gp_fqs(rsp);
+				timeout = normalize_jiffies_till_next_fqs();
+				rsp->jiffies_force_qs = jiffies + timeout;
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqsend"));
-				cond_resched_rcu_qs();
-				WRITE_ONCE(rsp->gp_activity, jiffies);
 			} else {
 				/* Deal with stray signal. */
-				cond_resched_rcu_qs();
-				WRITE_ONCE(rsp->gp_activity, jiffies);
 				WARN_ON(signal_pending(current));
 				trace_rcu_grace_period(rsp->name,
 						       READ_ONCE(rsp->gpnum),
 						       TPS("fqswaitsig"));
 			}
-			j = jiffies_till_next_fqs;
-			if (j > HZ) {
-				j = HZ;
-				jiffies_till_next_fqs = HZ;
-			} else if (j < 1) {
-				j = 1;
-				jiffies_till_next_fqs = 1;
-			}
+			cond_resched_rcu_qs();
+			WRITE_ONCE(rsp->gp_activity, jiffies);
+			/*
+			 * Count the remaining timeout when it was a spurious
+			 * wakeup. Well, it is useful also when we have slept
+			 * in the cond_resched().
+			 */
+			j = jiffies;
+			if (ULONG_CMP_GE(j, rsp->jiffies_force_qs))
+				goto try_again;
+			timeout = rsp->jiffies_force_qs - j;
 		}
 
 		/* Handle grace-period end. */
-- 
1.8.5.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>



[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Asterisk Internet PBX]     [Linux API]