Patch "sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()" has been added to the 6.4-stable tree

Sasha Levin <sashal@xxxxxxxxxx> · Sun, 9 Jul 2023 01:03:16 -0400

This is a note to let you know that I've just added the patch titled

    sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()

to the 6.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     sched-core-avoid-multiple-calling-update_rq_clock-in.patch
and it can be found in the queue-6.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit f1610ca683018c1384c428597bf20567ef7e4404
Author: Hao Jia <jiahao.os@xxxxxxxxxxxxx>
Date:   Tue Jun 13 16:20:11 2023 +0800

    sched/core: Avoid multiple calling update_rq_clock() in __cfsb_csd_unthrottle()
    
    [ Upstream commit ebb83d84e49b54369b0db67136a5fe1087124dcc ]
    
    After commit 8ad075c2eb1f ("sched: Async unthrottling for cfs
    bandwidth"), we may update the rq clock multiple times in the loop of
    __cfsb_csd_unthrottle().
    
    A prior (although less common) instance of this problem exists in
    unthrottle_offline_cfs_rqs().
    
    Cure both by ensuring update_rq_clock() is called before the loop and
    setting RQCF_ACT_SKIP during the loop, to suppress further updates.
    The alternative would be pulling update_rq_clock() out of
    unthrottle_cfs_rq(), but that gives an even bigger mess.
    
    Fixes: 8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth")
    Reviewed-By: Ben Segall <bsegall@xxxxxxxxxx>
    Suggested-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
    Signed-off-by: Hao Jia <jiahao.os@xxxxxxxxxxxxx>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
    Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
    Link: https://lkml.kernel.org/r/20230613082012.49615-4-jiahao.os@xxxxxxxxxxxxx
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
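
For readers unfamiliar with the mechanism, the commit message above can be
illustrated with a minimal userspace sketch. It is not kernel code: every
toy_* name, the toy_rq structure and the update counter are hypothetical
stand-ins, and the flag value is chosen only for illustration. The sketch
assumes, as the commit message implies, that update_rq_clock() does nothing
while RQCF_ACT_SKIP is set.

```c
#include <assert.h>

/* Illustrative value only; the real definition lives in kernel/sched/sched.h. */
#define RQCF_ACT_SKIP 0x02U

/* Hypothetical stand-in for struct rq, reduced to what the sketch needs. */
struct toy_rq {
	unsigned int clock_update_flags;
	unsigned int clock_updates;	/* how many times the clock was really refreshed */
};

/* Stand-in for update_rq_clock(): a no-op while the skip flag is set. */
static void toy_update_rq_clock(struct toy_rq *rq)
{
	if (rq->clock_update_flags & RQCF_ACT_SKIP)
		return;
	rq->clock_updates++;
}

/* Mirrors rq_clock_start_loop_update(): the flag must not already be set. */
static void toy_clock_start_loop_update(struct toy_rq *rq)
{
	assert(!(rq->clock_update_flags & RQCF_ACT_SKIP));
	rq->clock_update_flags |= RQCF_ACT_SKIP;
}

/* Mirrors rq_clock_stop_loop_update(): clear the flag after the loop. */
static void toy_clock_stop_loop_update(struct toy_rq *rq)
{
	rq->clock_update_flags &= ~RQCF_ACT_SKIP;
}

/* Stand-in for unthrottle_cfs_rq(), which updates the clock on its own. */
static void toy_unthrottle(struct toy_rq *rq)
{
	toy_update_rq_clock(rq);
}

/*
 * The pattern the patch applies: update once, suppress further updates
 * for the duration of the loop, then clear the flag again.
 * Returns the number of real clock updates performed so far.
 */
static unsigned int toy_unthrottle_all(struct toy_rq *rq, int nr_entries)
{
	toy_update_rq_clock(rq);
	toy_clock_start_loop_update(rq);

	for (int i = 0; i < nr_entries; i++)
		toy_unthrottle(rq);

	toy_clock_stop_loop_update(rq);
	return rq->clock_updates;
}
```

Calling toy_unthrottle_all() on a zero-initialised toy_rq with five entries
refreshes the clock once instead of six times, and leaves
clock_update_flags clear afterwards.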

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 373ff5f558844..4da5f35417626 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5576,6 +5576,14 @@ static void __cfsb_csd_unthrottle(void *arg)
 
 	rq_lock(rq, &rf);
 
+	/*
+	 * Iterating over the list can trigger several call to
+	 * update_rq_clock() in unthrottle_cfs_rq().
+	 * Do it once and skip the potential next ones.
+	 */
+	update_rq_clock(rq);
+	rq_clock_start_loop_update(rq);
+
 	/*
 	 * Since we hold rq lock we're safe from concurrent manipulation of
 	 * the CSD list. However, this RCU critical section annotates the
@@ -5595,6 +5603,7 @@ static void __cfsb_csd_unthrottle(void *arg)
 
 	rcu_read_unlock();
 
+	rq_clock_stop_loop_update(rq);
 	rq_unlock(rq, &rf);
 }
 
@@ -6115,6 +6124,13 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 
 	lockdep_assert_rq_held(rq);
 
+	/*
+	 * The rq clock has already been updated in the
+	 * set_rq_offline(), so we should skip updating
+	 * the rq clock again in unthrottle_cfs_rq().
+	 */
+	rq_clock_start_loop_update(rq);
+
 	rcu_read_lock();
 	list_for_each_entry_rcu(tg, &task_groups, list) {
 		struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
@@ -6137,6 +6153,8 @@ static void __maybe_unused unthrottle_offline_cfs_rqs(struct rq *rq)
 			unthrottle_cfs_rq(cfs_rq);
 	}
 	rcu_read_unlock();
+
+	rq_clock_stop_loop_update(rq);
 }
 
 #else /* CONFIG_CFS_BANDWIDTH */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ec7b3e0a2b207..81ac605b9cd5c 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1546,6 +1546,28 @@ static inline void rq_clock_cancel_skipupdate(struct rq *rq)
 	rq->clock_update_flags &= ~RQCF_REQ_SKIP;
 }
 
+/*
+ * During cpu offlining and rq wide unthrottling, we can trigger
+ * an update_rq_clock() for several cfs and rt runqueues (Typically
+ * when using list_for_each_entry_*)
+ * rq_clock_start_loop_update() can be called after updating the clock
+ * once and before iterating over the list to prevent multiple update.
+ * After the iterative traversal, we need to call rq_clock_stop_loop_update()
+ * to clear RQCF_ACT_SKIP of rq->clock_update_flags.
+ */
+static inline void rq_clock_start_loop_update(struct rq *rq)
+{
+	lockdep_assert_rq_held(rq);
+	SCHED_WARN_ON(rq->clock_update_flags & RQCF_ACT_SKIP);
+	rq->clock_update_flags |= RQCF_ACT_SKIP;
+}
+
+static inline void rq_clock_stop_loop_update(struct rq *rq)
+{
+	lockdep_assert_rq_held(rq);
+	rq->clock_update_flags &= ~RQCF_ACT_SKIP;
+}
+
 struct rq_flags {
 	unsigned long flags;
 	struct pin_cookie cookie;