On Wed, Sep 07, 2022 at 09:40:29AM +0800, Pingfan Liu wrote:
> On Tue, Sep 06, 2022 at 10:24:41AM -0700, Paul E. McKenney wrote:
> > On Mon, Sep 05, 2022 at 11:38:50AM +0800, Pingfan Liu wrote:
> > > At present, during the cpu teardown, rcu_boost_kthread_setaffinity() is
> > > called twice. Firstly by rcutree_offline_cpu(), then by
> > > rcutree_dead_cpu() as the CPUHP_RCUTREE_PREP cpuhp_step callback.
> > >
> > > From the scheduler's perspective, a bit in cpu_online_mask means that the cpu
> > > is visible to the scheduler. Furthermore, a bit in cpu_active_mask
> > > means that the cpu is suitable as a migration destination.
> > >
> > > Now turning back to the case in rcu offlining. sched_cpu_deactivate()
> > > has disabled the dying cpu as the migration destination before
> > > rcutree_offline_cpu(). Furthermore, if the boost kthread is on the dying
> > > cpu, it will be migrated to another suitable online cpu by the scheduler.
> > > So the affinity setting by rcutree_offline_cpu() is redundant and can be
> > > eliminated.
> > >
> > > Besides, this patch does a trivial code rearrangement by unfolding
> > > rcutree_affinity_setting() into rcutree_online_cpu(), considering that
> > > the latter one is the only user of the former.
> > >
> > > Signed-off-by: Pingfan Liu <kernelfans@xxxxxxxxx>
> > > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > > Cc: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > > Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > > Cc: Neeraj Upadhyay <quic_neeraju@xxxxxxxxxxx>
> > > Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > > Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> > > Cc: Lai Jiangshan <jiangshanlai@xxxxxxxxx>
> > > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > > Cc: "Jason A. Donenfeld" <Jason@xxxxxxxxx>
> > > To: rcu@xxxxxxxxxxxxxxx
> > > ---
> > >  kernel/rcu/tree.c | 14 +-------------
> > >  1 file changed, 1 insertion(+), 13 deletions(-)
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 79aea7df4345..b90f6487fd45 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -3978,16 +3978,6 @@ int rcutree_prepare_cpu(unsigned int cpu)
> > >  	return 0;
> > >  }
> > >
> > > -/*
> > > - * Update RCU priority boot kthread affinity for CPU-hotplug changes.
> > > - */
> > > -static void rcutree_affinity_setting(unsigned int cpu, int outgoing)
> > > -{
> > > -	struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
> > > -
> > > -	rcu_boost_kthread_setaffinity(rdp->mynode, outgoing);
> > > -}
> >
> > Good point, given how simple a wrapper this is and how little it is used,
> > getting rid of it does sound like a reasonable idea.
> >
> > >  /*
> > >   * Near the end of the CPU-online process. Pretty much all services
> > >   * enabled, and the CPU is now very much alive.
> > > @@ -4006,7 +3996,7 @@ int rcutree_online_cpu(unsigned int cpu)
> > >  	if (rcu_scheduler_active == RCU_SCHEDULER_INACTIVE)
> > >  		return 0; /* Too early in boot for scheduler work. */
> > >  	sync_sched_exp_online_cleanup(cpu);
> > > -	rcutree_affinity_setting(cpu, -1);
> > > +	rcu_boost_kthread_setaffinity(rdp->mynode, -1);
> > >
> > >  	// Stop-machine done, so allow nohz_full to disable tick.
> > >  	tick_dep_clear(TICK_DEP_BIT_RCU);
> > > @@ -4029,8 +4019,6 @@ int rcutree_offline_cpu(unsigned int cpu)
> > >  	rnp->ffmask &= ~rdp->grpmask;
> > >  	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> > >
> > > -	rcutree_affinity_setting(cpu, cpu);
> >
> > We do need to keep this one because the CPU is going away.
> >
> > On the other hand, it might well be that we could get rid of the call
> > to rcutree_affinity_setting() in rcutree_dead_cpu().
> >
> > Or am I missing something subtle here?
>
> Oops, I think I need to rephrase my commit log to describe this nuance.
> The key point is whether ->qsmaskinitnext is stable.
>
> The teardown code path on a single dying cpu looks like:
>
> sched_cpu_deactivate() // prevent this dying cpu as a migration dst.

Suppose that this was the last CPU that the task was permitted to run on.

> rcutree_offline_cpu()  // as a result, the scheduler core will take care
>                        // of the transient affinity mismatching until
>                        // rcutree_dead_cpu(). (I think it also stands in
>                        // the concurrent offlining)
>
> rcu_report_dead()      // running on the dying cpu, and clear its bit in ->qsmaskinitnext
>
> rcutree_dead_cpu()     // running on the initiator (an initiator cpu will
>                        // execute this function for each dying cpu)
>                        // At this point, ->qsmaskinitnext reflects the
>                        // offlining, and the affinity can be set correctly.
>
> Sorry that my commit log emphasized the first part, but forgot to
> mention the ->qsmaskinitnext.
>
>
> Does this justification stand?

We should ensure that the task's permitted set of CPUs always contains at
least one online CPU. Unless I am missing something, your suggested change
will sometimes end up with the task having no online CPUs in its mask.

So what am I missing here?

							Thanx, Paul
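
To make the concern above concrete, here is a small userspace model in C.
It is only a sketch and not kernel code: cpu_mask_t, model_boost_affinity()
and model_has_runnable_cpu() are made-up names, each CPU is one bit in a
32-bit word, and the fallback to a wider mask when the computed set comes
out empty is an assumption of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model, not kernel code: bit N stands for CPU N. */
typedef uint32_t cpu_mask_t;

/*
 * Model of recomputing a kthread's permitted CPUs from a leaf node's
 * online bits (standing in for ->qsmaskinitnext), leaving out the
 * outgoing CPU.  Pass outgoing = -1 to exclude nothing.  If no CPU is
 * left, use the caller-supplied fallback set (a model assumption).
 */
static cpu_mask_t model_boost_affinity(cpu_mask_t node_online, int outgoing,
				       cpu_mask_t fallback)
{
	cpu_mask_t allowed = node_online;

	if (outgoing >= 0 && outgoing < 32)
		allowed &= ~((cpu_mask_t)1 << outgoing);
	if (allowed == 0)
		allowed = fallback;
	return allowed;
}

/* True if the task is permitted on at least one still-usable CPU. */
static bool model_has_runnable_cpu(cpu_mask_t allowed, cpu_mask_t usable)
{
	return (allowed & usable) != 0;
}
```

Take a leaf node covering CPUs 0 and 1, with CPU 0 already offline and CPU 1
now being deactivated, so that only CPUs 2 and 3 (mask 0xC) remain usable.
A mask left stale at 0x2 has no usable CPU in it, whereas
model_boost_affinity(0x2, 1, 0xC) ends up with 0xC in this model.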