On Tue, Sep 20, 2022 at 11:46:45AM +0200, Frederic Weisbecker wrote:
> On Tue, Sep 20, 2022 at 03:26:28PM +0800, Pingfan Liu wrote:
> > On Fri, Sep 16, 2022 at 03:42:58PM +0200, Frederic Weisbecker wrote:
> > > Note this is only locking the rdp's node, not the root node.
> > > Therefore if CPU 0 and CPU 256 are going off at the same time and they
> > > don't belong to the same node, the above won't protect against concurrent
> > > TICK_DEP_BIT_RCU set/clear.
> > >
> >
> > Nice, thanks for the careful thoughts. How about moving the counting
> > place to the root node?
>
> You could but then you'd need to lock the root node.
>
> > > My suspicion is that we don't need this TICK_DEP_BIT_RCU tick dependency
> > > anymore. I believe it was there because of issues that were fixed with:
> > >
> > >     53e87e3cdc15 (timers/nohz: Last resort update jiffies on nohz_full IRQ entry)
> > > and:
> > >
> > >     a1ff03cd6fb9 (tick: Detect and fix jiffies update stall)
> > >
> > > It's unfortunately just suspicion because the reason for that tick dependency
> > > is unclear but I believe it should be safe to remove now.
> > >
> >
> > I have gone through this tick dependency again, but got less.
> >
> > I think at least from the RCU's viewpoint, it is useless since
> > multi_cpu_stop()->rcu_momentary_dyntick_idle() has eliminate the
> > requirement for tick interrupt.
>
> Partly yes.
>
> > Is there a way to have a convincing test so that these code can be removed?
> > Or this code will be got along with?
>
> Hmm, Paul might remember which rcutorture scenario would trigger it?

TREE04 on multisocket systems, preferably with faster CPU-hotplug
operations.  This can be accomplished by adding this to the kvm.sh
command line:

	rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30

It does take some time to run.  I did 4,000 hours worth of TREE04
to confirm lack of bug.  But an 80-CPU dual-socket system can run
10 concurrent instances of TREE04, which gets things down to a more
manageable 400 hours.  Please let me know if you don't have access
to a few such systems.

I will let Frederic identify which commit(s) should be reverted in
order to test the test.

							Thanx, Paul
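
For reference, the TREE04 recipe above might be spelled out roughly as
follows.  This is only a sketch: it prints a candidate kvm.sh command
rather than running it, and the script path, the --cpus/--duration/
--configs/--bootargs options, and especially the 40-hour batch length
are assumptions that do not come from this thread and should be checked
against the tree being tested.  The last line just redoes the
4,000-hours-over-10-instances arithmetic.

```shell
#!/bin/sh
# Sketch only: build and print a kvm.sh command line, do not execute it.
# Assumed (not stated in the thread): script path, option names, batch length.
KVM=tools/testing/selftests/rcutorture/bin/kvm.sh
CPUS=80                      # the 80-CPU dual-socket system mentioned above
INSTANCES=10                 # concurrent TREE04 instances mentioned above
DURATION_MIN=$((40 * 60))    # hypothetical 40-hour batch, in minutes

# Faster CPU-hotplug operations, as given in the message.
BOOTARGS="rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30"

CMD="$KVM --cpus $CPUS --duration $DURATION_MIN --configs \"${INSTANCES}*TREE04\" --bootargs \"$BOOTARGS\""
echo "$CMD"

# Wall-clock cost of 4,000 instance-hours spread over concurrent instances.
TOTAL_HOURS=4000
WALL_HOURS=$((TOTAL_HOURS / INSTANCES))
echo "wall-clock hours: $WALL_HOURS"
```

Splitting the 400 wall-clock hours into shorter batches across several
such systems would shorten the calendar time further, at the cost of
collecting results from each batch separately.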