On Wed, Dec 18, 2024 at 05:50:05PM +0100, Frederic Weisbecker wrote:
> 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
> was introduced to fix stalls with scheduler bandwidth timers getting
> migrated while some kthreads handling CPU hotplug rely on bandwidth.
>
> However this has introduced several other issues which used to be
> confined to RCU. But not anymore as it is spreading to hotplug code
> itself (https://lore.kernel.org/all/20241213203739.1519801-1-usamaarif642@xxxxxxxxx/)
>
> Instead of introducing yet another new hackery, fix the problem in
> hrtimers for everyone.

The good news is that this passes 12 hours of 400*TREE03.  (Yay!!!)

The so-so news is that this gives only about 70% confidence that these
patches help, but on the other hand, it also gives much higher confidence
that these patches are not hurting anything.  At least for TREE03.

The not-so-good news is that this series causes build failures for
rcutorture scenarios (such as SRCU-T) that build with CONFIG_SMP=n:

------------------------------------------------------------------------

kernel/time/hrtimer.c: In function ‘enqueue_hrtimer_offline’:
kernel/time/hrtimer.c:1229:42: error: ‘migration_base’ undeclared (first use in this function); did you mean ‘is_migration_base’?

------------------------------------------------------------------------

When built with KCSAN enabled (--kcsan to kvm.sh), there is this additional
build failure on that same line of code:

------------------------------------------------------------------------

kernel/time/hrtimer.c:1229:3: error: incompatible pointer types assigning to 'volatile typeof (timer->base)' (aka 'struct hrtimer_clock_base *volatile') from 'bool (*)(struct hrtimer_clock_base *)' (aka '_Bool (*)(struct hrtimer_clock_base *)') [-Werror,-Wincompatible-pointer-types]
 1229 |                 WRITE_ONCE(timer->base, &migration_base);

------------------------------------------------------------------------

Me, I am a bit surprised that enqueue_hrtimer_offline() is even built in
a CONFIG_SMP=n kernel.  But there might be some reason why #ifdef-ing
out that function's body would be a bad idea, so over to you!  ;-)

							Thanx, Paul

> Frederic Weisbecker (3):
>   hrtimers: Force migrate away hrtimers queued after
>     CPUHP_AP_HRTIMERS_DYING
>   rcu: Remove swake_up_one_online() bandaid
>   Revert "rcu/nocb: Fix rcuog wake-up from offline softirq"
>
>  include/linux/hrtimer_defs.h |  1 +
>  kernel/rcu/tree.c            | 34 +-------------------
>  kernel/rcu/tree_exp.h        |  2 +-
>  kernel/rcu/tree_nocb.h       | 10 ++----
>  kernel/time/hrtimer.c        | 60 +++++++++++++++++++++++++++++++-----
>  5 files changed, 58 insertions(+), 49 deletions(-)
>
> --
> 2.46.0
>
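
For readers following along, here is a minimal standalone sketch of the
general pattern behind the CONFIG_SMP=n failure reported above.  It is
not kernel code and not the fix for this series.  Every name in it
(clock_base, timer, local_base, migration_base_sketch,
enqueue_offline_sketch, demo_enqueue) is a hypothetical stand-in.  It
assumes only what the error output suggests: that the symbol in question
exists solely in SMP builds, so any function that references it needs
either the same guard or a UP stub.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real kernel structures. */
struct clock_base { int index; };
struct timer { struct clock_base *base; };

static struct clock_base local_base = { .index = 0 };

#ifdef CONFIG_SMP
/* Assumption: like the symbol in the error, this exists only on SMP. */
static struct clock_base migration_base_sketch = { .index = -1 };

static void enqueue_offline_sketch(struct timer *t)
{
	/* The SMP path may reference the SMP-only symbol. */
	t->base = &migration_base_sketch;
}
#else
/* UP stub: compiles without ever naming the SMP-only symbol. */
static void enqueue_offline_sketch(struct timer *t)
{
	t->base = &local_base;
}
#endif

/* Returns the index of the base that the timer ended up on. */
static int demo_enqueue(void)
{
	struct timer t = { .base = &local_base };

	enqueue_offline_sketch(&t);
	assert(t.base != NULL);
	return t.base->index;
}
```

Built without -DCONFIG_SMP, demo_enqueue() goes through the stub and
leaves the timer on local_base.  Built with -DCONFIG_SMP, it lands on
migration_base_sketch instead.  Whether a stub or an #ifdef around the
caller is the better choice for the real function is a question this
sketch does not try to answer.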