On Sun, Nov 24, 2024 at 11:48:46PM +0100, Marco Elver wrote: > +Cc RCU > > On Sun, 24 Nov 2024 at 23:47, syzbot > <syzbot+16a19b06125a2963eaee@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote: > > > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: 42d9e8b7ccdd Merge tag 'powerpc-6.13-1' of git://git.kerne.. > > git tree: upstream > > console output: https://syzkaller.appspot.com/x/log.txt?x=10a00778580000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=3d7fd5be0e73b8b > > dashboard link: https://syzkaller.appspot.com/bug?extid=16a19b06125a2963eaee > > compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40 > > > > Unfortunately, I don't have any reproducer for this issue yet. > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/ef231513adc7/disk-42d9e8b7.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/54caaac5960b/vmlinux-42d9e8b7.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/85b5a6566143/bzImage-42d9e8b7.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+16a19b06125a2963eaee@xxxxxxxxxxxxxxxxxxxxxxxxx > > > > ================================================================== > > BUG: KCSAN: assert: race in srcu_get_delay kernel/rcu/srcutree.c:658 [inline] > > BUG: KCSAN: assert: race in srcu_funnel_gp_start kernel/rcu/srcutree.c:1089 [inline] > > BUG: KCSAN: assert: race in srcu_gp_start_if_needed+0x808/0x9f0 kernel/rcu/srcutree.c:1339 Hmmm... All of those are from slow paths, so locking looks to be the best approach. A very lightly tested prototype patch is shown below (for which feedback is most welcome), and thank you all for your testing efforts! Thanx, Paul ------------------------------------------------------------------------ commit a955c6a7168f7b204784e4ef7e4db9d017043f73 Author: Paul E. McKenney <paulmck@xxxxxxxxxx> Date: Fri Jan 3 17:04:49 2025 -0800 srcu: Force synchronization for srcu_get_delay() Currently, srcu_get_delay() can be called concurrently, for example, by a CPU that is the first to request a new grace period and the CPU processing the current grace period. Although concurrent access is harmless, it unnecessarily expands the state space. Additionally, all calls to srcu_get_delay() are from slow paths. This commit therefore protects all calls to srcu_get_delay() with ssp->srcu_sup->lock, which is already held on the invocation from the srcu_funnel_gp_start() function. While in the area, this commit also adds a lockdep_assert_held() to srcu_get_delay() itself. Reported-by: syzbot+16a19b06125a2963eaee@xxxxxxxxxxxxxxxxxxxxxxxxx Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 7c7304dee6457..a60acc9cf2f32 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -648,6 +648,7 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp) unsigned long jbase = SRCU_INTERVAL; struct srcu_usage *sup = ssp->srcu_sup; + lockdep_assert_held(&ACCESS_PRIVATE(ssp->srcu_sup, lock)); if (srcu_gp_is_expedited(ssp)) jbase = 0; if (rcu_seq_state(READ_ONCE(sup->srcu_gp_seq))) { @@ -675,9 +676,13 @@ static unsigned long srcu_get_delay(struct srcu_struct *ssp) void cleanup_srcu_struct(struct srcu_struct *ssp) { int cpu; + unsigned long delay; struct srcu_usage *sup = ssp->srcu_sup; - if (WARN_ON(!srcu_get_delay(ssp))) + spin_lock_irq_rcu_node(ssp->srcu_sup); + delay = srcu_get_delay(ssp); + spin_unlock_irq_rcu_node(ssp->srcu_sup); + if (WARN_ON(!delay)) return; /* Just leak it! */ if (WARN_ON(srcu_readers_active(ssp))) return; /* Just leak it! */ @@ -1100,7 +1105,9 @@ static bool try_check_zero(struct srcu_struct *ssp, int idx, int trycount) { unsigned long curdelay; + spin_lock_irq_rcu_node(ssp->srcu_sup); curdelay = !srcu_get_delay(ssp); + spin_unlock_irq_rcu_node(ssp->srcu_sup); for (;;) { if (srcu_readers_active_idx_check(ssp, idx)) @@ -1849,7 +1856,9 @@ static void process_srcu(struct work_struct *work) ssp = sup->srcu_ssp; srcu_advance_state(ssp); + spin_lock_irq_rcu_node(ssp->srcu_sup); curdelay = srcu_get_delay(ssp); + spin_unlock_irq_rcu_node(ssp->srcu_sup); if (curdelay) { WRITE_ONCE(sup->reschedule_count, 0); } else {