On Tue, Jul 30, 2024 at 08:10:33PM -0700, Darrick J. Wong wrote:
> On Tue, Jul 30, 2024 at 05:19:50PM -0700, Darrick J. Wong wrote:
> > On Tue, Jul 30, 2024 at 03:26:26PM +0200, Peter Zijlstra wrote:
> > > On Tue, Jul 30, 2024 at 01:00:02PM +0530, Chandan Babu R wrote:
> > > > On Mon, Jul 29, 2024 at 08:38:49 PM -0700, Darrick J. Wong wrote:
> > > > > Hi everyone,
> > > > >
> > > > > I got the following splat on 6.11-rc1 when I tried to QA xfs online
> > > > > fsck.  Does this ring a bell for anyone?  I'll try bisecting in the
> > > > > morning to see if I can find the culprit.
> > > >
> > > > xfs/566 on v6.11-rc1 would consistently cause the oops mentioned below.
> > > > However, I was able to get xfs/566 to successfully execute for five times on a
> > > > v6.11-rc1 kernel with the following commits reverted,
> > > >
> > > > 83ab38ef0a0b2407d43af9575bb32333fdd74fb2
> > > > 695ef796467ed228b60f1915995e390aea3d85c6
> > > > 9bc2ff871f00437ad2f10c1eceff51aaa72b478f
> > > >
> > > > Reinstating commit 83ab38ef0a0b2407d43af9575bb32333fdd74fb2 causes the kernel
> > > > to oops once again.
> > >
> > > Durr, does this help?
> >
> > Yes, it does!  After ~8, a full fstests run completes without incident.
> >
> > (vs. before where it would blow up within 2 minutes)
> >
> > Thanks for the fix; you can add
> > Tested-by: Darrick J. Wong <djwong@xxxxxxxxxx>
>
> Ofc as soon as this I push it to the whole fleet then things start
> failing again. :(

Sooooo... it turns out that somehow your patch got mismerged on the
first go-round, and that worked.  The second time, there was no
mismerge, which means that the wrong atomic_cmpxchg() callsite was
tested.

Looking back at the mismerge, it actually changed
__static_key_slow_dec_cpuslocked, which had in 6.10:

	if (atomic_dec_and_test(&key->enabled))
		jump_label_update(key);

Decrement, then return true if the value was set to zero.  With the 6.11
code, it looks like we want to exchange a 1 with a 0, and act only if
the previous value had been 1.

So perhaps we really want this change?  I'll send it out to the fleet
and we'll see what it reports tomorrow morning.

--D

diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 4ad5ed8adf96..5f80c128e90e 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -289,7 +289,7 @@ static void __static_key_slow_dec_cpuslocked(struct static_key *key)
 		return;
 
 	guard(mutex)(&jump_label_mutex);
-	if (atomic_cmpxchg(&key->enabled, 1, 0))
+	if (atomic_cmpxchg(&key->enabled, 1, 0) == 1)
 		jump_label_update(key);
 	else
 		WARN_ON_ONCE(!static_key_slow_try_dec(key));
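
For anyone following along, here is a rough userspace sketch of why the
bare truthiness test and the "== 1" test can disagree.  It leans on the
fact that atomic_cmpxchg() returns the value it found, whether or not the
swap happened.  Everything here is an assumption made for illustration:
model_cmpxchg() is a hypothetical C11 stand-in for the kernel primitive,
and the two dec_*_would_update() helpers only model the if-condition, not
the real function.  Which non-1 values can actually reach this callsite in
the kernel is not something this sketch establishes.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of atomic_cmpxchg(): if *v == oldval, store
 * newval.  Either way, return the value that was found in *v.
 */
static int model_cmpxchg(atomic_int *v, int oldval, int newval)
{
	int expected = oldval;

	/* On failure, C11 writes the current value of *v into 'expected'. */
	atomic_compare_exchange_strong(v, &expected, newval);
	return expected;
}

/* Models the pre-patch condition: true for ANY nonzero previous value. */
static bool dec_buggy_would_update(atomic_int *enabled)
{
	return model_cmpxchg(enabled, 1, 0) != 0;
}

/* Models the patched condition: true only if the 1 -> 0 swap happened. */
static bool dec_fixed_would_update(atomic_int *enabled)
{
	return model_cmpxchg(enabled, 1, 0) == 1;
}
```

With a counter that holds something other than 0 or 1 (say 2), the first
helper reports "update" even though nothing was swapped, while the second
falls through to the else branch.  With the counter at exactly 1, both
agree and the counter ends up at 0.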