On Mon, Jul 05, 2021 at 08:56:54PM +0100, Mark Rutland wrote: > On Mon, Jul 05, 2021 at 06:16:49PM +0300, Anatoly Pugachev wrote: > > Hello! > > Hi Anatoly, > > > latest sparc64 git kernel produces the following OOPS on running stress-ng as : > > > > $ stress-ng -v --mmap 1 -t 30s > > > > kernel OOPS (console logs): > > > > [ 27.276719] Unable to handle kernel NULL pointer dereference > > [ 27.276782] tsk->{mm,active_mm}->context = 00000000000003cb > > [ 27.276818] tsk->{mm,active_mm}->pgd = fff800003a2a0000 > > [ 27.276853] \|/ ____ \|/ > > [ 27.276853] "@'/ .. \`@" > > [ 27.276853] /_| \__/ |_\ > > [ 27.276853] \__U_/ > > [ 27.276927] stress-ng(928): Oops [#1] > > I can reproduce this under QEMU; following your bisection (and working > around the missing ifdeferry that breaks bisection), I can confirm that > the first broken commit is: > > ff5b4f1ed580 ("locking/atomic: sparc: move to ARCH_ATOMIC") > > Sorry about this. > > > Can someone please look at this commit ids? > > From digging into this, I can't spot an obvious bug in the commit above. Looking again with fresh eyes, there is a trivial bug after all. Could you give the patch below a spin? It works for me locally under QEMU. Sorry again about this! Thanks, Mark ---->8---- >From afb683b2ce749dca426d27f05af3ea08455a52d7 Mon Sep 17 00:00:00 2001 From: Mark Rutland <mark.rutland@xxxxxxx> Date: Tue, 6 Jul 2021 09:55:56 +0100 Subject: [PATCH] locking/atomic: sparc: fix arch_cmpxchg64_local() Anatoly reports that since commit: ff5b4f1ed580c59d ("locking/atomic: sparc: move to ARCH_ATOMIC") ... it's possible to reliably trigger an oops by running: stress-ng -v --mmap 1 -t 30s ... which results in a NULL pointer dereference in __split_huge_pmd_locked(). The underlying problem is that commit ff5b4f1ed580c59d left arch_cmpxchg64_local() defined in terms of cmpxchg_local() rather than arch_cmpxchg_local(). In <asm-generic/atomic-instrumented.h> we wrap these with macros which use identically-named variables. When cmpxchg_local() nests inside cmpxchg64_local(), this casues it to use an unitialized variable as the pointer, which can be NULL. This can also be seen in pmdp_establish(), where the compiler can generate the pointer with a `clr` instruction: 0000000000000360 <pmdp_establish>: 360: 9d e3 bf 50 save %sp, -176, %sp 364: fa 5e 80 00 ldx [ %i2 ], %i5 368: 82 10 00 1b mov %i3, %g1 36c: 84 10 20 00 clr %g2 370: c3 f0 90 1d casx [ %g2 ], %i5, %g1 374: 80 a7 40 01 cmp %i5, %g1 378: 32 6f ff fc bne,a %xcc, 368 <pmdp_establish+0x8> 37c: fa 5e 80 00 ldx [ %i2 ], %i5 380: d0 5e 20 40 ldx [ %i0 + 0x40 ], %o0 384: 96 10 00 1b mov %i3, %o3 388: 94 10 00 1d mov %i5, %o2 38c: 92 10 00 19 mov %i1, %o1 390: 7f ff ff 84 call 1a0 <__set_pmd_acct> 394: b0 10 00 1d mov %i5, %i0 398: 81 cf e0 08 return %i7 + 8 39c: 01 00 00 00 nop This patch fixes the problem by defining arch_cmpxchg64_local() in terms of arch_cmpxchg_local(), avoiding potential shadowing, and resulting in working cmpxchg64_local() and variants, e.g. 0000000000000360 <pmdp_establish>: 360: 9d e3 bf 50 save %sp, -176, %sp 364: fa 5e 80 00 ldx [ %i2 ], %i5 368: 82 10 00 1b mov %i3, %g1 36c: c3 f6 90 1d casx [ %i2 ], %i5, %g1 370: 80 a7 40 01 cmp %i5, %g1 374: 32 6f ff fd bne,a %xcc, 368 <pmdp_establish+0x8> 378: fa 5e 80 00 ldx [ %i2 ], %i5 37c: d0 5e 20 40 ldx [ %i0 + 0x40 ], %o0 380: 96 10 00 1b mov %i3, %o3 384: 94 10 00 1d mov %i5, %o2 388: 92 10 00 19 mov %i1, %o1 38c: 7f ff ff 85 call 1a0 <__set_pmd_acct> 390: b0 10 00 1d mov %i5, %i0 394: 81 cf e0 08 return %i7 + 8 398: 01 00 00 00 nop 39c: 01 00 00 00 nop Fixes: ff5b4f1ed580c59d ("locking/atomic: sparc: move to ARCH_ATOMIC") Signed-off-by: Mark Rutland <mark.rutland@xxxxxxx> Reported-by: Anatoly Pugachev <matorola@xxxxxxxxx> Cc: "David S. Miller" <davem@xxxxxxxxxxxxx> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxxxxxxxx> --- arch/sparc/include/asm/cmpxchg_64.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/sparc/include/asm/cmpxchg_64.h b/arch/sparc/include/asm/cmpxchg_64.h index 8c39a9981187..12d00a42c0a3 100644 --- a/arch/sparc/include/asm/cmpxchg_64.h +++ b/arch/sparc/include/asm/cmpxchg_64.h @@ -201,7 +201,7 @@ static inline unsigned long __cmpxchg_local(volatile void *ptr, #define arch_cmpxchg64_local(ptr, o, n) \ ({ \ BUILD_BUG_ON(sizeof(*(ptr)) != 8); \ - cmpxchg_local((ptr), (o), (n)); \ + arch_cmpxchg_local((ptr), (o), (n)); \ }) #define arch_cmpxchg64(ptr, o, n) arch_cmpxchg64_local((ptr), (o), (n)) -- 2.11.0