On Thu, 8 Aug 2024 at 02:57, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Careful vs. the pr_once(). It's not necessarily the first allocation
> which trips up. I removed slab_err() in that condition and just printed
> the data:
>
> [    0.000000] Order: 1 Size:  384 Nobj: 21 Maxobj: 16 21 Inuse: 14
> [    0.000000] Order: 0 Size:  168 Nobj: 24 Maxobj: 16 24 Inuse:  1
> [    0.000000] Order: 1 Size:  320 Nobj: 25 Maxobj: 16 25 Inuse: 18
> [    0.000000] Order: 1 Size:  320 Nobj: 25 Maxobj: 16 25 Inuse: 19
> [    0.000000] Order: 1 Size:  320 Nobj: 25 Maxobj: 16 25 Inuse: 20
> [    0.000000] Order: 0 Size:  160 Nobj: 25 Maxobj: 16 25 Inuse:  5
> [    0.000000] Order: 2 Size:  672 Nobj: 24 Maxobj: 16 24 Inuse:  1
> [    0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse:  1
> [    0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse:  2
> [    0.000000] Order: 3 Size: 1536 Nobj: 21 Maxobj: 16 21 Inuse: 10
>
> The maxobj column shows the failed result and the result from the second
> invocation inside of the printk().

Hmm. There's a few patterns there:

 - the incorrect Maxobj is always 16, with wildly different sizes.

 - the correct value is always in that 21-25 range

and neither of these are particularly common cases for slab objects
(well, at least on x86-64).

I actually went into the gcc sources to look at the libgcc routines
for the hppa $$divU routine, but apart from checking for trivial
powers-of-two and for divisions with small divisor values (<=17), all
it ends up being is a series of "ds" (divide step) and "addc"
instructions. I don't see how that could possibly mess up. It does end
up with the final addc in the delay slot of the return, but that's
normal parisc behavior (and here by "normal" I mean "it's a really
messed up instruction set that did everything wrong, including branch
delay slots").

I do note that the $$divU function (which is what this all should use)
oddly doesn't show up as defined in 'nm' for me when I look at
Guenter's vmlinux file.
So there's some odd linker thing going on, and it *only* affects the
$$div* functions.

Thomas' System.map shows some of the same effects, ie it shows $$divoI
(signed integer divide with overflow checking), but doesn't show
$$divU that is right after it.

The reason I was looking was exactly because this should be using
$$divU, and clearly code alignment is implicated somehow, but the
exact alignment of $$divU wasn't obvious.

But it looks like "$$divU" should be somewhere between $$divoI and
$$divI_2, and in Guenter's bad case that's

    0000000041218c70 T $$divoI
    00000000412190d0 T $$divI_2

so *maybe* $$divU is around a page boundary? 0000000041218xxx turning
into 0000000041219000?

Some ITLB fill issue together with that delayed branch and a qemu bug?

               Linus