Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

On 8/6/24 10:49, Linus Torvalds wrote:
[ Adding s390 people, this is strange ]


Did I get lost somewhere? I am seeing this with parisc (64-bit), not s390.

Thanks,
Guenter

New people, see

   https://lore.kernel.org/all/CAHk-=wjmumbT73xLkSAnnxDwaFE__Ny=QCp6B_LE2aG1SUqiTg@xxxxxxxxxxxxxx/

for context. There's a heisenbug that depends on random code layout
issues on s390.

On Tue, 6 Aug 2024 at 10:34, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

Hmm. Do we have some alignment confusion?

The alignment rules for 192 are to align it to 64-byte boundaries
(because that's the largest power of two that divides it), and that
means it stays at 192, and that would give 21 objects per 4kB page.

But if we use the "align up to next power of two", you get 256 bytes,
and 16 objects per page.

And that 21-vs-16 confusion would seem to match this pretty well:

   [    0.000000] BUG kmem_cache_node (Not tainted): objects 21 > max 16

which makes me wonder...

I'd suspect commit ad59baa31695 ("slab, rust: extend kmalloc()
alignment guarantees to remove Rust padding"), perhaps with some odd
s390 code generation issue for 'ffs()'.

IOW, this new code in mm/slab_common.c

         if (flags & SLAB_KMALLOC)
                 align = max(align, 1U << (ffs(size) - 1));

might not match some other alignment code.
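
To make the 21-vs-16 arithmetic concrete, here is a minimal user-space
sketch of the two alignment rules. It is not the kernel code: the helper
names and the 4 kB PAGE_SIZE_BYTES constant are made up for illustration,
__builtin_ffs() assumes GCC or Clang, and the per-page count ignores any
slab metadata.

```c
#define PAGE_SIZE_BYTES 4096u   /* assumption: 4 kB pages */

/* Largest power of two that divides size (size must be non-zero).
 * Mirrors the "1U << (ffs(size) - 1)" expression quoted above. */
static unsigned int pow2_divisor_align(unsigned int size)
{
	return 1u << (__builtin_ffs((int)size) - 1);
}

/* Round size up to a multiple of align (align must be a power of two). */
static unsigned int align_up(unsigned int size, unsigned int align)
{
	return (size + align - 1) & ~(align - 1);
}

/* The other rule: round size up to the next power of two. */
static unsigned int roundup_pow2(unsigned int size)
{
	unsigned int p = 1;

	while (p < size)
		p <<= 1;
	return p;
}

/* Naive objects-per-page count, ignoring slab metadata. */
static unsigned int objects_per_page(unsigned int obj_size)
{
	return PAGE_SIZE_BYTES / obj_size;
}
```

For size 192, pow2_divisor_align() gives 64, align_up(192, 64) stays at
192, and objects_per_page(192) is 21. With roundup_pow2(192) the object
becomes 256 bytes and objects_per_page(256) is 16. If two code paths
disagreed on which rule applies, the counts would differ in just this way.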

Or maybe it's the s390 ffs().

It looks like

   static inline int ffs(int word)
   {
         unsigned long mask = 2 * BITS_PER_LONG - 1;
         unsigned int val = (unsigned int)word;

         return (1 + (__flogr(-val & val) ^ (BITS_PER_LONG - 1))) & mask;
   }

where s390 has this very odd "flogr" instruction ("find last one G
register"?) for the non-constant case.
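
For anyone who wants to check the arithmetic of that ffs() off-target,
here is a rough portable model. It assumes flogr yields the number of
leading zero bits of a 64-bit value and 64 for a zero input, which is what
the masking above appears to rely on. flogr_model() and ffs_model() are
hypothetical names, not kernel functions, and __builtin_clzll() assumes
GCC or Clang.

```c
#define BITS_PER_LONG 64   /* assumption: 64-bit long, as on s390x */

/* Hypothetical stand-in for __flogr(): count of leading zero bits in a
 * 64-bit word, with 64 returned for a zero input. */
static unsigned char flogr_model(unsigned long long word)
{
	if (word == 0)
		return 64;
	return (unsigned char)__builtin_clzll(word);
}

/* Same expression as the quoted ffs(), built on the model above. */
static int ffs_model(int word)
{
	unsigned long mask = 2 * BITS_PER_LONG - 1;
	unsigned int val = (unsigned int)word;

	/* -val & val isolates the lowest set bit; XOR with 63 turns the
	 * leading-zero count into a bit index; the mask folds 128 to 0. */
	return (1 + (flogr_model(-val & val) ^ (BITS_PER_LONG - 1))) & mask;
}
```

Under those assumptions ffs_model(192) is 7, so 1U << (ffs - 1) is 64,
and ffs_model(0) is 0. The expression itself looks fine on paper, which
leaves the question of what the compiler does with the real instruction.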

That uses a "union register_pair" but only ever uses the "even"
register without ever using the full 128-bit part or the odd register.
So the other register in the register pair is uninitialized.

Does that cause random compiler issues based on register allocation?

Just for fun, does something like this make any difference?

   --- a/arch/s390/include/asm/bitops.h
   +++ b/arch/s390/include/asm/bitops.h
   @@ -305,6 +305,7 @@ static inline unsigned char __flogr(unsigned long word)
                 union register_pair rp;

                 rp.even = word;
   +             rp.odd = 0;
                 asm volatile(
                         "       flogr   %[rp],%[rp]\n"
                         : [rp] "+d" (rp.pair) : : "cc");


Thomas notices that the special "div by constant" routines moved
around, and I'm not seeing how *that* would matter, but it's all
obviously very strange.

               Linus




