Re: [PATCH 6.10 000/809] 6.10.3-rc3 review

On 8/8/24 03:07, Guenter Roeck wrote:
> On 8/6/24 16:24, Thomas Gleixner wrote:
>> Cc+: Helge, parisc ML
>> 
>> We're chasing a weird failure which has been tracked down to the
>> placement of the division library functions (I assume they are imported
>> from libgcc).
>> 
>> See the thread starting at:
>> 
>>    https://lore.kernel.org/all/718b8afe-222f-4b3a-96d3-93af0e4ceff1@xxxxxxxxxxxx
>> 
>> On Tue, Aug 06 2024 at 21:25, Vlastimil Babka wrote:
>>> On 8/6/24 19:33, Thomas Gleixner wrote:
>>>>
>>>> So this change adds 16 bytes to __softirq() which moves the division
>>>> functions up by 16 bytes. That's all it takes to make the stupid go
>>>> away....
>>>
> >>> Heh I was actually wondering if the division is somehow messed up because
>>> maxobj = order_objects() and order_objects() does a division. Now I suspect
>>> it even more.
>> 
>> check_slab() calls into that muck, but I checked the disassembly of a
>> working and a broken kernel and the only difference there is the
>> displacement offset when the code calculates the call address, but
>> that's as expected a difference of 16 bytes.
>> 
>> Now it becomes interesting.
>> 
> >> I added an unused function after __do_softirq() into the softirq text
>> section and filled it with ASM nonsense so that it occupies exactly one
>> page. That moves $$divoI, which is what check_slab() calls, exactly one
>> page forward:
>> 
> 
> With the above added to my tree, I can also play around with the code.
> Here is the next weird one:
> 
> diff --git a/mm/slub.c b/mm/slub.c
> index 4927edec6a8c..b8a33966d858 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1385,6 +1385,9 @@ static int check_slab(struct kmem_cache *s, struct slab *slab)
>          }
> 
>          maxobj = order_objects(slab_order(slab), s->size);
> +
> +       pr_info_once("##### slab->objects=%u maxobj=%u\n", slab->objects, maxobj);
> +
>          if (slab->objects > maxobj) {
>                  slab_err(s, slab, "objects %u > max %u",
>                          slab->objects, maxobj);
> 
> results in:
> 
> ##### slab->objects=21 maxobj=21
> =============================================================================
> BUG kmem_cache_node (Not tainted): objects 21 > max 16

But were both lines printed by the same check_slab() call? pr_info_once()
prints only on its first invocation, so it might have fired for an earlier
slab that was fine, while the error came from a later call. Nothing would be
printed in between, as the kmalloc caches are created in a loop.
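
To illustrate the concern, here is a minimal userspace sketch, not kernel
code. The names fake_slab, sample_slabs and first_bad_slab() are made up, the
local flag only stands in for the static flag behind pr_info_once(), and the
assignment of the two logged value pairs to two different slabs is an
assumption, not something the log proves:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two values check_slab() compares. */
struct fake_slab {
	unsigned int objects;
	unsigned int maxobj;
};

/* Assumed scenario: the first slab is fine, a later one is not. */
const struct fake_slab sample_slabs[] = {
	{ .objects = 21, .maxobj = 21 },
	{ .objects = 21, .maxobj = 16 },
};

/*
 * Walk the slabs the way a loop over caches would. The debug line is
 * emitted only for the first slab seen (like pr_info_once()), the error
 * line for the first slab that fails the check. Returns the index of the
 * failing slab or -1, and stores in *logged_idx the index the debug line
 * was printed for.
 */
int first_bad_slab(const struct fake_slab *slabs, int n, int *logged_idx)
{
	bool printed = false;	/* models the "once" flag */

	assert(slabs && logged_idx);
	*logged_idx = -1;

	for (int i = 0; i < n; i++) {
		if (!printed) {
			printed = true;
			*logged_idx = i;
			printf("##### slab->objects=%u maxobj=%u\n",
			       slabs[i].objects, slabs[i].maxobj);
		}
		if (slabs[i].objects > slabs[i].maxobj) {
			printf("BUG: objects %u > max %u\n",
			       slabs[i].objects, slabs[i].maxobj);
			return i;
		}
	}
	return -1;
}
```

Calling first_bad_slab(sample_slabs, 2, &idx) prints the debug line for slab 0
and the error for slab 1, so the two adjacent lines describe different slabs.
In the real test, a plain pr_info(), or a print inside the error branch,
should show whether that is what happened.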

> As Thomas noticed, this only happens if the divide assembler code is within a certain
> address range.
> 
> Ok, now I am really lost.
> 
> Guenter
> 




