Re: [PATCH 6.4 000/292] 6.4.5-rc1 review

> On Jul 27, 2023, at 15:02, Muchun Song <muchun.song@xxxxxxxxx> wrote:
> 
> 
> 
>> On Jul 27, 2023, at 00:52, Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>> 
>> On Tue, Jul 25, 2023 at 6:21 PM Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>> 
>>> On Tue, Jul 25, 2023 at 3:39 PM Naresh Kamboju
>>> <naresh.kamboju@xxxxxxxxxx> wrote:
>>>> 
>>>> On Tue, 25 Jul 2023 at 17:22, Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>>>> 
>>>>> On Tue, Jul 25, 2023 at 11:59 AM Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>>>>> 
>>>>>> On Mon, Jul 24, 2023 at 2:10 PM Naresh Kamboju
>>>>>> <naresh.kamboju@xxxxxxxxxx> wrote:
>>>>>>> 
>>>>>>> On Mon, 24 Jul 2023 at 15:50, Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>>>>>>> 
>>>>>>>> On Sat, Jul 22, 2023 at 6:37 PM Linus Torvalds
>>>>>>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>> 
>>>>>>>>> [ Removed the stable reviewers, bringing in the kfence people ]
>>>>>>>>> 
>>>>>>>>> See
>>>>>>>>> 
>>>>>>>>> https://lore.kernel.org/lkml/CA+G9fYvgy22wiY=c3wLOrCM6o33636abhtEynXhJkqxJh4ca0A@xxxxxxxxxxxxxx/
>>>>>>>>> 
>>>>>>>>> for the original report. The warning was introduced in 8f0b36497303
>>>>>>>>> ("mm: kfence: fix objcgs vector allocation"), and Google doesn't find
>>>>>>>>> any other cases of this.
>>>>>>>>> 
>>>>>>>>> Anybody?
>>>>>>>>> 
>>>>>>>>>                   Linus
>>>>>>>>> 
>>>>>>>> 
>>> 
>>> Muchun, any chance you know under what circumstances a KFENCE object
>>> has its meta->objcg set to a non-NULL value?
>>> It seems to be a quite rare case, and I've only seen it in live
>>> radix_tree_node objects.
>>> Since the check here:
>>> https://elixir.bootlin.com/linux/latest/source/mm/kfence/core.c#L1097
>>> ensures that this value is NULL when the object is freed, where is the
>>> code that is supposed to zero it?
>>> Could there be a race somewhere?
>> 
>> 
>> I am still puzzled about what is going on.
>> 
>> As far as I can see, when the KFENCE pool is initialized, for the i-th
>> object page in the pool its page_slab()->memcg_data is set to a value
>> derived from kfence_metadata[i].objcg.
>> Because KFENCE objects always occupy one page, no two objects are
>> expected to share memcg_data at any time.
>> 
>> When slab_alloc_node() is called, it first invokes
>> slab_pre_alloc_hook(), figures out the obj_cgroup and charges it for
>> the allocated memory. The obj_cgroup is returned to slab_alloc_node()
>> and after KFENCE allocation succeeds is passed to
>> slab_post_alloc_hook(), which then writes obj_cgroup to
>> *(page_slab(object)->memcg_data).
>> 
>> When an object is deallocated, slab_free() calls
>> memcg_slab_free_hook(), which zeroes *(page_slab(object)->memcg_data)
>> and passes the object to kfence_free().
>> At this point the object's meta->objcg must be NULL, so the warning
>> should not be firing.
> 
> Totally agree. This call stack comes from slab_free(), which makes
> sure memcg_slab_free_hook() is called before kfence_free(), so
> meta->objcg must be NULL. Otherwise, it seems something is corrupted.
> So I really want to know the value of "meta->objcg" when the warning
> fires (e.g. whether it is a valid pointer, or whether the last bit is
> set with MEMCG_DATA_OBJCGS). Maybe we could improve the warning message,

Sorry for the confusion: meta->objcg should be an objcg pointer, so it
cannot have MEMCG_DATA_OBJCGS set.

> e.g. print the current value of "meta->objcg".
> 
> Thanks.
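
For illustration only, here is a minimal userspace sketch of the free-path
ordering described above and of the kind of extra diagnostic being
suggested. It is not kernel code: every model_* name and the
kfence_metadata_model struct are hypothetical stand-ins, and the flag value
assumes MEMCG_DATA_OBJCGS is the lowest bit of memcg_data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-ins; these are NOT the kernel's definitions. */
struct obj_cgroup { int dummy; };

struct kfence_metadata_model {
	struct obj_cgroup *objcg;	/* models meta->objcg */
};

/* Assumption: mirrors MEMCG_DATA_OBJCGS being the lowest bit. */
#define MODEL_MEMCG_DATA_OBJCGS 1UL

/* Models memcg_slab_free_hook(): clears the per-object objcg slot. */
static void model_memcg_slab_free_hook(struct kfence_metadata_model *meta)
{
	meta->objcg = NULL;
}

/* Reports whether the stored value carries the low flag bit. */
static int model_objcg_low_bit_set(const struct kfence_metadata_model *meta)
{
	return ((uintptr_t)meta->objcg & MODEL_MEMCG_DATA_OBJCGS) != 0;
}

/*
 * Models the check on the kfence_free() path. Returns 1 if the warning
 * would fire, and in that case writes the stale value into buf so the
 * report shows what was actually found, rather than only that it was
 * non-NULL.
 */
static int model_kfence_free_check(const struct kfence_metadata_model *meta,
				   char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (meta->objcg == NULL) {
		buf[0] = '\0';
		return 0;
	}

	snprintf(buf, len, "stale objcg=%p low_bit=%d",
		 (void *)meta->objcg, model_objcg_low_bit_set(meta));
	return 1;
}
```

In this model, calling model_memcg_slab_free_hook() before
model_kfence_free_check() always leaves the check silent, which matches the
expectation stated above. If the check fires anyway, the printed value
would help tell a plausible objcg pointer apart from corrupted data.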






