> On Jul 27, 2023, at 00:52, Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>
> On Tue, Jul 25, 2023 at 6:21 PM Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>
>> On Tue, Jul 25, 2023 at 3:39 PM Naresh Kamboju
>> <naresh.kamboju@xxxxxxxxxx> wrote:
>>>
>>> On Tue, 25 Jul 2023 at 17:22, Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>>>
>>>> On Tue, Jul 25, 2023 at 11:59 AM Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>>>>
>>>>> On Mon, Jul 24, 2023 at 2:10 PM Naresh Kamboju
>>>>> <naresh.kamboju@xxxxxxxxxx> wrote:
>>>>>>
>>>>>> On Mon, 24 Jul 2023 at 15:50, Alexander Potapenko <glider@xxxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On Sat, Jul 22, 2023 at 6:37 PM Linus Torvalds
>>>>>>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> [ Removed the stable reviewers, bringing in the kfence people ]
>>>>>>>>
>>>>>>>> See
>>>>>>>>
>>>>>>>> https://lore.kernel.org/lkml/CA+G9fYvgy22wiY=c3wLOrCM6o33636abhtEynXhJkqxJh4ca0A@xxxxxxxxxxxxxx/
>>>>>>>>
>>>>>>>> for the original report. The warning was introduced in 8f0b36497303
>>>>>>>> ("mm: kfence: fix objcgs vector allocation"), and Google doesn't find
>>>>>>>> any other cases of this.
>>>>>>>>
>>>>>>>> Anybody?
>>>>>>>>
>>>>>>>> Linus
>>>>>>>>
>>>>>>>
>>
>> Muchun, any chance you know under what circumstances a KFENCE object
>> has its meta->objcg set to a non-NULL value?
>> It seems to be a quite rare case, and I've only seen it in live
>> radix_tree_node objects.
>> Since the check here:
>> https://elixir.bootlin.com/linux/latest/source/mm/kfence/core.c#L1097
>> ensures that this value is NULL when the object is freed, where is the
>> code that is supposed to zero it?
>> Could there be a race somewhere?
>
>
> I am still puzzled about what is going on.
>
> As far as I can see, when KFENCE pool is initialized, for ith object
> page in the pool its page_slab()->memcg_data is set to a value derived
> from kfence_metadata[i].objcg
> Because KFENCE objects always occupy one page, no two objects are
> expected to share memcg_data at any time.
>
> When slab_alloc_node() is called, it first invokes
> slab_pre_alloc_hook(), figures out the obj_cgroup and charges it for
> the allocated memory. The obj_cgroup is returned to slab_alloc_node()
> and after KFENCE allocation succeeds is passed to
> slab_post_alloc_hook(), which then writes obj_cgroup to
> *(page_slab(object)->memcg_data).
>
> When an object is deallocated, slab_free() calls
> memcg_slab_free_hook(), which zeroes *(page_slab(object)->memcg_data)
> and passes the object to kfence_free().
> At this point the object's meta->objcg must be NULL, so the warning
> should not be firing.

I totally agree, at least on this point. This call stack comes from
slab_free(), which makes sure memcg_slab_free_hook() is called before
kfence_free(), so meta->objcg must be NULL. Otherwise, it seems
something is corrupted. So I really want to know the value of
"meta->objcg" when the warning fires (e.g. whether it is a valid
pointer, or whether its last bit is set with MEMCG_DATA_OBJCGS). Maybe
we could improve the warning message, e.g. print the current value of
"meta->objcg".

Thanks.
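
A minimal user-space sketch of what such an improved message might
decode. It is not kernel code and has not been checked against any
tree. The names sketch_objcg_kind, sketch_classify_objcg(),
sketch_kind_name() and sketch_format_objcg() are hypothetical, and
SKETCH_MEMCG_DATA_OBJCGS assumes that MEMCG_DATA_OBJCGS is bit 0 of
memcg_data.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: MEMCG_DATA_OBJCGS is bit 0 of memcg_data; verify in your tree. */
#define SKETCH_MEMCG_DATA_OBJCGS ((uintptr_t)1 << 0)

/* Hypothetical classification of a meta->objcg value seen at free time. */
enum sketch_objcg_kind {
	SKETCH_OBJCG_NULL,	/* expected: the free hook cleared it */
	SKETCH_OBJCG_TAGGED,	/* low bit set: looks like a memcg_data value */
	SKETCH_OBJCG_POINTER,	/* non-NULL and untagged: maybe a stale pointer */
};

static enum sketch_objcg_kind sketch_classify_objcg(uintptr_t objcg)
{
	if (objcg == 0)
		return SKETCH_OBJCG_NULL;
	if (objcg & SKETCH_MEMCG_DATA_OBJCGS)
		return SKETCH_OBJCG_TAGGED;
	return SKETCH_OBJCG_POINTER;
}

static const char *sketch_kind_name(enum sketch_objcg_kind kind)
{
	switch (kind) {
	case SKETCH_OBJCG_NULL:
		return "NULL";
	case SKETCH_OBJCG_TAGGED:
		return "tagged with MEMCG_DATA_OBJCGS";
	default:
		return "untagged pointer";
	}
}

/* Format the extra diagnostic line; returns whatever snprintf() returns. */
static int sketch_format_objcg(char *buf, size_t len, uintptr_t objcg)
{
	return snprintf(buf, len, "kfence: meta->objcg=0x%lx (%s)",
			(unsigned long)objcg,
			sketch_kind_name(sketch_classify_objcg(objcg)));
}
```

In the kernel, the same idea would presumably amount to printing
meta->objcg next to the existing check in mm/kfence/core.c. That
placement is an assumption, not something confirmed in this thread.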