Re: [PATCH -mm] mm: percpu: fix incorrect size in pcpu_obj_full_size()

On Sun, Feb 12, 2023 at 10:12:12PM +0800, Yafang Shao wrote:
> On Sat, Feb 11, 2023 at 6:39 AM Dennis Zhou <dennis@xxxxxxxxxx> wrote:
> >
> > Hello,
> >
> > On Fri, Feb 10, 2023 at 02:05:08PM -0800, Andrew Morton wrote:
> > > On Fri, 10 Feb 2023 15:49:47 +0000 Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > >
> > > > The extra space which is used to store the obj_cgroup membership is only
> > > > valid when kmemcg is enabled. The kmemcg can be disabled via the kernel
> > > > parameter "cgroup.memory=nokmem" at runtime.
> > > > This helper is also used in non-memcg code, for example the tracepoint,
> > > > so we should fix it.
> > > >
> > > > It was found by code review when I was implementing bpf memory usage[1].
> > > > No real issue happens in production environment.
> > > >
> > > > ...
> > > >
> > > > --- a/mm/percpu-internal.h
> > > > +++ b/mm/percpu-internal.h
> > > > @@ -4,6 +4,7 @@
> > > >
> > > >  #include <linux/types.h>
> > > >  #include <linux/percpu.h>
> > > > +#include <linux/memcontrol.h>
> > > >
> > > >  /*
> > > >   * pcpu_block_md is the metadata block struct.
> > > > @@ -125,7 +126,8 @@ static inline size_t pcpu_obj_full_size(size_t size)
> > > >     size_t extra_size = 0;
> > > >
> > > >  #ifdef CONFIG_MEMCG_KMEM
> > > > -   extra_size += size / PCPU_MIN_ALLOC_SIZE * sizeof(struct obj_cgroup *);
> > > > +   if (!mem_cgroup_kmem_disabled())
> > > > +           extra_size += size / PCPU_MIN_ALLOC_SIZE * sizeof(struct obj_cgroup *);
> > > >  #endif
> > > >
> > > >     return size * num_possible_cpus() + extra_size;
> > >
> >
> > Sorry I've been a bit mia...
> >
> > > Seems risky at the first look - enabling kmemcg at runtime will make
> > > prior calculations based on pcpu_obj_full_size() incorrect.  But as long
> > > as this is only used for accounting I guess that's OK.
> > >
> > > What happens if we do a bunch of allocations with kmemcg enabled, then
> > > disable kmemcg then free those allocations, or some such thing.  Does
> > > the accounting end up wrong?
> > >
> >
> > For now it works correctly because of 2 things. 1 - the function is only
> > called by accounting. 2 - the free path doesn't consult
> > mem_cgroup_kmem_disabled() but consults if a memcg exists for a percpu
> > allocation. If accounting is enabled, we'd always account the additional
> > memory for the memcg accounting. If it's not enabled, then percpu is
> > well unaccounted for.
> >
> > This function probably needs to be renamed a bit more carefully so it
> > doesn't bleed outside of mm/percpu.c.
> >
> 
> Do you have any suggestions on the new name ?
> 
> > In short, I don't think this change is correct.
> 
> Could you pls be more specific ?
> 

Hmmm I got ahead of myself. I misunderstood memcg_*_enabled() vs
memcg_*_disabled(). Roman clarified it just now in [1]. I was imagining
a world where we add disabled here and then eventually enabled would
propagate here too.

Another thing that was on my mind is whether a percpu object should be
charged for the memcg space even if it's not in use. I now think the
answer is yes, so for general accounting outside of memcg this function
is correct.
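
For readers following along, here is a rough userspace sketch of the
size calculation after the patch. It is not the kernel code: the helper
name obj_full_size() and its nr_cpus and kmem_disabled parameters are
made up for illustration (they stand in for num_possible_cpus() and
mem_cgroup_kmem_disabled()), sizeof(void *) stands in for
sizeof(struct obj_cgroup *), and the value 4 for PCPU_MIN_ALLOC_SIZE is
an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: minimum percpu allocation unit in bytes. */
#define PCPU_MIN_ALLOC_SIZE 4

/*
 * Hypothetical userspace model of pcpu_obj_full_size() with the patch
 * applied. nr_cpus and kmem_disabled are illustrative parameters that
 * replace num_possible_cpus() and mem_cgroup_kmem_disabled().
 */
static size_t obj_full_size(size_t size, unsigned int nr_cpus,
			    bool kmem_disabled)
{
	size_t extra_size = 0;

	assert(nr_cpus > 0);

	/* One pointer-sized slot per minimum unit, only when kmemcg is on. */
	if (!kmem_disabled)
		extra_size += size / PCPU_MIN_ALLOC_SIZE * sizeof(void *);

	return size * nr_cpus + extra_size;
}
```

Under these assumptions, a 64-byte object on 4 possible CPUs comes to
256 bytes with kmemcg disabled, and 256 bytes plus 16 pointer-sized
slots with it enabled.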

Acked-by: Dennis Zhou <dennis@xxxxxxxxxx>

Andrew, I have nothing queued. Do you mind picking this up?

[1] https://lore.kernel.org/linux-mm/20230213192922.1146370-1-roman.gushchin@xxxxxxxxx/T/#u

Thanks,
Dennis



