Re: [PATCH] mm: memcontrol: remove page_memcg()

On Thu, May 23, 2024 at 08:41:25AM -0700, Shakeel Butt wrote:
> On Thu, May 23, 2024 at 02:31:05PM +0100, Matthew Wilcox wrote:
> > On Tue, May 21, 2024 at 12:29:39PM -0700, Shakeel Butt wrote:
> > > On Tue, May 21, 2024 at 03:44:21PM +0100, Matthew Wilcox wrote:
> > > > The memcg should not be attached to the individual pages that make up a
> > > > vmalloc allocation.  Rather, it should be managed by the vmalloc
> > > > allocation itself.  I don't have the knowledge to poke around inside
> > > > vmalloc right now, but maybe somebody else could take that on.
> > > 
> > > Are you concerned about accessing just memcg or any field of the
> > > sub-page? There are drivers accessing fields of pages allocated through
> > > vmalloc. Some details at 3b8000ae185c ("mm/vmalloc: huge vmalloc backing
> > > pages should be split rather than compound").
> > 
> > Thanks for the pointer, and fb_deferred_io_fault() is already on my
> > hitlist for abusing struct page.
> > 
> > My primary concern is that we should track the entire allocation as a
> > single object rather than tracking each page individually.  That means
> > assigning the vmalloc allocation to a memcg rather than assigning each
> > page to a memcg.  It's a lot less overhead to increment the counter once
> > per allocation rather than once per page in the allocation!
> > 
> > But secondarily, yes, pages allocated by vmalloc probably don't need
> > any per-page state, other than tracking the vmalloc allocation they're
> > assigned to.  We'll see how that theory turns out.
> 
> I think the tricky part would be vmalloc having pages spanning multiple
> nodes which is not an issue for MEMCG_VMALLOC stat but the vmap based
> kernel stack (CONFIG_VMAP_STACK) metric NR_KERNEL_STACK_KB cares about
> that information.

Yes, we'll have to handle mod_lruvec_page_state() differently since that
stat is tracked per node.  Or we could stop tracking that stat per node.
Is it useful to track it per node?  Why is it useful to track kernel
stacks per node, but not track vmalloc allocations per node?
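
To make the overhead argument quoted above concrete, here is a toy
userspace sketch, not kernel code.  Every name in it (toy_counter,
toy_charge, charge_per_page, charge_per_allocation) is hypothetical and
exists only for illustration; it assumes nothing about how memcg or
vmalloc actually implement charging.  It only models the difference
between updating a counter once per page and once per allocation.

```c
#include <assert.h>

/* Hypothetical stand-in for an accounting counter (not a real memcg). */
struct toy_counter {
	unsigned long pages;	/* pages currently charged */
	unsigned long updates;	/* how many times the counter was touched */
};

/* One counter update that charges nr_pages pages. */
void toy_charge(struct toy_counter *c, unsigned long nr_pages)
{
	c->pages += nr_pages;
	c->updates++;
}

/* Per-page tracking: the counter is touched once for every page. */
void charge_per_page(struct toy_counter *c, unsigned long nr_pages)
{
	unsigned long i;

	assert(nr_pages > 0);
	for (i = 0; i < nr_pages; i++)
		toy_charge(c, 1);
}

/* Per-allocation tracking: the counter is touched once in total. */
void charge_per_allocation(struct toy_counter *c, unsigned long nr_pages)
{
	assert(nr_pages > 0);
	toy_charge(c, nr_pages);
}
```

Both helpers leave the same number of pages charged; only the number of
counter updates differs (nr_pages versus one).  A single counter like
this also carries no per-node information, which is the part that a
per-node stat would need handled separately.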



