Re: [PATCH] mm: memcontrol: remove page_memcg()

On Thu, May 23, 2024 at 05:34:27PM GMT, Matthew Wilcox wrote:
> On Thu, May 23, 2024 at 08:41:25AM -0700, Shakeel Butt wrote:
> > On Thu, May 23, 2024 at 02:31:05PM +0100, Matthew Wilcox wrote:
> > > On Tue, May 21, 2024 at 12:29:39PM -0700, Shakeel Butt wrote:
> > > > On Tue, May 21, 2024 at 03:44:21PM +0100, Matthew Wilcox wrote:
> > > > > The memcg should not be attached to the individual pages that make up a
> > > > > vmalloc allocation.  Rather, it should be managed by the vmalloc
> > > > > allocation itself.  I don't have the knowledge to poke around inside
> > > > > vmalloc right now, but maybe somebody else could take that on.
> > > > 
> > > > Are you concerned about accessing just memcg or any field of the
> > > > sub-page? There are drivers accessing fields of pages allocated through
> > > > vmalloc. Some details at 3b8000ae185c ("mm/vmalloc: huge vmalloc backing
> > > > pages should be split rather than compound").
> > > 
> > > Thanks for the pointer, and fb_deferred_io_fault() is already on my
> > > hitlist for abusing struct page.
> > > 
> > > My primary concern is that we should track the entire allocation as a
> > > single object rather than tracking each page individually.  That means
> > > assigning the vmalloc allocation to a memcg rather than assigning each
> > > page to a memcg.  It's a lot less overhead to increment the counter once
> > > per allocation rather than once per page in the allocation!
> > > 
> > > But secondarily, yes, pages allocated by vmalloc probably don't need
> > > any per-page state, other than tracking the vmalloc allocation they're
> > > assigned to.  We'll see how that theory turns out.
> > 
> > I think the tricky part would be vmalloc having pages spanning multiple
> > nodes which is not an issue for MEMCG_VMALLOC stat but the vmap based
> > kernel stack (CONFIG_VMAP_STACK) metric NR_KERNEL_STACK_KB cares about
> > that information.
> 
> Yes, we'll have to handle mod_lruvec_page_state() differently since that
> stat is tracked per node.  Or we could stop tracking that stat per node.
> Is it useful to track it per node?  Why is it useful to track kernel
> stacks per node, but not track vmalloc allocations per node?

This is a good question. Other than the fact that there are user-visible
APIs (per-NUMA-node meminfo & memory.numa_stat), I don't have a good answer.
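
To make the trade-off in this thread concrete, here is a rough user-space
sketch, not kernel code. struct toy_memcg, struct toy_page and the three
helpers are hypothetical names invented for illustration; they only model
the idea of charging once per allocation versus once per page, and why a
per-node stat (like the kernel stack one) still needs to know which node
each page lives on.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 4   /* assumption: small fixed node count for the model */

/* Hypothetical stand-in for a cgroup's counters; not the kernel's layout. */
struct toy_memcg {
	long vmalloc_pages;                    /* node-agnostic stat */
	long stack_kb_per_node[TOY_MAX_NODES]; /* stat tracked per node */
	unsigned long counter_updates;         /* how often we touched a counter */
};

/* Hypothetical stand-in for a page: only the node id matters here. */
struct toy_page {
	int nid;
};

/* Per-page accounting: one counter update for every page. */
static void charge_per_page(struct toy_memcg *cg, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++) {
		cg->vmalloc_pages += 1;
		cg->counter_updates++;
	}
}

/* Per-allocation accounting: one counter update for the whole object. */
static void charge_per_allocation(struct toy_memcg *cg, size_t nr_pages)
{
	cg->vmalloc_pages += (long)nr_pages;
	cg->counter_updates++;
}

/*
 * A per-node stat cannot be updated in one step when the pages span
 * several nodes: we have to look at the node of each page (or keep a
 * per-node page count in the allocation itself).
 */
static void account_stack_per_node(struct toy_memcg *cg,
				   const struct toy_page *pages,
				   size_t nr_pages, long kb_per_page)
{
	for (size_t i = 0; i < nr_pages; i++) {
		assert(pages[i].nid >= 0 && pages[i].nid < TOY_MAX_NODES);
		cg->stack_kb_per_node[pages[i].nid] += kb_per_page;
	}
}
```

Both charge helpers end with the same vmalloc_pages total; only the
number of counter updates differs. The per-node helper is where the
multi-node case would need extra bookkeeping if per-page state goes away.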



