Re: [PATCH 2/2] vmalloc: Account memcg per vmalloc

On Wed, Dec 11, 2024 at 08:20:36PM +0000, Matthew Wilcox wrote:
> On Wed, Dec 11, 2024 at 11:32:13AM -0800, Shakeel Butt wrote:
> > On Wed, Dec 11, 2024 at 04:50:39PM +0000, Matthew Wilcox wrote:
> > > Perhaps you'd be more persuaded by:
> > > 
> > > (a) If we clear __GFP_ACCOUNT then alloc_pages_bulk() will work, and
> > > that's a pretty significant performance win over calling alloc_pages()
> > > in a loop.
> > > 
> > > (b) Once we get to memdescs, calling alloc_pages() with __GFP_ACCOUNT
> > > set is going to require allocating a memdesc to store the obj_cgroup
> > > in, so in the future we'll save an allocation.
> > > 
> > > Your proposed alternative will work and is way less churn.  But it's
> > > not preparing us for memdescs ;-)
> > 
> > We can make alloc_pages_bulk() work with __GFP_ACCOUNT but your second
> > argument is more compelling.
> > 
> > I am trying to think of what will we miss if we remove this per-page
> > memcg metadata. One thing I can think of is debugging a live system
> > or kdump where I need to track where a given page came from. I think
> 
> Umm, I don't think you know which vmalloc allocation a page came from
> today?  I've sent patches to add that information before, but they were
> rejected. 

Do you have a link handy for that discussion?

> In fact, I don't think we know even _that_ a page belongs to
> vmalloc today, do we?  Yes, we know that the page is accounted, and
> which memcg it belongs to ... but nothing more.

Yes, you are correct. At the moment it is guesswork and an exhaustive
search through multiple sources.

> 
> I actually want to improve this, without adding additional overhead.
> What I'm working on right now (before I got waylaid by this bug) is:
> 
> +struct choir {
> +       struct kref refcount;
> +       unsigned int nr;
> +       struct page *pages[] __counted_by(nr);
> +};
> 
> and rewriting vmalloc to be based on choirs instead of its own pages.
> One thing I've come to realise today is that the obj_cgroup pointer
> needs to be in the choir and not in the vm_struct so that we uncharge the
> allocation when the choir refcount drops to 0, not when the allocation
> is unmapped.
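
To check my understanding of the lifetime rule, here is a minimal
userspace sketch, not kernel code. It assumes the charge is taken once
when the choir is created and dropped only on the final put. All the
names (memcg_model, choir_model, choir_alloc, choir_get, choir_put) and
the objcg field are hypothetical; a plain int stands in for struct kref
and a counter stands in for the obj_cgroup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct obj_cgroup: only tracks how many pages are charged. */
struct memcg_model {
	long charged_pages;
};

/* Userspace model of the choir; a plain int replaces struct kref. */
struct choir_model {
	int refcount;
	struct memcg_model *objcg;	/* hypothetical: owner of the charge */
	unsigned int nr;
	void *pages[];			/* stand-in for struct page *pages[] */
};

/* Allocate a choir of nr pages and charge them to cg in one step. */
static struct choir_model *choir_alloc(struct memcg_model *cg, unsigned int nr)
{
	struct choir_model *c = calloc(1, sizeof(*c) + nr * sizeof(c->pages[0]));

	if (!c)
		return NULL;
	c->refcount = 1;
	c->objcg = cg;
	c->nr = nr;
	cg->charged_pages += nr;
	return c;
}

/* Take an extra reference, e.g. for a second user of the same pages. */
static void choir_get(struct choir_model *c)
{
	assert(c->refcount > 0);
	c->refcount++;
}

/* Drop a reference. Returns 1 if this put released the choir, else 0. */
static int choir_put(struct choir_model *c)
{
	assert(c->refcount > 0);
	if (--c->refcount)
		return 0;
	/* Uncharge only here, on the last put, not when a mapping goes away. */
	c->objcg->charged_pages -= c->nr;
	free(c);
	return 1;
}
```

In this model, dropping the mapping's reference while another holder
still has one leaves the charge in place, which is the case I am asking
about below.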

What/who else can take a reference on a choir?

> 
> A regular choir allocation will (today) mark the pages in it as being
> allocated to a choir (and thus not having their own refcount / mapcount),
> but I'll give vmalloc a way to mark the pages as specifically being
> from vmalloc.

This sounds good. Thanks for the awesome work.



