Re: [PATCH] btrfs: root memcgroup for metadata filemap_add_folio()

Christoph Hellwig <hch@xxxxxxxxxxxxx> · Tue, 1 Oct 2024 02:19:27 -0700

On Sat, Sep 28, 2024 at 02:15:56PM +0930, Qu Wenruo wrote:
> [BACKGROUND]
> The function filemap_add_folio() charges the memory cgroup,
> as we assume all page cache is accessible by user space processes
> and thus needs cgroup accounting.
> 
> However, btrfs is a special case: it has very large metadata thanks to
> its support of data csum (by default it's 4 bytes per 4K data, and can
> be as large as 32 bytes per 4K data).
> This means btrfs has to use the page cache for its metadata pages, to
> take advantage of both the caching and the reclaim ability of filemap.
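
For a rough sense of scale, the checksum figures quoted above can be
turned into a back-of-the-envelope calculation. This is only a sketch:
the helper name csum_bytes() and the 1 TiB data size are made up for
illustration, and it counts raw checksum payload only, ignoring any
b-tree node or item overhead.

```python
BLOCK_SIZE = 4 * 1024  # 4K data block, as in the quoted description


def csum_bytes(data_bytes, csum_size):
    """Raw checksum bytes needed to cover data_bytes of data.

    Hypothetical helper for illustration only; it ignores all
    tree/node overhead and counts just csum_size per 4K block.
    """
    blocks = -(-data_bytes // BLOCK_SIZE)  # ceiling division
    return blocks * csum_size


if __name__ == "__main__":
    TIB = 1 << 40  # assumed example size: 1 TiB of data
    for csum_size in (4, 32):
        gib = csum_bytes(TIB, csum_size) / (1 << 30)
        print(f"{csum_size}-byte csum: {gib:g} GiB of checksums per TiB")
```

Under those assumptions, 1 TiB of data is 2^28 blocks of 4K, so 4-byte
checksums come to about 1 GiB and 32-byte checksums to about 8 GiB.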

FYI, in general reclaim for metadata works much better with a shrinker
than through the pagecache, because it can be object based and
prioritized.

> [ENHANCEMENT]
> Instead of relying on __GFP_NOFAIL to avoid charge failure, use root
> memory cgroup to attach metadata pages.
> 
> This does need to export the symbol mem_root_cgroup for
> CONFIG_MEMCG, or define mem_root_cgroup as NULL for !CONFIG_MEMCG.
> 
> With root memory cgroup, we directly skip the charging part, and only
> rely on __GFP_NOFAIL for the real memory allocation part.

This looks pretty ugly.  What speaks against a version of
filemap_add_folio that doesn't charge the memcg?