On Tue, Sep 29, 2020 at 02:58:18PM -0700, Mike Kravetz wrote:
> On 9/15/20 5:59 AM, Muchun Song wrote:
> > Hi all,
> >
> > This patch series frees some of the vmemmap pages (struct page
> > structures) associated with each hugetlb page when it is preallocated,
> > to save memory.
> ...
> > The mapping of the first page (index 0) and the second page (index 1)
> > is unchanged.  The remaining 6 pages are all mapped to the same page
> > (index 1).  So we only need 2 pages for the vmemmap area and can free
> > 6 pages back to the buddy system to save memory.  Why can we do this?
> > Because the contents of all the pages except the first are usually
> > the same.
> >
> > When a hugetlb page is freed to the buddy system, we must allocate
> > 6 pages for the vmemmap and restore the previous mapping.
> >
> > If we use 1G hugetlb pages, we can save 4095 pages per hugetlb page.
> > This is a very substantial gain.  On our servers, we run SPDK
> > applications which use 300GB of hugetlb pages.  With this feature
> > enabled, we can save 4797MB of memory.
>
> At a high level this seems like a reasonable optimization for hugetlb
> pages.  It is possible because hugetlb pages are 'special' and mostly
> handled differently than pages in normal mm paths.
>
> The majority of the new code is hugetlb specific, so it should not be
> of too much concern for the general mm code paths.  I'll start looking
> closer at the series.  However, if anyone has high level concerns,
> please let us know.  The only 'potential' conflict I am aware of is the
> discussion about supporting double mapping of hugetlb pages.

Not on x86, but architectures which have dcache coherency issues sometimes
use PG_arch_1 on the subpages.  I think it would be wise to map pages 1-7
read-only to catch this, as well as any future change which causes subpage
bits to get set.
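
Roughly the kind of thing I mean, as a minimal sketch rather than the code
in this series (the helpers remap_tail_vmemmap() and vmemmap_pte() are made
up here, and it assumes the vmemmap range for the hugetlb page is already
mapped at PTE granularity):

#include <linux/mm.h>
#include <linux/pgtable.h>
#include <asm/tlbflush.h>

/* 2MB hugetlb page: 512 struct pages * 64 bytes = 8 vmemmap pages. */
#define VMEMMAP_PAGES_PER_HPAGE	8

/* Walk the kernel page tables down to the PTE mapping @addr. */
static pte_t *vmemmap_pte(unsigned long addr)
{
	pgd_t *pgd = pgd_offset_k(addr);
	p4d_t *p4d = p4d_offset(pgd, addr);
	pud_t *pud = pud_offset(p4d, addr);
	pmd_t *pmd = pmd_offset(pud, addr);

	return pte_offset_kernel(pmd, addr);
}

/*
 * Remap vmemmap pages 1-7 of one 2MB hugetlb page onto the page that
 * currently backs index 1, read-only.  The caller would record and
 * free the pages that previously backed indexes 2-7.  Any stray write
 * to a tail struct page (e.g. an arch setting PG_arch_1 on a subpage)
 * then faults instead of silently hitting the shared page.
 */
static void remap_tail_vmemmap(unsigned long start)
{
	struct page *reuse = pte_page(*vmemmap_pte(start + PAGE_SIZE));
	int i;

	for (i = 1; i < VMEMMAP_PAGES_PER_HPAGE; i++) {
		unsigned long addr = start + i * PAGE_SIZE;

		set_pte_at(&init_mm, addr, vmemmap_pte(addr),
			   mk_pte(reuse, PAGE_KERNEL_RO));
	}
	flush_tlb_kernel_range(start,
			       start + VMEMMAP_PAGES_PER_HPAGE * PAGE_SIZE);
}

Restoring the mapping when the hugetlb page goes back to the buddy
allocator would be the mirror image: allocate 6 fresh pages, copy the
shared page into them, and re-establish writable PTEs.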