On Tue, Sep 29, 2020 at 02:58:18PM -0700, Mike Kravetz wrote:
> On 9/15/20 5:59 AM, Muchun Song wrote:
> > Hi all,
> >
> > This patch series frees some of the vmemmap pages (struct page
> > structures) associated with each hugetlb page when it is preallocated,
> > to save memory.
> ...
> > The mapping of the first page (index 0) and the second page (index 1)
> > is unchanged.  The remaining 6 pages are all mapped to the same page
> > (index 1).  So we only need 2 pages for the vmemmap area and can free
> > 6 pages back to the buddy system to save memory.  Why can we do this?
> > Because the contents of all the pages except the first are usually
> > the same.
> >
> > When a hugetlb page is freed to the buddy system, we must allocate
> > 6 pages for the vmemmap and restore the previous mapping.
> >
> > If we use 1G hugetlb pages, we can save 4095 pages per hugetlb page.
> > This is a very substantial gain.  On our servers, we run SPDK
> > applications which use 300GB of hugetlb pages.  With this feature
> > enabled, we can save 4797MB of memory.
>
> At a high level this seems like a reasonable optimization for hugetlb
> pages.  It is possible because hugetlb pages are 'special' and mostly
> handled differently than pages in normal mm paths.
>
> The majority of the new code is hugetlb specific, so it should not be
> of too much concern for the general mm code paths.  I'll start looking
> closer at the series.  However, if anyone has high level concerns,
> please let us know.  The only 'potential' conflict I am aware of is the
> discussion about supporting double mapping of hugetlb pages.

Not on x86, but architectures which have dcache coherency issues sometimes
use PG_arch_1 on the subpages.  I think it would be wise to map pages 1-7
read-only to catch this, as well as any future change which causes subpage
bits to get set.
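
Roughly the kind of thing I mean, as a minimal sketch rather than the code
in this series (the helpers remap_tail_vmemmap() and vmemmap_pte() are made
up here, and it assumes the vmemmap range for the hugetlb page is already
mapped at PTE granularity):

#include <linux/mm.h>
#include <linux/pgtable.h>
#include <asm/tlbflush.h>

/* 2MB hugetlb page: 512 struct pages * 64 bytes = 8 vmemmap pages. */
#define VMEMMAP_PAGES_PER_HPAGE	8

/* Walk the kernel page tables down to the PTE mapping @addr. */
static pte_t *vmemmap_pte(unsigned long addr)
{
	pgd_t *pgd = pgd_offset_k(addr);
	p4d_t *p4d = p4d_offset(pgd, addr);
	pud_t *pud = pud_offset(p4d, addr);
	pmd_t *pmd = pmd_offset(pud, addr);

	return pte_offset_kernel(pmd, addr);
}

/*
 * Remap vmemmap pages 1-7 of one 2MB hugetlb page onto the page that
 * currently backs index 1, read-only.  The caller would record and
 * free the pages that previously backed indexes 2-7.  Any stray write
 * to a tail struct page (e.g. an arch setting PG_arch_1 on a subpage)
 * then faults instead of silently hitting the shared page.
 */
static void remap_tail_vmemmap(unsigned long start)
{
	struct page *reuse = pte_page(*vmemmap_pte(start + PAGE_SIZE));
	int i;

	for (i = 1; i < VMEMMAP_PAGES_PER_HPAGE; i++) {
		unsigned long addr = start + i * PAGE_SIZE;

		set_pte_at(&init_mm, addr, vmemmap_pte(addr),
			   mk_pte(reuse, PAGE_KERNEL_RO));
	}
	flush_tlb_kernel_range(start,
			       start + VMEMMAP_PAGES_PER_HPAGE * PAGE_SIZE);
}

Restoring the mapping when the hugetlb page goes back to the buddy
allocator would be the mirror image: allocate 6 fresh pages, copy the
shared page into them, and re-establish writable PTEs.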