On Mon, Feb 20, 2023 at 02:43:03PM +0000, Hyeonggon Yoo wrote:
> On Sun, Feb 19, 2023 at 08:09:07PM +0200, Mike Rapoport wrote:
> > On Sun, Feb 19, 2023 at 08:07:59AM +0000, Hyeonggon Yoo wrote:
> 
> > > > My current proposal is to have a cache of 2M pages close to the page
> > > > allocator and use a GFP flag to make allocation requests use that cache. On
> > > > the free() path, the pages that are mapped at PTE level will be put into
> > > > that cache.
> > > 
> > > I would like to discuss not only having a cache layer of pages but also how
> > > the direct map could be merged correctly and efficiently.
> > > 
> > > I vaguely recall that Aaron Lu sent an RFC series about this and Kirill A.
> > > Shutemov's feedback was to batch merge operations. [1]
> > > 
> > > Also, a CPA API called by the cache layer that could merge fragmented
> > > mappings would work for merging 4K pages to 2M [2], but it won't work
> > > for merging 2M mappings to 1G mappings.
> > One possible way is to make CPA scan all PMDs in the 1G page after merging a 2M
> > page. Not sure how efficient it would be, though.
> 
> That seems to be similar to what Kirill A. Shutemov has tried.
> He may have opinions about that?
> 
> [3] https://lore.kernel.org/lkml/20200416213229.19174-1-kirill.shutemov@xxxxxxxxxxxxxxx

Kirill's patch attempted to restore a 1G page on each cpa_flush(), so it
scanned a lot of page tables without any guarantee that collapsing small
mappings to a large page was possible.

If we instead call a function that collapses a 2M mapping only when we know
for sure that the collapse is possible, it will be more efficient.

-- 
Sincerely yours,
Mike.
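
For illustration, a minimal, untested sketch of such a check, assuming the
x86-64 direct map; can_collapse_pmd() is a hypothetical helper, not an
existing kernel API, and the caller (e.g. the free path of the proposed 2M
page cache) would only ask CPA to perform the actual collapse when it
returns true:

/*
 * Hypothetical helper, not an existing kernel API: verify that a
 * 2M-aligned chunk of the direct map is entirely covered by present,
 * physically contiguous 4K PTEs with identical protections, so the
 * caller knows for sure that a collapse to a single PMD mapping is
 * possible before asking CPA to do it.
 */
static bool can_collapse_pmd(pmd_t *pmd, unsigned long addr)
{
	pte_t *pte = pte_offset_kernel(pmd, addr & PMD_MASK);
	pgprot_t prot = __pgprot(0);
	unsigned long pfn = 0;
	int i;

	for (i = 0; i < PTRS_PER_PTE; i++, pte++) {
		if (pte_none(*pte))
			return false;

		if (i == 0) {
			pfn = pte_pfn(*pte);
			prot = pte_pgprot(*pte);
			/* the physical side must be 2M-aligned as well */
			if (!IS_ALIGNED(pfn, PTRS_PER_PTE))
				return false;
			continue;
		}

		/*
		 * Physical contiguity and uniform protections are required;
		 * a real implementation would probably mask the A/D bits
		 * before comparing.
		 */
		if (pte_pfn(*pte) != pfn + i ||
		    pgprot_val(pte_pgprot(*pte)) != pgprot_val(prot))
			return false;
	}

	return true;
}

The 2M -> 1G case would presumably be the same loop one level up, over the
PMDs under a PUD, which is essentially the "scan all PMDs in the 1G page"
idea quoted above.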