On Sun, Feb 19, 2023 at 08:09:07PM +0200, Mike Rapoport wrote:
> On Sun, Feb 19, 2023 at 08:07:59AM +0000, Hyeonggon Yoo wrote:
> > On Wed, Feb 01, 2023 at 08:06:37PM +0200, Mike Rapoport wrote:
> > > Hi all,
> >
> > Hi Mike, I'm interested in this topic and hope to discuss this with you
> > at LSF/MM/BPF.
> >
> > > To reduce the performance hit caused by the fragmentation of the direct
> > > map, it makes sense to group and/or cache the base pages removed from the
> > > direct map so that the most of base pages created during a split of a large
> > > page will be consumed by users requiring PTE level mappings.
> >
> > How much performance difference did you see in your test when direct
> > map was fragmented, or is there a way to check this difference?
>
> I did some benchmarks a while ago with the entire direct map forced to 2M
> or 4k pages. The results I had are here:
>
> https://docs.google.com/spreadsheets/d/1tdD-cu8e93vnfGsTFxZ5YdaEfs2E1GELlvWNOGkJV2U/edit?usp=sharing
>
> Intel folks did more comprehensive testing and their results are here:
>
> https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/

Thanks!

Hmm, it might not be the best choice to unconditionally merge 2M mappings
into a 1G mapping. (Maybe that should be controlled via a boot parameter
or something.)

> > > My current proposal is to have a cache of 2M pages close to the page
> > > allocator and use a GFP flag to make allocation request use that cache. On
> > > the free() path, the pages that are mapped at PTE level will be put into
> > > that cache.
> >
> > I would like to discuss not only having cache layer of pages but also how
> > direct map could be merged correctly and efficiently.
> >
> > I vaguely recall that Aaron Lu sent RFC series about this and Kirill A.
> > Shutemov's feedback was to batch merge operations. [1]
> >
> > Also a CPA API called by the cache layer that could merge fragmented
> > mappings would work for merging 4K pages to 2M [2], but won't work
> > for merging 2M mappings to 1G mappings.
>
> One possible way is to make CPA scan all PMDs in 1G page after merging a 2M
> page. Not sure how efficient would it be though.

That seems to be similar to what Kirill A. Shutemov has tried; he may
have opinions about that. [3]

(Just to make sure I understand both ideas, I put rough pseudo-C sketches
of the 2M page cache and of the PMD scan at the end of this mail.)

[3] https://lore.kernel.org/lkml/20200416213229.19174-1-kirill.shutemov@xxxxxxxxxxxxxxx

> > At that time I didn't follow more discussions (e.g. execmem_alloc())
> > Maybe I'm missing some points.
> >
> > [1] https://lore.kernel.org/linux-mm/20220809100408.rm6ofiewtty6rvcl@box
> >
> > [2] https://lore.kernel.org/linux-mm/YvfLxuflw2ctHFWF@xxxxxxxxxx
>
> --
> Sincerely yours,
> Mike.
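
P.S. To make sure I read the 2M cache proposal correctly, here is a very
rough pseudo-C sketch of how I understand it. All the unmapped_* names,
the GFP flag wiring, and the refill policy are invented for illustration;
only alloc_pages(), split_page(), set_memory_4k() and page_address() are
existing (x86) interfaces, and locking/error handling are omitted:

/*
 * Rough sketch only: the unmapped_* names are invented and the refill
 * policy is just a guess at the proposal.
 */
struct unmapped_page_cache {
        spinlock_t lock;
        struct list_head free_4k;   /* base pages from already-split 2M blocks */
};

static struct page *unmapped_alloc(gfp_t gfp)
{
        unsigned int order = PMD_SHIFT - PAGE_SHIFT;    /* one 2M block */
        struct page *page;

        /* Prefer a base page whose 2M block was split earlier... */
        page = unmapped_cache_pop();                    /* invented helper */
        if (page)
                return page;

        /*
         * ...otherwise split one more 2M block and refill the cache, so
         * that one direct map split serves many PTE-level users.
         */
        page = alloc_pages(gfp, order);
        if (!page)
                return NULL;
        split_page(page, order);
        set_memory_4k((unsigned long)page_address(page), 1 << order);
        unmapped_cache_add(page + 1, (1 << order) - 1); /* invented helper */
        return page;
}

static void unmapped_free(struct page *page)
{
        /* free() path: a PTE-mapped page goes back to the cache, not the buddy */
        unmapped_cache_add(page, 1);
}

The interesting part is of course when the cache hands 2M blocks back and
the direct mapping gets merged again, which is where the scan below comes in.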
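
And for the 2M -> 1G direction, the scan I imagine CPA could do after a
successful 2M merge is roughly the following. Again just a sketch:
can_merge_pud() is an invented name and I'm hand-waving the actual
set_pud()/TLB flush side; the helpers used inside are the usual page
table accessors:

/*
 * Sketch: after merging one 2M range, check whether all 512 PMDs covering
 * the same 1G-aligned region are now uniform leaf mappings, physically
 * contiguous and with identical protections, so the whole range could be
 * collapsed into a single PUD-level mapping.
 */
static bool can_merge_pud(pud_t *pud, unsigned long addr)
{
        unsigned long start = addr & PUD_MASK;
        pmd_t *pmd = pmd_offset(pud, start);
        unsigned long expected_pfn = pmd_pfn(*pmd);
        pgprot_t prot = pmd_pgprot(*pmd);
        int i;

        /* The physical side must be 1G aligned as well. */
        if (!IS_ALIGNED(expected_pfn << PAGE_SHIFT, PUD_SIZE))
                return false;

        for (i = 0; i < PTRS_PER_PMD; i++, pmd++) {
                if (!pmd_leaf(*pmd))                    /* must be a 2M leaf */
                        return false;
                if (pmd_pfn(*pmd) != expected_pfn)      /* must be contiguous */
                        return false;
                if (pgprot_val(pmd_pgprot(*pmd)) != pgprot_val(prot))
                        return false;                   /* same protections */

                expected_pfn += PMD_SIZE >> PAGE_SHIFT;
        }

        /* All 512 PMDs look mergeable; the caller can install a 1G PUD. */
        return true;
}

Reading 512 entries per candidate merge doesn't look too bad by itself;
the batching discussed in [1] and the TLB flushing seem like the harder part.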