On Sun, Feb 19, 2023 at 08:07:59AM +0000, Hyeonggon Yoo wrote:
> On Wed, Feb 01, 2023 at 08:06:37PM +0200, Mike Rapoport wrote:
> > Hi all,
>
> Hi Mike, I'm interested in this topic and hope to discuss it with you
> at LSF/MM/BPF.
>
> > To reduce the performance hit caused by the fragmentation of the
> > direct map, it makes sense to group and/or cache the base pages
> > removed from the direct map so that most of the base pages created
> > during a split of a large page will be consumed by users requiring
> > PTE level mappings.
>
> How much performance difference did you see in your tests when the
> direct map was fragmented, and is there a way to measure this
> difference?

I did some benchmarks a while ago with the entire direct map forced to
2M or 4K pages. The results I had are here:

https://docs.google.com/spreadsheets/d/1tdD-cu8e93vnfGsTFxZ5YdaEfs2E1GELlvWNOGkJV2U/edit?usp=sharing

Intel folks did more comprehensive testing and their results are here:

https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/

> > My current proposal is to have a cache of 2M pages close to the page
> > allocator and use a GFP flag to make allocation requests use that
> > cache. On the free() path, the pages that are mapped at PTE level
> > will be put into that cache.
>
> I would like to discuss not only having a cache layer of pages but
> also how the direct map could be merged back correctly and
> efficiently.
>
> I vaguely recall that Aaron Lu sent an RFC series about this and
> Kirill A. Shutemov's feedback was to batch the merge operations. [1]
>
> Also, a CPA API called by the cache layer to merge fragmented mappings
> would work for merging 4K pages into 2M mappings [2], but not for
> merging 2M mappings into 1G mappings.

One possible way is to make CPA scan all the PMDs in the enclosing 1G
page after merging a 2M page (rough sketch below). I'm not sure how
efficient that would be, though.

> At that time I didn't follow the later discussions (e.g.
> execmem_alloc()), so maybe I'm missing some points.
>
> [1] https://lore.kernel.org/linux-mm/20220809100408.rm6ofiewtty6rvcl@box
>
> [2] https://lore.kernel.org/linux-mm/YvfLxuflw2ctHFWF@xxxxxxxxxx

--
Sincerely yours,
Mike.
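
To make that concrete, a minimal sketch of what such a scan could look
like is below. This is only an illustration under the assumption that
it would be called from CPA right after a successful 4K->2M merge;
pmd_large_prot() and collapse_pud() are hypothetical helpers standing
in for "read this PMD if it is a 2M mapping" and "replace the 512 PMDs
with a single 1G PUD", not existing CPA code.

/*
 * Hypothetical sketch, not actual CPA code: after a 4K->2M merge of
 * the PMD covering @addr, check whether the whole 1G region is now
 * uniformly mapped with 2M pages and can be collapsed into one PUD.
 */
static void maybe_collapse_pud(unsigned long addr)
{
	unsigned long start = round_down(addr, PUD_SIZE);
	pgprot_t first = __pgprot(0), prot;
	int i;

	for (i = 0; i < PTRS_PER_PMD; i++) {
		/* bail out unless this entry is a 2M mapping */
		if (!pmd_large_prot(start + i * PMD_SIZE, &prot))
			return;
		if (i == 0)
			first = prot;
		/* all 512 PMDs must carry identical attributes */
		else if (pgprot_val(prot) != pgprot_val(first))
			return;
	}

	collapse_pud(start, first);	/* one 1G mapping for the range */
}

The obvious cost is the 512-entry scan on every successful 2M merge;
batching the merge operations, as suggested in [1], would at least
amortize that scan.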