On Sun, Feb 19, 2023 at 08:07:59AM +0000, Hyeonggon Yoo wrote:
> On Wed, Feb 01, 2023 at 08:06:37PM +0200, Mike Rapoport wrote:
> > Hi all,
>
> Hi Mike, I'm interested in this topic and hope to discuss it with you
> at LSF/MM/BPF.
>
> > To reduce the performance hit caused by the fragmentation of the
> > direct map, it makes sense to group and/or cache the base pages
> > removed from the direct map so that most of the base pages created
> > during a split of a large page will be consumed by users requiring
> > PTE level mappings.
>
> How much performance difference did you see in your tests when the
> direct map was fragmented, and is there a way to measure this
> difference?

I did some benchmarks a while ago with the entire direct map forced to
2M or 4K pages. The results I had are here:

https://docs.google.com/spreadsheets/d/1tdD-cu8e93vnfGsTFxZ5YdaEfs2E1GELlvWNOGkJV2U/edit?usp=sharing

Intel folks did more comprehensive testing and their results are here:

https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/

> > My current proposal is to have a cache of 2M pages close to the page
> > allocator and use a GFP flag to make allocation requests use that
> > cache. On the free() path, the pages that are mapped at PTE level
> > will be put into that cache.
>
> I would like to discuss not only having a cache layer of pages but
> also how the direct map could be merged back correctly and
> efficiently.
>
> I vaguely recall that Aaron Lu sent an RFC series about this and
> Kirill A. Shutemov's feedback was to batch the merge operations. [1]
>
> Also, a CPA API called by the cache layer to merge fragmented mappings
> would work for merging 4K pages into 2M mappings [2], but not for
> merging 2M mappings into 1G mappings.

One possible way is to make CPA scan all the PMDs in the enclosing 1G
page after merging a 2M page (rough sketch below). I'm not sure how
efficient that would be, though.

> At that time I didn't follow the later discussions (e.g.
> execmem_alloc()), so maybe I'm missing some points.
>
> [1] https://lore.kernel.org/linux-mm/20220809100408.rm6ofiewtty6rvcl@box
>
> [2] https://lore.kernel.org/linux-mm/YvfLxuflw2ctHFWF@xxxxxxxxxx

--
Sincerely yours,
Mike.
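
To make that concrete, a minimal sketch of what such a scan could look
like is below. This is only an illustration under the assumption that
it would be called from CPA right after a successful 4K->2M merge;
pmd_large_prot() and collapse_pud() are hypothetical helpers standing
in for "read this PMD if it is a 2M mapping" and "replace the 512 PMDs
with a single 1G PUD", not existing CPA code.

/*
 * Hypothetical sketch, not actual CPA code: after a 4K->2M merge of
 * the PMD covering @addr, check whether the whole 1G region is now
 * uniformly mapped with 2M pages and can be collapsed into one PUD.
 */
static void maybe_collapse_pud(unsigned long addr)
{
	unsigned long start = round_down(addr, PUD_SIZE);
	pgprot_t first = __pgprot(0), prot;
	int i;

	for (i = 0; i < PTRS_PER_PMD; i++) {
		/* bail out unless this entry is a 2M mapping */
		if (!pmd_large_prot(start + i * PMD_SIZE, &prot))
			return;
		if (i == 0)
			first = prot;
		/* all 512 PMDs must carry identical attributes */
		else if (pgprot_val(prot) != pgprot_val(first))
			return;
	}

	collapse_pud(start, first);	/* one 1G mapping for the range */
}

The obvious cost is the 512-entry scan on every successful 2M merge;
batching the merge operations, as suggested in [1], would at least
amortize that scan.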