* Yin, Fengwei <fengwei.yin@xxxxxxxxx> [221221 20:19]:
>
> On 12/22/2022 12:45 AM, Yang Shi wrote:
> >> We caught two mmap1 regressions on mailine, please see the data below:
> >>
> >> 830b3c68c1fb1 Linux 6.1  2085 2355 2088
> >> 76dcd734eca23 Linux 6.1-rc8  2093 2082 2094 2073 2304 2088
> >> 0ba09b1733878 Revert "mm: align larger anonymous mappings on THP boundaries"  2124 2286 2086 2114 2065 2081
> >> 23393c6461422 char: tpm: Protect tpm_pm_suspend with locks  2756 2711 2689 2696 2660 2665
> >> b7b275e60bcd5 Linux 6.1-rc7  2670 2656 2720 2691 2667
> >> ...
> >> 9abf2313adc1c Linux 6.1-rc1  2725 2717 2690 2691 2710
> >> 3b0e81a1cdc9a mmap: change zeroing of maple tree in __vma_adjust()  2736 2781 2748
> >> 524e00b36e8c5 mm: remove rb tree.  2747 2744 2747
> >> 0c563f1480435 proc: remove VMA rbtree use from nommu
> >> d0cf3dd47f0d5 damon: convert __damon_va_three_regions to use the VMA iterator
> >> 3499a13168da6 mm/mmap: use maple tree for unmapped_area{_topdown}
> >> 7fdbd37da5c6f mm/mmap: use the maple tree for find_vma_prev() instead of the rbtree
> >> f39af05949a42 mm: add VMA iterator
> >> d4af56c5c7c67 mm: start tracking VMAs with maple tree
> >> e15e06a839232 lib/test_maple_tree: add testing for maple tree  4638 4628 4502
> >> 9832fb87834e2 mm/demotion: expose memory tier details via sysfs  4625 4509 4548
> >> 4fe89d07dcc28 Linux 6.0  4385 4205 4348 4228 4504
> >>
> >>
> >> The first regression was between v6.0 and v6.1-rc1. The score dropped
> >> from 4600 to 2700, and bisected to the patches switching from rb tree to
> >> maple tree. This was reported at
> >> https://lore.kernel.org/oe-lkp/202212191714.524e00b3-yujie.liu@xxxxxxxxx/
> >> Thanks for the explanation that it is an expected regression as a trade
> >> off to benefit read performance.
> >>
> >> The second regression was between v6.1-rc7 and v6.1-rc8. The score
> >> dropped from 2700 to 2100, and bisected to this "Revert "mm: align larger
> >> anonymous mappings on THP boundaries"" commit.
> > So it means "mm: align larger anonymous mappings on THP boundaries"
> > actually improved the mmap1 benchmark? But it caused regression for
> > other usecase, for example, building kernel with clang, which is a
> > regression for a real life usecase.
> Yes. The patch "mm: align larger anonymous mappings on THP boundaries"
> can improve the mmap1 benchmark.
>

If the aligned VMAs cannot be merged, then they do not need to be split
on freeing.  This means we are just allocating a new VMA, writing it to
the tree, removing it from the tree, and freeing the VMA.  We can do
this 4600 times a second, apparently.

If the VMAs do get merged, we will go through __vma_adjust() to expand a
boundary, write it to the tree, allocate a new VMA, __vma_adjust() the
VMA boundary back, insert the new VMA that covers the boundary area,
remove the new VMA from the tree, and free the VMA.  We can only do this
2700 times a second.  Note this is writing 3 times to the tree in this
loop vs 2 in the other option.

So yes, merging/splitting is more work and always has been.  We are
doing this to avoid having too many VMAs, though.  There really isn't a
good reason an application would do this for any meaningful number of
iterations.

> For building kernel regression, looks like it's not related with the
> patch "mm: align larger anonymous mappings on THP boundaries" directly.
> It's another existing behavior more visible with the patch.
> https://lore.kernel.org/all/a4bcddad-e56f-cedc-891a-916e86d9a02c@xxxxxxxxx/
>

Having a snapshot of the VMA layout would help here, since the THP
boundary alignment may change whether the VMAs can be merged.  I suspect
that with the alignment the VMAs are not able to merge and are
fragmenting the VMA space, which would speed up this benchmark at the
expense of having more VMAs.

Thanks,
Liam
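
One possible way to take the VMA layout snapshot mentioned above is to
read /proc/<pid>/maps while the benchmark runs and count how many
anonymous private VMAs exist and how many start on a THP boundary.  The
sketch below is only an illustration, not something from this thread:
the helper names (parse_maps, summarize, snapshot) are hypothetical, and
THP_SIZE assumes a 2 MiB PMD-sized huge page, which is not true on
every architecture.

```python
from typing import Dict, List, NamedTuple

# Assumption: 2 MiB PMD-sized THP (typical on x86-64; differs elsewhere).
THP_SIZE = 2 * 1024 * 1024


class Vma(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text: str) -> List[Vma]:
    """Parse the text of /proc/<pid>/maps into one Vma per line."""
    vmas = []
    for line in text.splitlines():
        # Fields: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        vmas.append(Vma(int(start_s, 16), int(end_s, 16), fields[1], path))
    return vmas


def summarize(vmas: List[Vma]) -> Dict[str, int]:
    """Count VMAs, unnamed private mappings, and THP-aligned ones."""
    # Unnamed private mappings only; [heap], [stack] etc. carry a name
    # in the pathname column and are deliberately left out here.
    anon = [v for v in vmas if v.path == "" and v.perms.endswith("p")]
    aligned = [v for v in anon if v.start % THP_SIZE == 0]
    return {
        "total": len(vmas),
        "anon_private": len(anon),
        "anon_thp_aligned": len(aligned),
    }


def snapshot(pid: str = "self") -> Dict[str, int]:
    """Summarize the current VMA layout of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return summarize(parse_maps(f.read()))
```

Calling snapshot() with the PID of a benchmark process on kernels with
and without the alignment patch, and comparing the counts, should show
whether the mappings are being merged or left as separate VMAs.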