On Tue, Aug 6, 2024 at 2:28 PM Pedro Falcato <pedro.falcato@xxxxxxxxx> wrote:
>
> Optimize mseal checks by removing the separate can_modify_mm() step, and
> just doing checks on the individual vmas, when various operations are
> themselves iterating through the tree. This provides a nice speedup.
>
> While I was at it, I found that is_madv_discard() was completely bogus.
>
Thanks for catching this! Is it possible to separate this fix out from
this series, send it separately, and get it merged first?

> Note that my series ignores arch_unmap(), which seems to generally be what we're trending towards[2]. It should
> be applied on top of any powerpc vdso ->close patch to avoid regressions on the PPC architecture. No other
> architecture seems to use arch_unmap.
>
> Note2: This series does not pass all mseal_tests on my end (test_seal_mremap_move_dontunmap_anyaddr fails twice). But the
> top of Linus's tree does not pass these for me either (neither does my Arch Linux 6.10.2 kernel),
> for some reason (mremap regression?).
>
I just synced to Linus's main and was able to run the tests (except that
two pkeys-related tests are skipped because I'm on a VM).

> will-it-scale mmap1_process[1] -t 1 results:
>
> commit 3450fe2b574b4345e4296ccae395149e1a357fee:
>
> min:277605 max:277605 total:277605
> min:281784 max:281784 total:281784
> min:277238 max:277238 total:277238
> min:281761 max:281761 total:281761
> min:274279 max:274279 total:274279
> min:254854 max:254854 total:254854
> measurement
> min:269143 max:269143 total:269143
> min:270454 max:270454 total:270454
> min:243523 max:243523 total:243523
> min:251148 max:251148 total:251148
> min:209669 max:209669 total:209669
> min:190426 max:190426 total:190426
> min:231219 max:231219 total:231219
> min:275364 max:275364 total:275364
> min:266540 max:266540 total:266540
> min:242572 max:242572 total:242572
> min:284469 max:284469 total:284469
> min:278882 max:278882 total:278882
> min:283269 max:283269 total:283269
> min:281204 max:281204 total:281204
>
> After this patch set:
>
> min:280580 max:280580 total:280580
> min:290514 max:290514 total:290514
> min:291006 max:291006 total:291006
> min:290352 max:290352 total:290352
> min:294582 max:294582 total:294582
> min:293075 max:293075 total:293075
> measurement
> min:295613 max:295613 total:295613
> min:294070 max:294070 total:294070
> min:293193 max:293193 total:293193
> min:291631 max:291631 total:291631
> min:295278 max:295278 total:295278
> min:293782 max:293782 total:293782
> min:290361 max:290361 total:290361
> min:294517 max:294517 total:294517
> min:293750 max:293750 total:293750
> min:293572 max:293572 total:293572
> min:295239 max:295239 total:295239
> min:292932 max:292932 total:292932
> min:293319 max:293319 total:293319
> min:294954 max:294954 total:294954
>
> This was a Completely Unscientific test but seems to show there were around 5-10% gains on ops per second.
>
> [1]: mmap1_process does mmap and munmap in a loop. I didn't bother testing multithreading cases.
> [2]: https://lore.kernel.org/all/87o766iehy.fsf@mail.lhotse/
> Link: https://lore.kernel.org/all/202408041602.caa0372-oliver.sang@xxxxxxxxx/
>
> Pedro Falcato (7):
>   mm: Move can_modify_vma to mm/internal.h
>   mm/munmap: Replace can_modify_mm with can_modify_vma
>   mm/mprotect: Replace can_modify_mm with can_modify_vma
>   mm/mremap: Replace can_modify_mm with can_modify_vma
>   mseal: Fix is_madv_discard()
>   mseal: Replace can_modify_mm_madv with a vma variant
>   mm: Remove can_modify_mm()
>
>  mm/internal.h | 30 ++++++++++++++++------
>  mm/madvise.c  | 13 +++-------
>  mm/mmap.c     | 36 ++++++++++-----------------
>  mm/mprotect.c | 12 +++------
>  mm/mremap.c   | 33 ++++++------------------
>  mm/mseal.c    | 69 +++++++++++---------------------------------------
>  6 files changed, 63 insertions(+), 130 deletions(-)
>
> --
> 2.46.0
>
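
For anyone who wants a quick way to exercise the mmap/munmap path discussed in footnote [1], here is a minimal userspace sketch of that kind of loop. It is not the will-it-scale source: the function name mmap_munmap_loop() and its iterations/len parameters are made up for illustration, and it only counts completed map/unmap pairs rather than reporting ops per second.

```c
/* Illustrative sketch only; not taken from will-it-scale. */
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map and immediately unmap an anonymous private region `iterations`
 * times. Returns the number of completed map/unmap pairs, stopping
 * early if either call fails.
 */
static size_t mmap_munmap_loop(size_t iterations, size_t len)
{
	size_t done = 0;

	for (size_t i = 0; i < iterations; i++) {
		void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			break;
		if (munmap(p, len) != 0)
			break;
		done++;
	}

	return done;
}
```

Calling mmap_munmap_loop(1000000, 4096) from a small main() and timing it before and after the series should give a rough single-threaded comparison, with the same caveats about being unscientific as the numbers quoted above.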