On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@xxxxxxxxx> wrote: > > > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on > > commit 8be7258aad44 ("mseal: add mseal syscall") > > Ok, it's basically just the vma walk in can_modify_mm(): > > > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot > > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find > > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm > > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk > > and looks like it's two different pathways. We have __do_sys_mremap -> > mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the > destination mapping, but we also have mremap_to() calling > can_modify_mm() directly for the source mapping. > > And then do_vmi_munmap() will do it's *own* vma_find() after having > done arch_unmap(). > > And do_munmap() will obviously do its own vma lookup as part of > calling vma_to_resize(). > > So it looks like a large portion of this regression is because the > mseal addition just ends up walking the vma list way too much. Can we rollback the upfront checks "funny business" and just call can_modify_vma directly in relevant places? I still don't believe in the partial mprotect/munmap "security risks" that were stated in the mseal thread (and these operations can already fail for many other reasons than mseal) :) I don't mind taking a look myself, just want to make sure I'm not stepping on anyone's toes here. -- Pedro