On Mon, Aug 5, 2024 at 6:33 AM Pedro Falcato <pedro.falcato@xxxxxxxxx> wrote: > > On Sun, Aug 4, 2024 at 9:33 PM Linus Torvalds > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote: > > > > On Sun, 4 Aug 2024 at 01:59, kernel test robot <oliver.sang@xxxxxxxxx> wrote: > > > > > > kernel test robot noticed a -4.4% regression of stress-ng.pagemove.page_remaps_per_sec on > > > commit 8be7258aad44 ("mseal: add mseal syscall") > > > > Ok, it's basically just the vma walk in can_modify_mm(): > > > > > 1.06 +0.1 1.18 perf-profile.self.cycles-pp.mas_next_slot > > > 1.50 +0.5 1.97 perf-profile.self.cycles-pp.mas_find > > > 0.00 +1.4 1.35 perf-profile.self.cycles-pp.can_modify_mm > > > 3.13 +2.0 5.13 perf-profile.self.cycles-pp.mas_walk > > > > and looks like it's two different pathways. We have __do_sys_mremap -> > > mremap_to -> do_munmap -> do_vmi_munmap -> can_modify_mm for the > > destination mapping, but we also have mremap_to() calling > > can_modify_mm() directly for the source mapping. > > > > And then do_vmi_munmap() will do it's *own* vma_find() after having > > done arch_unmap(). > > > > And do_munmap() will obviously do its own vma lookup as part of > > calling vma_to_resize(). > > > > So it looks like a large portion of this regression is because the > > mseal addition just ends up walking the vma list way too much. > > Can we rollback the upfront checks "funny business" and just call > can_modify_vma directly in relevant places? I still don't believe in > the partial mprotect/munmap "security risks" that were stated in the > mseal thread (and these operations can already fail for many other > reasons than mseal) :) > In-place check and extra loop, implemented properly, will both prevent changing to the sealed memory. However, extra loop will make attacker difficult to call munmap(0, random large-size), because if one of vma in the range is sealed, the whole operation will be no-op. > I don't mind taking a look myself, just want to make sure I'm not > stepping on anyone's toes here. > One thing that you can't walk around is that can_modify_mm must be called prior to arch_unmap, that means in-place check for the munmap is not possible. ( There are recent patch / refactor by Liam R. Howlett in this area, but I am not sure if this restriction is removed) > -- > Pedro