On Sat, Jul 08, 2023 at 12:12:12PM -0700, Suren Baghdasaryan wrote: [..] > Lock VMAs of the parent process when forking a child, which prevents > concurrent page faults during fork operation and avoids this issue. > This fix can potentially regress some fork-heavy workloads. Kernel build > time did not show noticeable regression on a 56-core machine while a > stress test mapping 10000 VMAs and forking 5000 times in a tight loop > shows ~5% regression. If such fork time regression is unacceptable, > disabling CONFIG_PER_VMA_LOCK should restore its performance. Further > optimizations are possible if this regression proves to be problematic. > > --- > kernel/fork.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/kernel/fork.c b/kernel/fork.c > index b85814e614a5..d2e12b6d2b18 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, > for_each_vma(old_vmi, mpnt) { > struct file *file; > > + vma_start_write(mpnt); > if (mpnt->vm_flags & VM_DONTCOPY) { > vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt)); > continue; > I don't see it mentioned in the discussion, so at a risk of ruffling feathers or looking really bad I'm going to ask: is the locking of any use if the forking process is single-threaded? The singular thread in this case is occupied executing this very code, so it can't do any op in parallel. Is there anyone else who could trigger a page fault? Are these shared with other processes? Cursory reading suggests a private copy is made here, so my guess is no. But then again, I landed here freshly from the interwebz. Or in short: if nobody can mess up the state if the forking process is single-threaded, why not check for mm_users or whatever other indicator to elide the slowdown for the (arguably) most common case? If the state can be messed up anyway, that's a shame, but short explanation how would be welcome. to illustrate (totally untested): diff --git a/kernel/fork.c b/kernel/fork.c index d2e12b6d2b18..aac6b08a0b21 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -652,6 +652,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, LIST_HEAD(uf); VMA_ITERATOR(old_vmi, oldmm, 0); VMA_ITERATOR(vmi, mm, 0); + bool singlethread = READ_ONCE(oldmm->mm_users) == 1; uprobe_start_dup_mmap(); if (mmap_write_lock_killable(oldmm)) { @@ -686,7 +687,8 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm, for_each_vma(old_vmi, mpnt) { struct file *file; - vma_start_write(mpnt); + if (!singelthreaded) + vma_start_write(mpnt); if (mpnt->vm_flags & VM_DONTCOPY) { vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt)); continue;