On Thu, Oct 17, 2024 at 01:41:15PM -0400, Liam R. Howlett wrote: > * Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> [241017 10:31]: > > We know in advance that do_brk_flags() wants only to perform a VMA > > expansion (if the prior VMA is compatible), and that we assume no mergeable > > VMA follows it. > > > > These are the semantics of this function prior to the recent rewrite of the > > VMA merging logic, however we are now doing more work than necessary - > > positioning the VMA iterator at the prior VMA and performing tasks that are > > not required. > > > > Add a new field to the vmg struct to permit merge flags and add a new merge > > flag VMG_FLAG_JUST_EXPAND which implies this behaviour, and have > > do_brk_flags() use this. > > Funny, I was thinking we could do this for relocate_vma_down() as well, > bu that's expanding in the wrong direction so we'd have to add > VMG_FLAG_JUST_EXPAND_DOWN, or the like. I'm not sure it's worth doing > since the expand down happens a lot less often. Yeah I think we have to carefully profile these and go case-by-case. My initial series to fix this made things worse :P as usual in perf developer instinct (in this case mine) is often miles off. > > > > > This fixes a reported performance regression in a brk() benchmarking suite. > > > > Reported-by: kernel test robot <oliver.sang@xxxxxxxxx> > > Closes: https://lore.kernel.org/linux-mm/202409301043.629bea78-oliver.sang@xxxxxxxxx > > Fixes: cacded5e42b9 ("mm: avoid using vma_merge() for new VMAs") > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > > Reviewed-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> > > > --- > > mm/mmap.c | 3 ++- > > mm/vma.c | 23 +++++++++++++++-------- > > mm/vma.h | 14 ++++++++++++++ > > 3 files changed, 31 insertions(+), 9 deletions(-) > > > > diff --git a/mm/mmap.c b/mm/mmap.c > > index dd4b35a25aeb..4a13d9f138f6 100644 > > --- a/mm/mmap.c > > +++ b/mm/mmap.c > > @@ -1744,7 +1744,8 @@ static int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma, > > VMG_STATE(vmg, mm, vmi, addr, addr + len, flags, PHYS_PFN(addr)); > > > > vmg.prev = vma; > > - vma_iter_next_range(vmi); > > + /* vmi is positioned at prev, which this mode expects. */ > > + vmg.merge_flags = VMG_FLAG_JUST_EXPAND; > > > > if (vma_merge_new_range(&vmg)) > > goto out; > > diff --git a/mm/vma.c b/mm/vma.c > > index 4737afcb064c..b21ffec33f8e 100644 > > --- a/mm/vma.c > > +++ b/mm/vma.c > > @@ -917,6 +917,7 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg) > > pgoff_t pgoff = vmg->pgoff; > > pgoff_t pglen = PHYS_PFN(end - start); > > bool can_merge_left, can_merge_right; > > + bool just_expand = vmg->merge_flags & VMG_FLAG_JUST_EXPAND; > > > > mmap_assert_write_locked(vmg->mm); > > Could we detect the wrong location by ensuring that the vma iterator is > positioned at prev? > > VM_WARN_ON(just_expand && prev && prev->vm_end != vma_iter_end(vmg->vmi); > > or, since vma_iter_addr is used above already.. > > VM_WARN_ON(just_expand && prev && prev->start != vma_iter_addr(vmg->vmi); > > > Does it make sense to just expand without a prev? Should that be > checked separately? > > Anyways, I think it's safer to keep these checks out of the regression > fix, but maybe better to add them later? Yeah I do think it'd be worth adding some specific ones. But yeah perhaps let's add those later! > > > > VM_WARN_ON(vmg->vma); > > @@ -930,7 +931,7 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg) > > return NULL; > > > > can_merge_left = can_vma_merge_left(vmg); > > - can_merge_right = can_vma_merge_right(vmg, can_merge_left); > > + can_merge_right = !just_expand && can_vma_merge_right(vmg, can_merge_left); > > > > /* If we can merge with the next VMA, adjust vmg accordingly. */ > > if (can_merge_right) { > > @@ -953,7 +954,11 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg) > > if (can_merge_right && !can_merge_remove_vma(next)) > > vmg->end = end; > > > > - vma_prev(vmg->vmi); /* Equivalent to going to the previous range */ > > + /* In expand-only case we are already positioned at prev. */ > > + if (!just_expand) { > > + /* Equivalent to going to the previous range. */ > > + vma_prev(vmg->vmi); > > + } > > } > > > > /* > > @@ -967,12 +972,14 @@ struct vm_area_struct *vma_merge_new_range(struct vma_merge_struct *vmg) > > } > > > > /* If expansion failed, reset state. Allows us to retry merge later. */ > > - vmg->vma = NULL; > > - vmg->start = start; > > - vmg->end = end; > > - vmg->pgoff = pgoff; > > - if (vmg->vma == prev) > > - vma_iter_set(vmg->vmi, start); > > + if (!just_expand) { > > + vmg->vma = NULL; > > + vmg->start = start; > > + vmg->end = end; > > + vmg->pgoff = pgoff; > > + if (vmg->vma == prev) > > + vma_iter_set(vmg->vmi, start); > > + } > > > > return NULL; > > } > > diff --git a/mm/vma.h b/mm/vma.h > > index 819f994cf727..8c6ecc0dfbf6 100644 > > --- a/mm/vma.h > > +++ b/mm/vma.h > > @@ -59,6 +59,17 @@ enum vma_merge_state { > > VMA_MERGE_SUCCESS, > > }; > > > > +enum vma_merge_flags { > > + VMG_FLAG_DEFAULT = 0, > > + /* > > + * If we can expand, simply do so. We know there is nothing to merge to > > + * the right. Does not reset state upon failure to merge. The VMA > > + * iterator is assumed to be positioned at the previous VMA, rather than > > + * at the gap. > > + */ > > + VMG_FLAG_JUST_EXPAND = 1 << 0, > > +}; > > + > > /* Represents a VMA merge operation. */ > > struct vma_merge_struct { > > struct mm_struct *mm; > > @@ -75,6 +86,7 @@ struct vma_merge_struct { > > struct mempolicy *policy; > > struct vm_userfaultfd_ctx uffd_ctx; > > struct anon_vma_name *anon_name; > > + enum vma_merge_flags merge_flags; > > enum vma_merge_state state; > > }; > > > > @@ -99,6 +111,7 @@ static inline pgoff_t vma_pgoff_offset(struct vm_area_struct *vma, > > .flags = flags_, \ > > .pgoff = pgoff_, \ > > .state = VMA_MERGE_START, \ > > + .merge_flags = VMG_FLAG_DEFAULT, \ > > } > > > > #define VMG_VMA_STATE(name, vmi_, prev_, vma_, start_, end_) \ > > @@ -118,6 +131,7 @@ static inline pgoff_t vma_pgoff_offset(struct vm_area_struct *vma, > > .uffd_ctx = vma_->vm_userfaultfd_ctx, \ > > .anon_name = anon_vma_name(vma_), \ > > .state = VMA_MERGE_START, \ > > + .merge_flags = VMG_FLAG_DEFAULT, \ > > } > > > > #ifdef CONFIG_DEBUG_VM_MAPLE_TREE > > -- > > 2.46.2