On Wed, Oct 23, 2024 at 05:01:29PM +0200, Vlastimil Babka wrote: > On 10/22/24 22:40, Lorenzo Stoakes wrote: > > Previously, we'd always try to merge a file-backed VMA if its flags were > > changed by the driver. > > > > This however is rarely meaningful as typically the flags would be changed > > to VM_PFNMAP or other VM_SPECIAL flags which are inherently unmergable. > > > > In cases where it is meaningful (for instance DAX) it is doubtful that this > > Hm if that's true, I'm imagining many piecemeal mmap()s of DAX that used to > merge but now will create tons of VMA's, which doesn't sound great. Then it > has also potentially breaking effects on mremap() which doesn't work accross > multiple VMA's. I said this repeatedly to you and Liam but you both still seemed to want this :)) Anyway, yes. I mean you're pretty crazy if you are mapping a bunch of adjacent DAX ranges that are all otherwise mergeable next to one another, then on that basis assuming that you can mremap() the whole thing. > > > optimisation is worth the effort and maintenance risk of having to unwind > > state and perform a merge. > > What if we simply created a new vma but only then checked if the flags > changed and we can merge it with neighbours (i.e. like the mprotect() > merging case). Less efficient, but less tricky and with the same result > hopefully? I'd probably rather just drop this idea rather than wade into something entirely new, but let me look at whether we can leverage vma_modify_flags(). I have a feeling we can't because we already explicitly reset state in the merge new VMA case, and we'd be introducing a new way in which state could get mangled. But I'll take a look and see, otherwise we can just drop this for now and potentially come back to it later, the key bit of the non-backport patches are 5-7 anyway. > > > Since we've observed bugs and resource leaks due to complexity in this > > area, it is simply not acceptable to have a 'nice to have' optimisation > > like this complicating an already very complicated code path, so let's > > simply eliminate it. > > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > > --- > > mm/vma.c | 39 ++------------------------------------- > > 1 file changed, 2 insertions(+), 37 deletions(-) > > > > diff --git a/mm/vma.c b/mm/vma.c > > index a271e2b406ab..fe1fe5099e78 100644 > > --- a/mm/vma.c > > +++ b/mm/vma.c > > @@ -2260,8 +2260,7 @@ static int __mmap_prepare(struct mmap_state *map) > > return 0; > > } > > > > -static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vma, > > - struct vm_area_struct **mergep) > > +static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vma) > > { > > struct vma_iterator *vmi = map->vmi; > > struct vma_merge_struct *vmg = map->vmg; > > @@ -2291,34 +2290,6 @@ static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vm > > (vma->vm_flags & VM_MAYWRITE)); > > > > vma_iter_config(vmi, vmg->start, vmg->end); > > - /* > > - * If flags changed after mmap_file(), we should try merge > > - * vma again as we may succeed this time. > > - */ > > - if (unlikely(map->flags != vma->vm_flags && vmg->prev)) { > > - struct vm_area_struct *merge; > > - > > - vmg->flags = vma->vm_flags; > > - /* If this fails, state is reset ready for a reattempt. */ > > - merge = vma_merge_new_range(vmg); > > - > > - if (merge) { > > - /* > > - * ->mmap() can change vma->vm_file and fput > > - * the original file. So fput the vma->vm_file > > - * here or we would add an extra fput for file > > - * and cause general protection fault > > - * ultimately. > > - */ > > - fput(vma->vm_file); > > - vm_area_free(vma); > > - vma_iter_free(vmi); > > - *mergep = merge; > > - } else { > > - vma_iter_config(vmi, vmg->start, vmg->end); > > - } > > - } > > - > > map->flags = vma->vm_flags; > > return 0; > > } > > @@ -2341,7 +2312,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > { > > struct vma_iterator *vmi = map->vmi; > > struct vma_merge_struct *vmg = map->vmg; > > - struct vm_area_struct *merge = NULL; > > int error = 0; > > struct vm_area_struct *vma; > > > > @@ -2365,7 +2335,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > } > > > > if (vmg->file) > > - error = __mmap_new_file_vma(map, vma, &merge); > > + error = __mmap_new_file_vma(map, vma); > > else if (map->flags & VM_SHARED) > > error = shmem_zero_setup(vma); > > else > > @@ -2374,9 +2344,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > if (error) > > goto free_iter_vma; > > > > - if (merge) > > - goto file_expanded; > > - > > #ifdef CONFIG_SPARC64 > > /* TODO: Fix SPARC ADI! */ > > WARN_ON_ONCE(!arch_validate_flags(map->flags)); > > @@ -2393,8 +2360,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > * call covers the non-merge case. > > */ > > khugepaged_enter_vma(vma, map->flags); > > - > > -file_expanded: > > ksm_add_vma(vma); > > > > *vmap = vma; > > -- > > 2.47.0 >