* Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> [241023 11:16]: > On Wed, Oct 23, 2024 at 05:01:29PM +0200, Vlastimil Babka wrote: > > On 10/22/24 22:40, Lorenzo Stoakes wrote: > > > Previously, we'd always try to merge a file-backed VMA if its flags were > > > changed by the driver. > > > > > > This however is rarely meaningful as typically the flags would be changed > > > to VM_PFNMAP or other VM_SPECIAL flags which are inherently unmergable. > > > > > > In cases where it is meaningful (for instance DAX) it is doubtful that this > > > > Hm if that's true, I'm imagining many piecemeal mmap()s of DAX that used to > > merge but now will create tons of VMA's, which doesn't sound great. Then it > > has also potentially breaking effects on mremap() which doesn't work accross > > multiple VMA's. > > I said this repeatedly to you and Liam but you both still seemed to want > this :)) > > Anyway, yes. I mean you're pretty crazy if you are mapping a bunch of > adjacent DAX ranges that are all otherwise mergeable next to one another, > then on that basis assuming that you can mremap() the whole thing. Today, this will only work if there is a previous vma, otherwise we skip the attempt to merge. It's probably a safe bet that there is a previous vma, but it still means relying on this merging is broken today. > > > > > > optimisation is worth the effort and maintenance risk of having to unwind > > > state and perform a merge. > > > > What if we simply created a new vma but only then checked if the flags > > changed and we can merge it with neighbours (i.e. like the mprotect() > > merging case). Less efficient, but less tricky and with the same result > > hopefully? > > I'd probably rather just drop this idea rather than wade into something > entirely new, but let me look at whether we can leverage > vma_modify_flags(). > > I have a feeling we can't because we already explicitly reset state in the > merge new VMA case, and we'd be introducing a new way in which state could > get mangled. > > But I'll take a look and see, otherwise we can just drop this for now and > potentially come back to it later, the key bit of the non-backport patches > are 5-7 anyway. > > > > > > Since we've observed bugs and resource leaks due to complexity in this > > > area, it is simply not acceptable to have a 'nice to have' optimisation > > > like this complicating an already very complicated code path, so let's > > > simply eliminate it. > > > > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> > > > --- > > > mm/vma.c | 39 ++------------------------------------- > > > 1 file changed, 2 insertions(+), 37 deletions(-) > > > > > > diff --git a/mm/vma.c b/mm/vma.c > > > index a271e2b406ab..fe1fe5099e78 100644 > > > --- a/mm/vma.c > > > +++ b/mm/vma.c > > > @@ -2260,8 +2260,7 @@ static int __mmap_prepare(struct mmap_state *map) > > > return 0; > > > } > > > > > > -static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vma, > > > - struct vm_area_struct **mergep) > > > +static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vma) > > > { > > > struct vma_iterator *vmi = map->vmi; > > > struct vma_merge_struct *vmg = map->vmg; > > > @@ -2291,34 +2290,6 @@ static int __mmap_new_file_vma(struct mmap_state *map, struct vm_area_struct *vm > > > (vma->vm_flags & VM_MAYWRITE)); > > > > > > vma_iter_config(vmi, vmg->start, vmg->end); > > > - /* > > > - * If flags changed after mmap_file(), we should try merge > > > - * vma again as we may succeed this time. > > > - */ > > > - if (unlikely(map->flags != vma->vm_flags && vmg->prev)) { > > > - struct vm_area_struct *merge; > > > - > > > - vmg->flags = vma->vm_flags; > > > - /* If this fails, state is reset ready for a reattempt. */ > > > - merge = vma_merge_new_range(vmg); > > > - > > > - if (merge) { > > > - /* > > > - * ->mmap() can change vma->vm_file and fput > > > - * the original file. So fput the vma->vm_file > > > - * here or we would add an extra fput for file > > > - * and cause general protection fault > > > - * ultimately. > > > - */ > > > - fput(vma->vm_file); > > > - vm_area_free(vma); > > > - vma_iter_free(vmi); > > > - *mergep = merge; > > > - } else { > > > - vma_iter_config(vmi, vmg->start, vmg->end); > > > - } > > > - } > > > - > > > map->flags = vma->vm_flags; > > > return 0; > > > } > > > @@ -2341,7 +2312,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > > { > > > struct vma_iterator *vmi = map->vmi; > > > struct vma_merge_struct *vmg = map->vmg; > > > - struct vm_area_struct *merge = NULL; > > > int error = 0; > > > struct vm_area_struct *vma; > > > > > > @@ -2365,7 +2335,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > > } > > > > > > if (vmg->file) > > > - error = __mmap_new_file_vma(map, vma, &merge); > > > + error = __mmap_new_file_vma(map, vma); > > > else if (map->flags & VM_SHARED) > > > error = shmem_zero_setup(vma); > > > else > > > @@ -2374,9 +2344,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > > if (error) > > > goto free_iter_vma; > > > > > > - if (merge) > > > - goto file_expanded; > > > - > > > #ifdef CONFIG_SPARC64 > > > /* TODO: Fix SPARC ADI! */ > > > WARN_ON_ONCE(!arch_validate_flags(map->flags)); > > > @@ -2393,8 +2360,6 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap) > > > * call covers the non-merge case. > > > */ > > > khugepaged_enter_vma(vma, map->flags); > > > - > > > -file_expanded: > > > ksm_add_vma(vma); > > > > > > *vmap = vma; > > > -- > > > 2.47.0 > >