* Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> [241023 13:39]:
> On Wed, Oct 23, 2024 at 11:21:54AM -0400, Liam R. Howlett wrote:
> > * Vlastimil Babka <vbabka@xxxxxxx> [241023 10:39]:
> > > On 10/22/24 22:40, Lorenzo Stoakes wrote:
> > > > We have seen bugs and resource leaks arise from the complexity of the
> > > > __mmap_region() function. This, and the generally deeply fragile error
> > > > handling logic and complexity which makes understanding the function
> > > > difficult make it highly desirable to refactor it into something readable.
> > > >
> > > > Achieve this by separating the function into smaller logical parts which
> > > > are easier to understand and follow, and which importantly very
> > > > significantly simplify the error handling.
> > > >
> > > > Note that we now call vms_abort_munmap_vmas() in more error paths than we
> > > > used to, however in cases where no abort need occur, vms->nr_pages will be
> > > > equal to zero and we simply exit this function without doing more than we
> > > > would have done previously.
> > > >
> > > > Importantly, the invocation of the driver mmap hook via mmap_file() now has
> > > > very simple and obvious handling (this was previously the most problematic
> > > > part of the mmap() operation).
> > > >
> > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> > > > ---
> > > >  mm/vma.c | 380 +++++++++++++++++++++++++++++++++++--------------------
> > > >  1 file changed, 240 insertions(+), 140 deletions(-)
> > > >
> > > > diff --git a/mm/vma.c b/mm/vma.c
> > > > index 7617f9d50d62..a271e2b406ab 100644
> > > > --- a/mm/vma.c
> > > > +++ b/mm/vma.c
> > > > @@ -7,6 +7,31 @@
> > > >  #include "vma_internal.h"
> > > >  #include "vma.h"
> > > >
> > > > +struct mmap_state {
> > > > +	struct mm_struct *mm;
> > > > +	struct vma_iterator *vmi;
> > > > +	struct vma_merge_struct *vmg;
> > > > +	struct list_head *uf;
> > > > +
> > > > +	struct vma_munmap_struct vms;
> > > > +	struct ma_state mas_detach;
> > > > +	struct maple_tree mt_detach;
> > > > +
> > > > +	unsigned long flags;
> > > > +	unsigned long pglen;
> > > > +	unsigned long charged;
> > > > +};
> > > > +
> > > > +#define MMAP_STATE(name, mm_, vmi_, vmg_, uf_, flags_, len_) \
> > > > +	struct mmap_state name = { \
> > > > +		.mm = mm_, \
> > > > +		.vmi = vmi_, \
> > > > +		.vmg = vmg_, \
> > > > +		.uf = uf_, \
> > > > +		.flags = flags_, \
> > > > +		.pglen = PHYS_PFN(len_), \
> > > > +	}
> > > > +
> > > >  static inline bool is_mergeable_vma(struct vma_merge_struct *vmg, bool merge_next)
> > > >  {
> > > >  	struct vm_area_struct *vma = merge_next ? vmg->next : vmg->prev;
> > > > @@ -2169,189 +2194,247 @@ static void vms_abort_munmap_vmas(struct vma_munmap_struct *vms,
> > > >  	vms_complete_munmap_vmas(vms, mas_detach);
> > > >  }
> > > >
> > > > -unsigned long __mmap_region(struct file *file, unsigned long addr,
> > > > -		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
> > > > -		struct list_head *uf)
> > > > +/*
> > > > + * __mmap_prepare() - Prepare to gather any overlapping VMAs that need to be
> > > > + * unmapped once the map operation is completed, check limits,
> > > > + * account mapping and clean up any pre-existing VMAs.
> > > > + *
> > > > + * @map: Mapping state.
> > > > + *
> > > > + * Returns: 0 on success, error code otherwise.
> > > > + */
> > > > +static int __mmap_prepare(struct mmap_state *map)
> > > >  {
> > > > -	struct mm_struct *mm = current->mm;
> > > > -	struct vm_area_struct *vma = NULL;
> > > > -	pgoff_t pglen = PHYS_PFN(len);
> > > > -	unsigned long charged = 0;
> > > > -	struct vma_munmap_struct vms;
> > > > -	struct ma_state mas_detach;
> > > > -	struct maple_tree mt_detach;
> > > > -	unsigned long end = addr + len;
> > > >  	int error;
> > > > -	VMA_ITERATOR(vmi, mm, addr);
> > > > -	VMG_STATE(vmg, mm, &vmi, addr, end, vm_flags, pgoff);
> > > > -
> > > > -	vmg.file = file;
> > > > -	/* Find the first overlapping VMA */
> > > > -	vma = vma_find(&vmi, end);
> > > > -	init_vma_munmap(&vms, &vmi, vma, addr, end, uf, /* unlock = */ false);
> > > > -	if (vma) {
> > > > -		mt_init_flags(&mt_detach, vmi.mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
> > > > -		mt_on_stack(mt_detach);
> > > > -		mas_init(&mas_detach, &mt_detach, /* addr = */ 0);
> > > > +	struct vma_iterator *vmi = map->vmi;
> > > > +	struct vma_merge_struct *vmg = map->vmg;
> > > > +	struct vma_munmap_struct *vms = &map->vms;
> > > > +
> > > > +	/* Find the first overlapping VMA and initialise unmap state. */
> > > > +	vms->vma = vma_find(vmi, vmg->end);
> > > > +	init_vma_munmap(vms, vmi, vms->vma, vmg->start, vmg->end, map->uf,
> > > > +			/* unlock = */ false);
> > > > +
> > > > +	/* OK, we have overlapping VMAs - prepare to unmap them. */
> > > > +	if (vms->vma) {
> > > > +		mt_init_flags(&map->mt_detach, vmi->mas.tree->ma_flags & MT_FLAGS_LOCK_MASK);
> > > > +		mt_on_stack(map->mt_detach);
> > > > +		mas_init(&map->mas_detach, &map->mt_detach, /* addr = */ 0);
> > > >  		/* Prepare to unmap any existing mapping in the area */
> > > > -		error = vms_gather_munmap_vmas(&vms, &mas_detach);
> > > > +		error = vms_gather_munmap_vmas(vms, &map->mas_detach);
> > > >  		if (error)
> > > > -			goto gather_failed;
> > > > +			return error;
> > >
> > > So this assumes vms_abort_munmap_vmas() will rely on the "vms->nr_pages will
> > > be equal to zero" mentioned in commit log. But AFAICS
> > > vms_gather_munmap_vmas() can fail in Nth iteration of its
> > > for_each_vma_range() after some iterations already increased nr_pages and it
> > > will do a reattach_vmas() and return the error and we just return the error
> > > here.
> > > I think either here or maybe in vms_gather_munmap_vmas() itself a reset of
> > > vms->nr_pages to zero on error should happen for the vms_abort_munmap_vmas()
> > > to be a no-op?
> >
> > Probably in reattach_vmas()?
>
> Hm, but that only accepts a mas and seems redundant elsewhere... am going for
> simply resetting nr_pages for now and maybe we can revisit if needs be?

Okay.
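
For anyone following along, the nr_pages point can be shown with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: model_vms, model_gather() and model_abort() are hypothetical stand-ins for struct vma_munmap_struct, vms_gather_munmap_vmas() and vms_abort_munmap_vmas(), and the failure is injected at a chosen iteration rather than coming from a real allocation or split.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vma_munmap_struct: only the field discussed. */
struct model_vms {
	unsigned long nr_pages;
};

/*
 * Models a gather loop that accounts pages per overlapping VMA.
 * The loop fails when it reaches index fail_at (0-based); pass a
 * negative fail_at to model a gather that succeeds.
 */
static int model_gather(struct model_vms *vms, const unsigned long *vma_pages,
			int nr_vmas, int fail_at)
{
	assert(vms != NULL && vma_pages != NULL);

	for (int i = 0; i < nr_vmas; i++) {
		if (i == fail_at)
			goto fail;
		vms->nr_pages += vma_pages[i];
	}
	return 0;

fail:
	/*
	 * Earlier iterations already bumped nr_pages. The real code would
	 * reattach the gathered VMAs here; this model only does the reset
	 * proposed in the thread so a later abort sees nothing to undo.
	 */
	vms->nr_pages = 0;
	return -ENOMEM;
}

/* Models the abort path: returns 0 when it is a no-op, 1 when it has work. */
static int model_abort(const struct model_vms *vms)
{
	if (!vms->nr_pages)
		return 0;
	return 1;
}
```

Without the reset in the fail path, a failure on the third of three VMAs would leave nr_pages non-zero and model_abort() would try to undo work that the gather step had already undone.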