On Thu, Jul 04, 2024 at 02:27:13PM GMT, Liam R. Howlett wrote:
> From: "Liam R. Howlett" <Liam.Howlett@xxxxxxxxxx>
>
> Set the start and end address for munmap when the prev and next are
> gathered. This is needed to avoid incorrect addresses being used during
> the vms_complete_munmap_vmas() function if the prev/next vma are
> expanded.

When we spoke about this separately you mentioned that specific arches
may be more likely to encounter this issue, perhaps worth mentioning
something about that in the commit msg? Unless I misunderstood you.

> Add a new helper vms_complete_pte_clear(), which is needed later and
> will avoid growing the argument list to unmap_region() beyond the 9 it
> already has.

My word.

> Signed-off-by: Liam R. Howlett <Liam.Howlett@xxxxxxxxxx>
> ---
>  mm/internal.h |  2 ++
>  mm/mmap.c     | 34 +++++++++++++++++++++++++++-------
>  2 files changed, 29 insertions(+), 7 deletions(-)
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 8cbbbe7d40f3..4c9f06669cc4 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1493,6 +1493,8 @@ struct vma_munmap_struct {
>  	struct list_head *uf;		/* Userfaultfd list_head */
>  	unsigned long start;		/* Aligned start addr */
>  	unsigned long end;		/* Aligned end addr */
> +	unsigned long unmap_start;
> +	unsigned long unmap_end;
>  	int vma_count;			/* Number of vmas that will be removed */
>  	unsigned long nr_pages;		/* Number of pages being removed */
>  	unsigned long locked_vm;	/* Number of locked pages */
> diff --git a/mm/mmap.c b/mm/mmap.c
> index ecf55d32e804..45443a53be76 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -525,6 +525,8 @@ static inline void init_vma_munmap(struct vma_munmap_struct *vms,
>  	vms->vma_count = 0;
>  	vms->nr_pages = vms->locked_vm = vms->nr_accounted = 0;
>  	vms->exec_vm = vms->stack_vm = vms->data_vm = 0;
> +	vms->unmap_start = FIRST_USER_ADDRESS;
> +	vms->unmap_end = USER_PGTABLES_CEILING;
>  }
>
>  /*
> @@ -2610,6 +2612,26 @@ static inline void abort_munmap_vmas(struct ma_state *mas_detach)
>  	__mt_destroy(mas_detach->tree);
>  }
>
> +
> +static void vms_complete_pte_clear(struct vma_munmap_struct *vms,
> +		struct ma_state *mas_detach, bool mm_wr_locked)
> +{
> +	struct mmu_gather tlb;
> +
> +	/*
> +	 * We can free page tables without write-locking mmap_lock because VMAs
> +	 * were isolated before we downgraded mmap_lock.
> +	 */
> +	mas_set(mas_detach, 1);
> +	lru_add_drain();
> +	tlb_gather_mmu(&tlb, vms->mm);
> +	update_hiwater_rss(vms->mm);
> +	unmap_vmas(&tlb, mas_detach, vms->vma, vms->start, vms->end, vms->vma_count, mm_wr_locked);
> +	mas_set(mas_detach, 1);

I know it's necessary as unmap_vmas() will adjust mas_detach, but it
kind of aesthetically sucks to set it to 1, do some stuff, then set it
to 1 again. But this is not a big deal :>)

> +	free_pgtables(&tlb, mas_detach, vms->vma, vms->unmap_start, vms->unmap_end, mm_wr_locked);

Yeah this bit definitely needs a comment I think, this is very
confusing indeed.

Under what circumstances will these differ from [vms->start, vms->end),
etc.? I'm guessing it's to do with !vms->prev and !vms->next needing to
be set to [FIRST_USER_ADDRESS, USER_PGTABLES_CEILING)?

> +	tlb_finish_mmu(&tlb);
> +}
> +
>  /*
>   * vms_complete_munmap_vmas() - Finish the munmap() operation
>   * @vms: The vma munmap struct
> @@ -2631,13 +2653,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
>  	if (vms->unlock)
>  		mmap_write_downgrade(mm);
>
> -	/*
> -	 * We can free page tables without write-locking mmap_lock because VMAs
> -	 * were isolated before we downgraded mmap_lock.
> -	 */
> -	mas_set(mas_detach, 1);
> -	unmap_region(mm, mas_detach, vms->vma, vms->prev, vms->next,
> -		     vms->start, vms->end, vms->vma_count, !vms->unlock);
> +	vms_complete_pte_clear(vms, mas_detach, !vms->unlock);
>  	/* Update high watermark before we lower total_vm */
>  	update_hiwater_vm(mm);
>  	/* Stat accounting */
> @@ -2699,6 +2715,8 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
>  		goto start_split_failed;
>  	}
>  	vms->prev = vma_prev(vms->vmi);
> +	if (vms->prev)
> +		vms->unmap_start = vms->prev->vm_end;
>
>  	/*
>  	 * Detach a range of VMAs from the mm. Using next as a temp variable as
> @@ -2757,6 +2775,8 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
>  	}
>
>  	vms->next = vma_next(vms->vmi);
> +	if (vms->next)
> +		vms->unmap_end = vms->next->vm_start;
>
>  #if defined(CONFIG_DEBUG_VM_MAPLE_TREE)
>  	/* Make sure no VMAs are about to be lost. */
> --
> 2.43.0
>

Other than wanting some extra comments, this looks fine and I know how
hard-won the unmap range bit of this change was so:

Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>