On Thu, 21 Oct 2021, Nadav Amit wrote:
> From: Nadav Amit <namit@xxxxxxxxxx>
> 
> Consistent use of the mmu_gather interface requires a call to
> tlb_start_vma() and tlb_end_vma() for each VMA. free_pgtables() does
> not follow this pattern.
> 
> Certain architectures need tlb_start_vma() to be called in order for
> tlb_update_vma_flags() to update the VMA flags (tlb->vma_exec and
> tlb->vma_huge), which are later used for the proper TLB flush to be
> issued. Since tlb_start_vma() is not called, this can lead to the
> wrong VMA flags being used when the flush is performed.
> 
> Specifically, the munmap syscall would call unmap_region(), which
> unmaps the VMAs and then frees the page-tables. A flush is needed
> after the page-tables are removed to prevent page-walk caches from
> holding stale entries, but this flush would use the VMA flags of the
> last VMA that was flushed. This does not appear to be right.
> 
> Use tlb_start_vma() and tlb_end_vma() to prevent this from happening.
> This might lead to unnecessary calls to flush_cache_range() on
> certain arch's. If needed, a new flag can be added to mmu_gather to
> indicate that the flush is not needed.
> 
> Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andy Lutomirski <luto@xxxxxxxxxx>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> Cc: Nick Piggin <npiggin@xxxxxxxxx>
> Cc: x86@xxxxxxxxxx
> Signed-off-by: Nadav Amit <namit@xxxxxxxxxx>
> ---
>  mm/memory.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index 12a7b2094434..056fbfdd3c1f 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -412,6 +412,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  		unlink_anon_vmas(vma);
>  		unlink_file_vma(vma);
>  
> +		tlb_start_vma(tlb, vma);
> +
>  		if (is_vm_hugetlb_page(vma)) {
>  			hugetlb_free_pgd_range(tlb, addr, vma->vm_end,
>  				floor, next ? next->vm_start : ceiling);
> @@ -429,6 +431,8 @@ void free_pgtables(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			free_pgd_range(tlb, addr, vma->vm_end,
>  				floor, next ? next->vm_start : ceiling);
>  		}
> +
> +		tlb_end_vma(tlb, vma);
>  		vma = next;
>  	}
>  }
> -- 
> 2.25.1

No. This is an experiment to see whether reviewers look at a wider
context than is seen in the patch itself?

Let's take a look:

		tlb_start_vma(tlb, vma);

		if (is_vm_hugetlb_page(vma)) {
			hugetlb_free_pgd_range(tlb, addr, vma->vm_end,
				floor, next ? next->vm_start : ceiling);
		} else {
			/*
			 * Optimization: gather nearby vmas into one call down
			 */
			while (next && next->vm_start <= vma->vm_end + PMD_SIZE
			       && !is_vm_hugetlb_page(next)) {
				vma = next;
				next = vma->vm_next;
				unlink_anon_vmas(vma);
				unlink_file_vma(vma);
			}
			free_pgd_range(tlb, addr, vma->vm_end,
				floor, next ? next->vm_start : ceiling);
		}

		tlb_end_vma(tlb, vma);
		vma = next;

So, the vma may well have changed in between the new tlb_start_vma()
and tlb_end_vma(): which defeats the intent of the patch.
And I doubt that optimization should be dropped to suit the patch:
you admit to limited understanding of those architectures which need
tlb_start_vma(), me too; but they seem to have survived many years
without it there in free_pgtables(), and I think that tlb_start_vma()
is for when freeing pages, not for when freeing page tables. Surely
all architectures have to accommodate the fact that a single page
table can be occupied by many different kinds of vma.

(Sorry, I'm being totally unresponsive at present, needing to focus
on something else; but thought I'd better get these comments in
before mmotm's mm-use-correct-vma-flags-when-freeing-page-tables.patch
goes to 5.16.)

Hugh