On Thu, Jan 23, 2025 at 11:12:33PM +0000, Roman Gushchin wrote:

> > +static inline void tlb_free_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
> >  {
> >  	if (tlb->fullmm)
> >  		return;
> >
> >  	/*
> >  	 * VM_PFNMAP is more fragile because the core mm will not track the
> > +	 * page mapcount -- there might not be page-frames for these PFNs
> > +	 * after all.
> > +	 *
> > +	 * Specifically, there is a race between munmap() and
> > +	 * unmap_mapping_range(), where munmap() will unlink the VMA, such
> > +	 * that unmap_mapping_range() will no longer observe the VMA and
> > +	 * no-op, without observing the TLBI, returning prematurely.
> > +	 *
> > +	 * So if we're about to unlink such a VMA, and we have pending
> > +	 * TLBI for such a VMA, flush things now.
> >  	 */
> > +	if ((vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP)) && tlb->vma_pfn)
> >  		tlb_flush_mmu_tlbonly(tlb);
>
> Why do we need to re-check vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP) here?

No need, but an opportunity.

> In free_pgtables() we're iterating over multiple vma's. What if the first has
> no VM_PFNMAP set, but some others do? Idk if it's even possible, but it's not
> obvious that it's not possible either.

If we only need to flush PFN entries before unlinking PFN VMAs, then:

 - if there are no PFNs pending (vma_pfn), we don't need to flush;

 - if no PFN VMA is being freed (vm_flags), we don't need to flush.

Notably, if an earlier flush has already issued the TLBI, there is no
need to issue one again; and if we end up not actually freeing a PFN
VMA, we don't care either.
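
For what it's worth, the interplay is easy to model in userspace. Below
is a minimal standalone sketch, not kernel code: mmu_gather is reduced
to a pending-range flag plus vma_pfn, tlb_start_vma()/tlb_free_vma()
only approximate the patch above, and the vma list is made up for
illustration. It shows that the re-check covers the mixed-VMA case
Roman asks about while never issuing a redundant TLBI.

/*
 * Standalone model of the tlb->vma_pfn / vm_flags interaction in
 * free_pgtables(). "Flushing" is a printf.
 */
#include <stdbool.h>
#include <stdio.h>

#define VM_PFNMAP       0x1UL
#define VM_MIXEDMAP     0x2UL

struct vm_area_struct { unsigned long vm_flags; const char *name; };

struct mmu_gather {
        bool fullmm;
        bool pending;   /* stands in for tlb->start != tlb->end */
        bool vma_pfn;   /* set once any PFN vma has been unmapped */
};

/* Issues the TLBI only if a range is pending, then resets it. */
static void tlb_flush_mmu_tlbonly(struct mmu_gather *tlb)
{
        if (!tlb->pending)
                return;
        printf("  TLBI issued\n");
        tlb->pending = false;
}

/* Called while unmapping a vma; a PFN vma marks vma_pfn. */
static void tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
{
        if (vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP))
                tlb->vma_pfn = true;
        tlb->pending = true;    /* pretend we unmapped some PTEs */
}

/* Called when free_pgtables() is about to unlink @vma. */
static void tlb_free_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
{
        if (tlb->fullmm)
                return;
        /* Flush only when both conditions from the mail hold. */
        if ((vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)) && tlb->vma_pfn)
                tlb_flush_mmu_tlbonly(tlb);
}

int main(void)
{
        struct vm_area_struct vmas[] = {
                { 0,            "anon"  },      /* fails vm_flags test */
                { VM_PFNMAP,    "pfn-1" },      /* pending + PFN: flush */
                { VM_PFNMAP,    "pfn-2" },      /* nothing pending: no-op */
        };
        struct mmu_gather tlb = { .fullmm = false };
        unsigned int i;

        for (i = 0; i < sizeof(vmas) / sizeof(vmas[0]); i++)
                tlb_start_vma(&tlb, &vmas[i]);

        for (i = 0; i < sizeof(vmas) / sizeof(vmas[0]); i++) {
                printf("unlink %s:\n", vmas[i].name);
                tlb_free_vma(&tlb, &vmas[i]);
        }
        return 0;
}

Running it, only the unlink of pfn-1 issues a TLBI: the anon vma fails
the vm_flags check even though vma_pfn is already set, and pfn-2 passes
both checks but finds nothing pending, so the flush no-ops.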