On Wed, Oct 26, 2022 at 06:45:16PM +0200, Jann Horn wrote:

> >  #endif /* _LINUX_MM_H */
> > diff --git a/mm/memory.c b/mm/memory.c
> > index f88c351aecd4..9bb63b3fbee1 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1440,6 +1440,11 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
> >  			tlb_remove_tlb_entry(tlb, pte, addr);
> >  			zap_install_uffd_wp_if_needed(vma, addr, pte, details,
> >  						      ptent);
> > +
> > +			if (!force_flush && !tlb->fullmm && details &&
> > +			    details->zap_flags & ZAP_FLAG_FORCE_FLUSH)
> > +				force_flush = 1;
> > +
>
> Hmm... I guess that might work, assuming that there is no other
> codepath we might race with that first turns the present PTE into a
> non-present PTE but keeps the flush queued for later. At least
> codepaths that use the tlb_batched infrastructure are unproblematic...

So I thought the general rule was that if you modify a PTE and have not
unmapped things -- IOW, there's actual concurrency possible on the
thing, then the TLB invalidate needs to happen under pte_lock, since
that is what controls concurrency at the pte level.

As it stands MADV_DONTNEED seems to blatantly violate that general rule.

Then again; I could've missed something and the rules changed?
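
To make the general rule above concrete, here is a toy user-space model of
the two orderings. It is only a sketch and not kernel code: every name in it
(toy_mm, toy_zap_flush_under_lock, toy_zap_flush_deferred, ...) is made up
for illustration, the "lock" is a plain flag, and the "TLB" is a single
boolean. It assumes nothing about what MADV_DONTNEED really does; it only
shows why the position of the invalidate relative to the unlock matters.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one PTE, one cached translation, one page-table lock. */
struct toy_mm {
	bool pte_present;	/* the page-table entry itself */
	bool tlb_cached;	/* some CPU still caches the translation */
	bool ptl_held;		/* stands in for the pte lock (hypothetical) */
};

static void toy_lock(struct toy_mm *mm)
{
	assert(!mm->ptl_held);
	mm->ptl_held = true;
}

static void toy_unlock(struct toy_mm *mm)
{
	assert(mm->ptl_held);
	mm->ptl_held = false;
}

/* Stale means: the PTE is gone but a cached translation survives. */
static bool toy_tlb_is_stale(const struct toy_mm *mm)
{
	return !mm->pte_present && mm->tlb_cached;
}

/*
 * Clear the PTE and invalidate before dropping the lock.
 * Returns whether a stale translation is observable right after unlock.
 */
static bool toy_zap_flush_under_lock(struct toy_mm *mm)
{
	toy_lock(mm);
	mm->pte_present = false;
	mm->tlb_cached = false;		/* invalidate while the lock is held */
	toy_unlock(mm);
	return toy_tlb_is_stale(mm);
}

/*
 * Clear the PTE, drop the lock, and only invalidate afterwards.
 * Returns whether a stale translation is observable right after unlock.
 */
static bool toy_zap_flush_deferred(struct toy_mm *mm)
{
	bool stale_at_unlock;

	toy_lock(mm);
	mm->pte_present = false;
	toy_unlock(mm);
	/* Anyone taking the lock now sees a cleared PTE with a live TLB entry. */
	stale_at_unlock = toy_tlb_is_stale(mm);
	mm->tlb_cached = false;		/* the deferred invalidate */
	return stale_at_unlock;
}
```

Starting from a present PTE with a cached translation, the first helper
reports no stale entry at unlock time and the second reports one, which is
the window the rule is meant to close.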