Hi, Nadav, On Sun, Dec 20, 2020 at 12:06:38AM -0800, Nadav Amit wrote: [...] > So to correct myself, I think that what I really encountered was actually > during MM_CP_UFFD_WP_RESOLVE (i.e., when the protection is removed). The > problem was that in this case the “write”-bit was removed during unprotect. > Sorry for the strange formatting to fit within 80 columns: I assume I can ignore the race mentioned in the commit message but only refer to this one below. However I'm still confused. Please see below. > > > [ Start: PTE is writable ] > > cpu0 cpu1 cpu2 > ---- ---- ---- > [ Writable PTE > cached in TLB ] Here cpu2 got writable pte in tlb. But why? If below is an unprotect, it means it must have been protected once by userfaultfd, right? If so, the previous change_protection_range() which did the wr-protect should have done a tlb flush already before it returns (since pages>0 - we protected one pte at least). Then I can't see why cpu2 tlb has stall data. If I assume cpu2 doesn't have that cached tlb, then "write to old page" won't happen either, because cpu1/cpu2 will all go through the cow path and pgtable lock should serialize them. > userfaultfd_writeprotect() > [ write-*unprotect* ] > mwriteprotect_range() > mmap_read_lock() > change_protection() > > change_protection_range() > ... > change_pte_range() > [ *clear* “write”-bit ] > [ defer TLB flushes] > [ page-fault ] > … > wp_page_copy() > cow_user_page() > [ copy page ] > [ write to old > page ] > … > set_pte_at_notify() > > [ End: cpu2 write not copied form old to new page. ] Could you share how to reproduce the problem? I would be glad to give it a shot as well. > [1] https://lore.kernel.org/patchwork/patch/1346386 PS: Sorry to not have read the other series of yours. It seems to need some chunk of time so I postponed it a bit due to other things; but I'll read at least the fixes very soon. Thanks, -- Peter Xu