> On Jan 4, 2021, at 11:24 AM, Andrea Arcangeli <aarcange@xxxxxxxxxx> wrote:
>
> Hello,
>
> On Mon, Jan 04, 2021 at 01:22:27PM +0100, Peter Zijlstra wrote:
>> On Fri, Dec 25, 2020 at 01:25:28AM -0800, Nadav Amit wrote:
>>
>>> The scenario that happens in selftests/vm/userfaultfd is as follows:
>>>
>>> cpu0                    cpu1                        cpu2
>>> ----                    ----                        ----
>>> [ Writable PTE
>>>   cached in TLB ]
>>>                         userfaultfd_writeprotect()
>>>                         [ write-*unprotect* ]
>>>                         mwriteprotect_range()
>>>                         mmap_read_lock()
>>>                         change_protection()
>>>
>>>                         change_protection_range()
>>>                         ...
>>>                         change_pte_range()
>>>                         [ *clear* “write”-bit ]
>>>                         [ defer TLB flushes ]
>>>                                                     [ page-fault ]
>>>                                                     ...
>>>                                                     wp_page_copy()
>>>                                                      cow_user_page()
>>>                                                       [ copy page ]
>>> [ write to old
>>>   page ]
>>>                                                     ...
>>>                                                      set_pte_at_notify()
>>
>> Yuck!
>
> Note, the above was posted before we figured out the details so it
> wasn't showing the real deferred tlb flush that caused problems (the
> one showed on the left causes zero issues).

Actually, it was posted after (note that this is v2). The scenario that
Peter refers to is the one that I actually encountered (not the second
scenario, which is “theoretical”). It is indeed the more “stupid” of the
two, in the sense that we should simply not write-protect the PTE on
userfaultfd write-unprotect.

Let me know if I made any mistake in the description.

> The problematic one not pictured is the one of the wrprotect that has
> to be running in another CPU which is also isn't picture above. More
> accurate traces are posted later in the thread.

I think I included this scenario as well in the commit log (of v2). Let
me know if I screwed up and the description is not clear.