On 10/26/21 10:44 AM, Nadav Amit wrote:
>> "If software on one logical processor writes to a page while software on
>> another logical processor concurrently clears the R/W flag in the
>> paging-structure entry that maps the page, execution on some processors may
>> result in the entry’s dirty flag being set (due to the write on the first
>> logical processor) and the entry’s R/W flag being clear (due to the update
>> to the entry on the second logical processor). This will never occur on a
>> processor that supports control-flow enforcement technology (CET)”
>>
>> So I guess that this optimization can only be enabled when CET is enabled.
>>
>> :(
> I still wonder whether the SDM comment applies to present bit vs dirty
> bit atomicity as well.

I think it's implicit.  From "4.8 ACCESSED AND DIRTY FLAGS":

	"Whenever there is a write to a linear address, the processor
	 sets the dirty flag (if it is not already set) in the
	 paging-structure entry"

There can't be a "write to a linear address" without a Present=1 PTE.
If it were a Dirty=1,Present=1 PTE, there's no race because there might
not be a write to the PTE at all.

There's also this from the "4.10.4.3 Optional Invalidation" section:

	"no TLB entry or paging-structure cache entry is created with
	 information from a paging-structure entry in which the P flag
	 is 0."

That means that we don't have to worry about the TLB doing something
bonkers like caching a Dirty=1 bit from a Present=0 PTE.

Is that what you were worried about?
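
To make the Present-vs-Dirty reading above concrete, here is a toy
userspace model in C.  It is only a sketch of the behavior being inferred
from the SDM text, not a description of how any CPU or the kernel
implements it.  The function model_write_walk() and the PTE_* macro names
are hypothetical (only the bit positions follow the x86 PTE layout), and
the compare-and-swap loop is an assumption about what "atomic" would mean
here.  The R/W-vs-Dirty race from the quoted SDM paragraph is deliberately
not modeled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions follow the x86 PTE layout; the names are illustrative. */
#define PTE_PRESENT (UINT64_C(1) << 0)
#define PTE_DIRTY   (UINT64_C(1) << 6)

/*
 * Hypothetical model of a write's page walk: Dirty is set only if the
 * entry is still Present at the instant of the update.  Returns false
 * where a real CPU would raise a page fault instead.
 */
static bool model_write_walk(_Atomic uint64_t *pte)
{
	assert(pte != NULL);

	uint64_t old = atomic_load(pte);

	for (;;) {
		if (!(old & PTE_PRESENT))
			return false;	/* would fault; PTE left untouched */
		if (old & PTE_DIRTY)
			return true;	/* already dirty; no write to the PTE */
		/* On failure 'old' is refreshed and the checks are redone. */
		if (atomic_compare_exchange_weak(pte, &old, old | PTE_DIRTY))
			return true;
	}
}
```

Under this model, a PTE that another CPU has just cleared to Present=0 can
never come back with Dirty=1, and a Dirty=1,Present=1 PTE is never written
at all, which is the "no race" case described above.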