Re: [RFC PATCH v3 12/24] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW

Andy Lutomirski <luto@xxxxxxxxxx> · Fri, 31 Aug 2018 10:46:51 -0700

On Thu, Aug 30, 2018 at 11:55 AM, Dave Hansen
<dave.hansen@xxxxxxxxxxxxxxx> wrote:
> On 08/30/2018 10:34 AM, Andy Lutomirski wrote:
>>> But, to keep B's TLB from picking up the entry, I think we can just make
>>> it !Present for a moment.  No TLB can cache it, and I believe the same
>>> "don't set Dirty on a !Writable entry" logic also holds for !Present
>>> (modulo a weird erratum or two).
>> Can we get documentation?  Pretty please?
>
> The accessed bit description in the SDM looks pretty good to me today:
>
>> Whenever the processor uses a paging-structure entry as part of
>> linear-address translation, it sets the accessed flag in that entry
>> (if it is not already set).
> If it's !Present, it can't be used as part of a translation so can't be
> set.  I think that covers the thing I was unsure about.
>
> But, Dirty is a bit, er, muddier, but mostly because it only gets set on
> leaf entries:
>
>> Whenever there is a write to a linear address, the processor sets the
>> dirty flag (if it is not already set) in the paging-structure entry
>> that identifies the final physical address for the linear address
>> (either a PTE or a paging-structure entry in which the PS flag is
>> 1).
>
> That little hunk will definitely need to get updated with something like:
>
>         On processors enumerating support for CET, the processor will only
>         set the dirty flag on paging structure entries in which the W
>         flag is 1.

Can we get something much stronger, perhaps?  Like this:

On processors enumerating support for CET, the processor will write to
the accessed and/or dirty flags atomically, as if using the LOCK
CMPXCHG instruction.  The memory access, any cached entries in any
paging-structure caches, and the values in the paging-structure entry
before and after writing the A and/or D bits will all be consistent.

I'm sure this could be worded better.  The point is that the CPU
should, atomically, load the PTE, check if it allows the access, set A
and/or D appropriately, write the new value to the TLB, and use that
value for the access.  This is clearly a little bit slower than what
old CPUs could do when writing to an already-in-TLB writable non-dirty
entry, but new CPUs are going to have to atomically check the W bit.
(I assume that even old CPUs will *atomically* set the D bit as if by
LOCK BTS, but this is all very vague in the SDM IIRC.)
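
To make the requested ordering concrete, here is a minimal sketch of that
sequence as a C11 software model.  It is not taken from the SDM or from the
kernel, and it only illustrates the "as if using LOCK CMPXCHG" semantics
proposed above.  The function name walk_set_ad() and the PTE_* macros are
hypothetical; the bit positions follow the architectural P, R/W, A and D
bits of an x86 page-table entry.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions match the x86 PTE layout. */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITE    (1ULL << 1)
#define PTE_ACCESSED (1ULL << 5)
#define PTE_DIRTY    (1ULL << 6)

/*
 * Model of one page walk for a single leaf entry.  Returns true and fills
 * *tlb_entry if the access is allowed, false if it would fault.  A and D
 * are only ever set on the exact value that passed the permission check.
 */
static bool walk_set_ad(_Atomic uint64_t *ptep, bool is_write,
                        uint64_t *tlb_entry)
{
    assert(ptep && tlb_entry);

    uint64_t old = atomic_load(ptep);

    for (;;) {
        /* A !Present entry is never used for translation: no A, no D. */
        if (!(old & PTE_PRESENT))
            return false;

        /* A write through a !Writable entry faults without setting D. */
        if (is_write && !(old & PTE_WRITE))
            return false;

        uint64_t updated = old | PTE_ACCESSED | (is_write ? PTE_DIRTY : 0);

        /*
         * Publish A/D only if the entry is still the value we checked.
         * On failure, 'old' is reloaded and the checks run again, so a
         * concurrent write-protect or unmap is never overwritten.
         */
        if (updated == old ||
            atomic_compare_exchange_weak(ptep, &old, updated)) {
            *tlb_entry = updated;   /* the value used for the access */
            return true;
        }
    }
}
```

With this model, a racing ptep_set_wrprotect()-style update that clears W
between the load and the compare-exchange makes the exchange fail, and the
retry then sees the !Writable entry and faults instead of setting D on it.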