Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings

Ryan Roberts <ryan.roberts@xxxxxxx> · Tue, 13 Feb 2024 15:29:24 +0000

On 12/02/2024 16:24, David Hildenbrand wrote:
> On 12.02.24 16:34, Ryan Roberts wrote:
>> On 12/02/2024 15:26, David Hildenbrand wrote:
>>> On 12.02.24 15:45, Ryan Roberts wrote:
>>>> On 12/02/2024 13:54, David Hildenbrand wrote:
>>>>>>> If so, I wonder if we could instead do that comparison modulo the
>>>>>>> access/dirty
>>>>>>> bits,
>>>>>>
>>>>>> I think that would work - but will need to think a bit more on it.
>>>>>>
>>>>>>> and leave ptep_get_lockless() only reading a single entry?
>>>>>>
>>>>>> I think we will need to do something a bit less fragile. ptep_get() does
>>>>>> collect
>>>>>> the access/dirty bits so its confusing if ptep_get_lockless() doesn't
>>>>>> IMHO. So
>>>>>> we will likely want to rename the function and make its documentation
>>>>>> explicit
>>>>>> that it does not return those bits.
>>>>>>
>>>>>> ptep_get_lockless_noyoungdirty()? yuk... Any ideas?
>>>>>>
>>>>>> Of course if I could convince you the current implementation is safe, I
>>>>>> might be
>>>>>> able to sidestep this optimization until a later date?
>>>>>
>>>>> As discussed (and pointed out abive), there might be quite some callsites
>>>>> where
>>>>> we don't really care about uptodate accessed/dirty bits -- where ptep_get() is
>>>>> used nowadays.
>>>>>
>>>>> One way to approach that I had in mind was having an explicit interface:
>>>>>
>>>>> ptep_get()
>>>>> ptep_get_uptodate()
>>>>> ptep_get_lockless()
>>>>> ptep_get_lockless_uptodate()
>>>>
>>>> Yes, I like the direction of this. I guess we anticipate that call sites
>>>> requiring the "_uptodate" variant will be the minority so it makes sense to use
>>>> the current names for the "_not_uptodate" variants? But to do a slow migration,
>>>> it might be better/safer to have the weaker variant use the new name - that
>>>> would allow us to downgrade one at a time?
>>>
>>> Yes, I was primarily struggling with names. Likely it makes sense to either have
>>> two completely new function names, or use the new name only for the "faster but
>>> less precise" variant.
>>>
>>>>
>>>>>
>>>>> Especially the last one might not be needed.
>>>> I've done a scan through the code and agree with Mark's original conclusions.
>>>> Additionally, huge_pte_alloc() (which isn't used for arm64) doesn't rely on
>>>> access/dirty info. So I think I could migrate everything to the weaker variant
>>>> fairly easily.
>>>>
>>>>>
>>>>> Futher, "uptodate" might not be the best choice because of PageUptodate() and
>>>>> friends. But it's better than "youngdirty"/"noyoungdirty" IMHO.
>>>>
>>>> Certainly agree with "noyoungdirty" being a horrible name. How about "_sync" /
>>>> "_nosync"?
>>>
>>> I could live with
>>>
>>> ptep_get_sync()
>>> ptep_get_nosync()
>>>
>>> with proper documentation :)
>>
>> but could you live with:
>>
>> ptep_get()
>> ptep_get_nosync()
>> ptep_get_lockless_nosync()
>>
>> ?
>>
>> So leave the "slower, more precise" version with the existing name.
> 
> Sure.
> 

I'm just implementing this (as a separate RFC), and had an alternative idea for
naming/semantics:

ptep_get()
ptep_get_norecency()
ptep_get_lockless()
ptep_get_lockless_norecency()

The "_norecency" versions explicitly clear the access/dirty bits. This is useful
for the "compare to original pte to check we are not racing" pattern:

pte = ptep_get_lockless_norecency(ptep)
...
<lock>
if (!pte_same(pte, ptep_get_norecency(ptep)))
	// RACE!
...
<unlock>

With the "_nosync" semantic, the access/dirty bits may or may not be set, so the
user has to explicitly clear them to do the comparison. (although I considered a
pte_same_nosync() that would clear the bits for you - but that name is pretty naff).

Although the _norecency semantic requires always explicitly clearing the bits,
so may be infinitesimally slower, it gives a very clear expectation that the
access/dirty bits are always clear and I think that's conveyed well in the name too.

Thoughts?