On 26/03/2024 16:17, David Hildenbrand wrote: > On 15.02.24 13:17, Ryan Roberts wrote: >> This is an RFC for a series that aims to reduce the cost and complexity of >> ptep_get_lockless() for arm64 when supporting transparent contpte mappings [1]. >> The approach came from discussion with Mark and David [2]. >> >> It introduces a new helper, ptep_get_lockless_norecency(), which allows the >> access and dirty bits in the returned pte to be incorrect. This relaxation >> permits arm64's implementation to just read the single target pte, and avoids >> having to iterate over the full contpte block to gather the access and dirty >> bits, for the contpte case. >> >> It turns out that none of the call sites using ptep_get_lockless() require >> accurate access and dirty bit information, so we can also convert those sites. >> Although a couple of places need care (see patches 2 and 3). >> >> Arguably patch 3 is a bit fragile, given the wide accessibility of >> vmf->orig_pte. So it might make sense to drop this patch and stick to using >> ptep_get_lockless() in the page fault path. I'm keen to hear opinions. > > Yes. Especially as we have these pte_same() checks that might just fail now > because of wrong accessed/dirty bits? Which pte_same() checks are you referring to? I've changed them all to pte_same_norecency() which ignores the access/dirty bits when doing the comparison. > > Likely, we just want to read "the real deal" on both sides of the pte_same() > handling. Sorry I'm not sure I understand? You mean read the full pte including access/dirty? That's the same as dropping the patch, right? Of course if we do that, we still have to keep pte_get_lockless() around for this case. In an ideal world we would convert everything over to ptep_get_lockless_norecency() and delete ptep_get_lockless() to remove the ugliness from arm64. > >> >> I've chosen the name "recency" because it's shortish and somewhat descriptive, >> and is alredy used in a couple of places to mean similar things (see mglru and >> damon). I'm open to other names if anyone has better ideas. > > Not a native speaker; works for me. > >> >> If concensus is that this approach is generally acceptable, I intend to create a >> series in future to do a similar thing with ptep_get() -> ptep_get_norecency(). > > Yes, sounds good to me. >