On 12/02/2024 15:26, David Hildenbrand wrote: > On 12.02.24 15:45, Ryan Roberts wrote: >> On 12/02/2024 13:54, David Hildenbrand wrote: >>>>> If so, I wonder if we could instead do that comparison modulo the access/dirty >>>>> bits, >>>> >>>> I think that would work - but will need to think a bit more on it. >>>> >>>>> and leave ptep_get_lockless() only reading a single entry? >>>> >>>> I think we will need to do something a bit less fragile. ptep_get() does >>>> collect >>>> the access/dirty bits so its confusing if ptep_get_lockless() doesn't IMHO. So >>>> we will likely want to rename the function and make its documentation explicit >>>> that it does not return those bits. >>>> >>>> ptep_get_lockless_noyoungdirty()? yuk... Any ideas? >>>> >>>> Of course if I could convince you the current implementation is safe, I >>>> might be >>>> able to sidestep this optimization until a later date? >>> >>> As discussed (and pointed out abive), there might be quite some callsites where >>> we don't really care about uptodate accessed/dirty bits -- where ptep_get() is >>> used nowadays. >>> >>> One way to approach that I had in mind was having an explicit interface: >>> >>> ptep_get() >>> ptep_get_uptodate() >>> ptep_get_lockless() >>> ptep_get_lockless_uptodate() >> >> Yes, I like the direction of this. I guess we anticipate that call sites >> requiring the "_uptodate" variant will be the minority so it makes sense to use >> the current names for the "_not_uptodate" variants? But to do a slow migration, >> it might be better/safer to have the weaker variant use the new name - that >> would allow us to downgrade one at a time? > > Yes, I was primarily struggling with names. Likely it makes sense to either have > two completely new function names, or use the new name only for the "faster but > less precise" variant. > >> >>> >>> Especially the last one might not be needed. >> I've done a scan through the code and agree with Mark's original conclusions. >> Additionally, huge_pte_alloc() (which isn't used for arm64) doesn't rely on >> access/dirty info. So I think I could migrate everything to the weaker variant >> fairly easily. >> >>> >>> Futher, "uptodate" might not be the best choice because of PageUptodate() and >>> friends. But it's better than "youngdirty"/"noyoungdirty" IMHO. >> >> Certainly agree with "noyoungdirty" being a horrible name. How about "_sync" / >> "_nosync"? > > I could live with > > ptep_get_sync() > ptep_get_nosync() > > with proper documentation :) but could you live with: ptep_get() ptep_get_nosync() ptep_get_lockless_nosync() ? So leave the "slower, more precise" version with the existing name. > > I don't think we use "_sync" / "_nosync" in the context of pte operations yet. > > Well, there seems to be "__arm_v7s_pte_sync" in iommu code, bit at least in core > code nothing jumped at me. >