On Fri, May 19, 2023 at 3:12 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote: > > On 18/05/2023 20:28, Yu Zhao wrote: > > On Thu, May 18, 2023 at 5:07 AM Ryan Roberts <ryan.roberts@xxxxxxx> wrote: > >> > >> There are many call sites that directly dereference a pte_t pointer. > >> This makes it very difficult to properly encapsulate a page table in the > >> arch code without having to allocate shadow page tables. ptep_deref() > >> aims to solve this by replacing all direct dereferences with a call to > >> this function. > >> > >> The default implementation continues to just dereference the pointer > >> (*ptep), so generated code should be exactly the same. However, it is > >> possible for the architecture to override the default with their own > >> implementation, that can (e.g.) hide certain bits from the core code, or > >> determine young/dirty status by mixing in state from another source. > >> > >> While ptep_get() and ptep_get_lockless() already exist, these are > >> implemented as atomic accesses (e.g. READ_ONCE() in the default case). > >> So rather than using ptep_get() and risking performance regressions, > >> introduce an new variant. > > > > We should reuse ptep_get(): > > 1. I don't think READ_ONCE() can cause measurable regressions in this case. > > 2. It's technically wrong without it. > > Can you clarify what you mean by technically wrong? Are you saying that the > current code that does direct dereferencing is buggy? Sorry for not being clear. I think we can agree that *ptep is volatile. Not being treated as such seems a bad idea to me. I don't think it'd cause any real problems -- most warnings KCSAN reported didn't either, but we fixed them anyway. So should we fix this case as well while we are at it. > I previously convinced myself that the potential for the compiler generating > multiple loads was safe because the code in question is under the PTL so there > are no concurrent stores. And we shouldn't see any tearing for the same reason. > > That said, if there is concensus that we can just use ptep_get() (== > READ_ONCE()) everywhere, then I agree that would be cleaner. Does anyone object? (No objection to NOT using it either. Just a recommendation, since it's already there.)