On Mon, Jun 19, 2017 at 10:11:35AM -0700, Nadav Amit wrote: > Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> wrote: > > > We need an atomic way to setup pmd page table entry, avoiding races with > > CPU setting dirty/accessed bits. This is required to implement > > pmdp_invalidate() that doesn't loose these bits. > > > > On PAE we have to use cmpxchg8b as we cannot assume what is value of new pmd and > > setting it up half-by-half can expose broken corrupted entry to CPU. > > ... > > > > > +#ifndef pmdp_establish > > +#define pmdp_establish pmdp_establish > > +static inline pmd_t pmdp_establish(pmd_t *pmdp, pmd_t pmd) > > +{ > > + if (IS_ENABLED(CONFIG_SMP)) { > > + return xchg(pmdp, pmd); > > + } else { > > + pmd_t old = *pmdp; > > + *pmdp = pmd; > > I think you may want to use WRITE_ONCE() here - otherwise nobody guarantees > that the compiler will not split writes to *pmdp. Although the kernel uses > similar code to setting PTEs and PMDs, I think that it is best to start > fixing it. Obviously, you might need a different code path for 32-bit > kernels. This code is for 2-level pageing on 32-bit machines and for 4-level paging on 64-bit machine. In both cases sizeof(pmd_t) == sizeof(unsigned long). Sane compiler can't screw up anything here -- store of long is one shot. Compiler still can issue duplicate of store, but there's no harm. It guaranteed to be stable once ptl is released and CPU can't the entry half-updated. -- Kirill A. Shutemov