Re: [PATCH v2 10/39] x86/mm: Introduce _PAGE_COW

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2022-10-03 at 19:26 +0300, Kirill A . Shutemov wrote:
> On Thu, Sep 29, 2022 at 03:29:07PM -0700, Rick Edgecombe wrote:
> > +/*
> > + * Normally the Dirty bit is used to denote COW memory on x86. But
> > + * in the case of X86_FEATURE_SHSTK, the software COW bit is used,
> > + * since the Dirty=1,Write=0 will result in the memory being
> > treated
> > + * as shaodw stack by the HW. So when creating COW memory, a
> > software
> > + * bit is used _PAGE_BIT_COW. The following functions pte_mkcow()
> > and
> > + * pte_clear_cow() take a PTE marked conventially COW (Dirty=1)
> > and
> > + * transition it to the shadow stack compatible version of COW
> > (Cow=1).
> > + */
> > +
> > +static inline pte_t pte_mkcow(pte_t pte)
> > +{
> > +     if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> > +             return pte;
> > +
> > +     pte = pte_clear_flags(pte, _PAGE_DIRTY);
> > +     return pte_set_flags(pte, _PAGE_COW);
> > +}
> > +
> > +static inline pte_t pte_clear_cow(pte_t pte)
> > +{
> > +     /*
> > +      * _PAGE_COW is unnecessary on !X86_FEATURE_SHSTK kernels.
> > +      * See the _PAGE_COW definition for more details.
> > +      */
> > +     if (!cpu_feature_enabled(X86_FEATURE_SHSTK))
> > +             return pte;
> > +
> > +     /*
> > +      * PTE is getting copied-on-write, so it will be dirtied
> > +      * if writable, or made shadow stack if shadow stack and
> > +      * being copied on access. Set they dirty bit for both
> > +      * cases.
> > +      */
> > +     pte = pte_set_flags(pte, _PAGE_DIRTY);
> > +     return pte_clear_flags(pte, _PAGE_COW);
> > +}
> 
> These X86_FEATURE_SHSTK checks make me uneasy. Maybe use the
> _PAGE_COW
> logic for all machines with 64-bit entries. It will get you much more
> coverage and more universal rules.

Yes, I didn't like them either at first. The reasoning originally was
that _PAGE_COW is a bit more work and it might show up for some
benchmark.

Looking at this again though, it is just a few more operations on
memory that is already getting touched either way. It must be a very
tiny amount of impact if any. I'm fine removing them. Having just one
set of logic around this would make it easier to reason about.

Dave, any thoughts on this?




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux