On 1/25/2021 10:27 AM, Borislav Petkov wrote:
On Tue, Dec 29, 2020 at 01:30:38PM -0800, Yu-cheng Yu wrote:
[...]
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 666c25ab9564..1c84f1ba32b9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1226,6 +1226,32 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
static inline void ptep_set_wrprotect(struct mm_struct *mm,
unsigned long addr, pte_t *ptep)
{
+ /*
+ * Some processors can start a write, but end up seeing a read-only
+ * PTE by the time they get to the Dirty bit. In this case, they
+ * will set the Dirty bit, leaving a read-only, Dirty PTE which
+ * looks like a shadow stack PTE.
+ *
+ * However, this behavior has been improved
Improved how?
Processors supporting Shadow Stack will not set a read-only PTE's dirty
bit. I will revise the comments.
and will not occur on
+ * processors supporting Shadow Stack. Without this guarantee, a
Which guarantee? That it won't happen on CPUs which support SHSTK?
Yes.
+ * transition to a non-present PTE and flush the TLB would be
s/flush the TLB/TLB flush/
+ * needed.
+ *
+ * When changing a writable PTE to read-only and if the PTE has
+ * _PAGE_DIRTY set, move that bit to _PAGE_COW so that the PTE is
+ * not a shadow stack PTE.
+ */
This sentence doesn't belong here as it refers to what pte_wrprotect()
does. You could expand the comment in pte_wrprotect() with this here as
it is better.
I will move this paragraph to pte_wrprotect().
+ if (cpu_feature_enabled(X86_FEATURE_SHSTK)) {
+ pte_t old_pte, new_pte;
+
+ do {
+ old_pte = READ_ONCE(*ptep);
+ new_pte = pte_wrprotect(old_pte);
Maybe I'm missing something but those two can happen outside of the
loop, no? Or is *ptep somehow changing concurrently while the loop is
doing the CMPXCHG and you need to recreate it each time?
IOW, you can generate upfront and do the empty loop...
*ptep can change concurrently.
+
+ } while (!try_cmpxchg(&ptep->pte, &old_pte.pte, new_pte.pte));
+
+ return;
+ }
clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
}
[...]