On Mon, 6 Mar 2023 17:06:44 +0000
Catalin Marinas <catalin.marinas@xxxxxxx> wrote:

> On Mon, Mar 06, 2023 at 05:15:48PM +0100, Gerald Schaefer wrote:
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index b6ba466e2e8a..0bd18de9fd97 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -57,7 +57,7 @@ static inline bool arch_thp_swp_supported(void)
> >   * fault on one CPU which has been handled concurrently by another CPU
> >   * does not need to perform additional invalidation.
> >   */
> > -#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
> > +#define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
>
> For arm64:
>
> Acked-by: Catalin Marinas <catalin.marinas@xxxxxxx>
>
> > diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> > index 2c70b4d1263d..c1f6b46ec555 100644
> > --- a/arch/s390/include/asm/pgtable.h
> > +++ b/arch/s390/include/asm/pgtable.h
> > @@ -1239,7 +1239,8 @@ static inline int pte_allow_rdp(pte_t old, pte_t new)
> >  }
> >
> >  static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> > -						unsigned long address)
> > +						unsigned long address,
> > +						pte_t *ptep)
> >  {
> >  	/*
> >  	 * RDP might not have propagated the PTE protection reset to all CPUs,
> > @@ -1247,11 +1248,12 @@ static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> >  	 * NOTE: This will also be called when a racing pagetable update on
> >  	 * another thread already installed the correct PTE. Both cases cannot
> >  	 * really be distinguished.
> > -	 * Therefore, only do the local TLB flush when RDP can be used, to avoid
> > -	 * unnecessary overhead.
> > +	 * Therefore, only do the local TLB flush when RDP can be used, and the
> > +	 * PTE does not have _PAGE_PROTECT set, to avoid unnecessary overhead.
> > +	 * A local RDP can be used to do the flush.
> >  	 */
> > -	if (MACHINE_HAS_RDP)
> > -		asm volatile("ptlb" : : : "memory");
> > +	if (MACHINE_HAS_RDP && !(pte_val(*ptep) & _PAGE_PROTECT))
> > +		__ptep_rdp(address, ptep, 0, 0, 1);
>
> I wonder whether passing the actual entry is somewhat quicker as it
> avoids another memory access (though it might already be in the cache).

The RDP instruction itself only requires the PTE pointer as input, or more
precisely a pointer to the pagetable origin. We calculate that from the PTE
pointer, by masking out some bits, w/o actual memory access to the PTE entry
value.

Of course, there is the pte_val(*ptep) & _PAGE_PROTECT check here, with
memory access, but this might get removed in the future. TBH, I simply
wasn't sure (enough) yet, if we could technically ever end up here with
_PAGE_PROTECT set at all. For "real" spurious protection faults, it should
never be set, not so sure about racing pagetable updates though.

So this might actually be an unnecessary / overly cautious check, that gets
removed in the future, and not worth passing along the PTE value in addition
to the pointer.
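
To illustrate the "masking out some bits" part, here is a rough userspace
sketch, not the kernel code itself. It assumes the s390 layout of 256 PTEs of
8 bytes each per pagetable (2 KiB, naturally aligned), and it ignores the
virtual-to-physical conversion that the real helper would need. The names
sketch_pgtable_origin() and sketch_pte_index() are made up for this example.

```c
#include <stdint.h>

/* Assumption: 256 entries of 8 bytes each, i.e. a 2 KiB aligned pagetable. */
#define SKETCH_PTRS_PER_PTE	256UL
#define SKETCH_PTE_SIZE		8UL
#define SKETCH_TABLE_MASK	(SKETCH_PTRS_PER_PTE * SKETCH_PTE_SIZE - 1)

/*
 * Hypothetical helper: derive the pagetable origin from the address of a
 * PTE by clearing the offset bits. Only the pointer value is used, the
 * PTE itself is never dereferenced.
 */
static inline uintptr_t sketch_pgtable_origin(uintptr_t pte_addr)
{
	return pte_addr & ~(uintptr_t)SKETCH_TABLE_MASK;
}

/* Hypothetical helper: index of the PTE within its pagetable. */
static inline unsigned long sketch_pte_index(uintptr_t pte_addr)
{
	return (pte_addr & SKETCH_TABLE_MASK) / SKETCH_PTE_SIZE;
}
```

With these assumptions, a PTE at address 0x12345a38 belongs to the pagetable
at 0x12345800 and is entry 71 in it. The point is only that the origin comes
from pointer arithmetic, so passing the PTE value along would not save
anything for the RDP itself.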