On Thu, 6 Feb 2025 at 05:46, Rik van Riel <riel@xxxxxxxxxxx> wrote:
>  /* Wait for INVLPGB originated by this CPU to complete. */
> -static inline void tlbsync(void)
> +static inline void __tlbsync(void)
>  {
> -        cant_migrate();

Why does this have to go away?

> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index 234277a5ef89..bf167e215e8e 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -106,6 +106,7 @@ struct tlb_state {
>           * need to be invalidated.
>           */
>          bool invalidate_other;
> +        bool need_tlbsync;

The ifdeffery is missing here.

> +static inline void invlpgb_flush_user_nr_nosync(unsigned long pcid,
> +                                                unsigned long addr,
> +                                                u16 nr, bool pmd_stride)
> +{
> +        __invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride);
> +        if (!this_cpu_read(cpu_tlbstate.need_tlbsync))
> +                this_cpu_write(cpu_tlbstate.need_tlbsync, true);

Why do we need the conditional here? I always thought code like this
was about avoiding cacheline contention, but given this is a percpu
and it's really only of interest to the present CPU, is this
worthwhile? I guess it might be sharing a cacheline with some other
percpu that is more shared?

Anyway, not a big deal, I'm mostly asking for my own education.

> @@ -794,6 +825,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
>          if (IS_ENABLED(CONFIG_PROVE_LOCKING))
>                  WARN_ON_ONCE(!irqs_disabled());
>
> +        tlbsync();
> +
>          /*
>           * Verify that CR3 is what we think it is. This will catch
>           * hypothetical buggy code that directly switches to swapper_pg_dir
> @@ -973,6 +1006,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
>   */
>  void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
>  {
> +        tlbsync();
> +

I have a feeling I'll look stupid for asking this, but why do we need
this and the one in switch_mm_irqs_off()?

> @@ -1661,12 +1694,53 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>                  local_irq_enable();
>          }
>
> +        /*
> +         * If we issued (asynchronous) INVLPGB flushes, wait for them here.
> +         * The cpumask above contains only CPUs that were running tasks
> +         * not using broadcast TLB flushing.
> +         */
> +        if (cpu_feature_enabled(X86_FEATURE_INVLPGB))

It feels wrong that we check the cpufeature here but not in e.g.
enter_lazy_tlb().

> +                tlbsync();
> +
>          cpumask_clear(&batch->cpumask);
>
>          put_flush_tlb_info();
>          put_cpu();
>  }
>
> +void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
> +                               struct mm_struct *mm,
> +                               unsigned long uaddr)
> +{
> +        u16 asid = mm_global_asid(mm);
> +
> +        if (asid) {
> +                invlpgb_flush_user_nr_nosync(kern_pcid(asid), uaddr, 1, false);
> +                /* Do any CPUs supporting INVLPGB need PTI? */

Not today, but

1. I don't think avoiding static_cpu_has() is worth the cost of making
   the reader take that logical step.

2. AFAIK a new CPU bug never led to enabling KPTI on a CPU that didn't
   have it before, and I think that would be a pretty dystopian future
   (and hopefully by then we'll have ASI instead...). But we can't
   really rule it out.
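
On the ifdeffery point: I'd have expected the new field to sit behind
the same config option as the rest of the broadcast-flush state. Rough
sketch of what I mean, assuming CONFIG_X86_BROADCAST_TLB_FLUSH is the
symbol used elsewhere in the series:

#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
        /*
         * This CPU has issued INVLPGB flushes that have not yet been
         * waited for with TLBSYNC.
         */
        bool need_tlbsync;
#endif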
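
Coming back to the cpufeature check in arch_tlbbatch_flush(): since
need_tlbsync can only be set by a CPU that actually issued an INVLPGB,
one option would be to fold the check into a tlbsync() wrapper so that
none of the callers have to care. Untested sketch, just to illustrate:

/* Wait for any INVLPGBs this CPU has issued but not yet synced. */
static inline void tlbsync(void)
{
        if (!this_cpu_read(cpu_tlbstate.need_tlbsync))
                return;
        __tlbsync();
        this_cpu_write(cpu_tlbstate.need_tlbsync, false);
}

Then switch_mm_irqs_off(), enter_lazy_tlb() and arch_tlbbatch_flush()
could all call tlbsync() unconditionally, and on !X86_FEATURE_INVLPGB
machines it stays a cheap percpu read.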