On Fri, Feb 28, 2025 at 11:18:04AM -0800, Dave Hansen wrote:
> We haven't talked at all about the locking rules for
> invlpgb_flush_all(). It was used once in this series without any
> explicit preempt twiddling. I assume that was because it was used in a
> path where preempt is disabled.
>
> If it does need a universal rule about preempt, can we please add an:
>
> 	lockdep_assert_preemption_disabled()
>
> along with a comment about why it needs preempt disabled?

So, after talking on IRC last night, below is what I think we should do
on top. More specifically:

- I've pushed the preemption guard inside the functions which do
  INVLPGB+TLBSYNC so that callers do not have to think about it.

- invlpgb_kernel_range_flush() I still don't like; there we have to
  rely on cant_migrate() in __tlbsync().

- I'd like for all of them to be nicely packed but don't have an idea
  yet how to do that cleanly...

- Document what it means for bits rax[0:2] to be clear when issuing
  INVLPGB.

That ok? Anything I've missed? If not, I'll integrate this into the
patches.

Thx.

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 45d9c7687d61..0d90ceeb472b 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -39,6 +39,10 @@ static inline void invlpg(unsigned long addr)
  * the first page, while __invlpgb gets the more human readable number of
  * pages to invalidate.
  *
+ * The bits in rax[0:2] determine respectively which components of the address
+ * (VA, PCID, ASID) get compared when flushing. If none of the bits is set,
+ * *any* address in the specified range matches.
+ *
  * TLBSYNC is used to ensure that pending INVLPGB invalidations initiated from
  * this CPU have completed.
  */
@@ -60,10 +64,10 @@ static inline void __invlpgb(unsigned long asid, unsigned long pcid,
 static inline void __tlbsync(void)
 {
 	/*
-	 * tlbsync waits for invlpgb instructions originating on the
-	 * same CPU to have completed. Print a warning if we could have
-	 * migrated, and might not be waiting on all the invlpgbs issued
-	 * during this TLB invalidation sequence.
+	 * TLBSYNC waits for INVLPGB instructions originating on the same CPU
+	 * to have completed. Print a warning if the task has been migrated,
+	 * and might not be waiting on all the INVLPGBs issued during this TLB
+	 * invalidation sequence.
 	 */
 	cant_migrate();
 
@@ -106,6 +110,13 @@ static inline void invlpgb_flush_single_pcid_nosync(unsigned long pcid)
 /* Flush all mappings, including globals, for all PCIDs. */
 static inline void invlpgb_flush_all(void)
 {
+	/*
+	 * TLBSYNC at the end needs to make sure all flushes done on the
+	 * current CPU have been executed system-wide. Therefore, make
+	 * sure nothing gets migrated in-between but disable preemption
+	 * as it is cheaper.
+	 */
+	guard(preempt)();
 	__invlpgb(0, 0, 0, 1, 0, INVLPGB_INCLUDE_GLOBAL);
 	__tlbsync();
 }
@@ -119,10 +130,7 @@ static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr)
 /* Flush all mappings for all PCIDs except globals. */
 static inline void invlpgb_flush_all_nonglobals(void)
 {
-	/*
-	 * @addr=0 means both rax[1] (valid PCID) and rax[2] (valid ASID) are clear
-	 * so flush *any* PCID and ASID.
-	 */
+	guard(preempt)();
 	__invlpgb(0, 0, 0, 1, 0, 0);
 	__tlbsync();
 }
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f49627e02311..8cd084bc3d98 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1075,19 +1075,11 @@ void flush_tlb_all(void)
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
 
 	/* First try (faster) hardware-assisted TLB invalidation. */
-	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
-		/*
-		 * TLBSYNC at the end needs to make sure all flushes done
-		 * on the current CPU have been executed system-wide.
-		 * Therefore, make sure nothing gets migrated
-		 * in-between but disable preemption as it is cheaper.
-		 */
-		guard(preempt)();
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
 		invlpgb_flush_all();
-	} else {
+	else
 		/* Fall back to the IPI-based invalidation. */
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
-	}
 }
 
 /* Flush an arbitrarily large range of memory with INVLPGB. */

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette