On Tue, Feb 11, 2025 at 04:08:04PM -0500, Rik van Riel wrote:
> +static void broadcast_tlb_flush(struct flush_tlb_info *info)
> +{
> +	bool pmd = info->stride_shift == PMD_SHIFT;
> +	unsigned long maxnr = invlpgb_count_max;
> +	unsigned long asid = info->mm->context.global_asid;
> +	unsigned long addr = info->start;
> +	unsigned long nr;
> +
> +	/* Flushing multiple pages at once is not supported with 1GB pages. */
> +	if (info->stride_shift > PMD_SHIFT)
> +		maxnr = 1;

How does this work? Normally, if we get a 1GB range, we'll iterate on the
stride and INVLPG each one (just like any other stride).

Should you not instead either force the stride down to PMD level or force a
full flush?

> +
> +	/*
> +	 * TLB flushes with INVLPGB are kicked off asynchronously.
> +	 * The inc_mm_tlb_gen() guarantees page table updates are done
> +	 * before these TLB flushes happen.
> +	 */
> +	if (info->end == TLB_FLUSH_ALL) {
> +		invlpgb_flush_single_pcid_nosync(kern_pcid(asid));
> +		/* Do any CPUs supporting INVLPGB need PTI? */
> +		if (static_cpu_has(X86_FEATURE_PTI))
> +			invlpgb_flush_single_pcid_nosync(user_pcid(asid));
> +	} else do {
> +		/*
> +		 * Calculate how many pages can be flushed at once; if the
> +		 * remainder of the range is less than one page, flush one.
> +		 */
> +		nr = min(maxnr, (info->end - addr) >> info->stride_shift);
> +		nr = max(nr, 1);
> +
> +		invlpgb_flush_user_nr_nosync(kern_pcid(asid), addr, nr, pmd);
> +		/* Do any CPUs supporting INVLPGB need PTI? */
> +		if (static_cpu_has(X86_FEATURE_PTI))
> +			invlpgb_flush_user_nr_nosync(user_pcid(asid), addr, nr, pmd);
> +
> +		addr += nr << info->stride_shift;
> +	} while (addr < info->end);
> +
> +	finish_asid_transition(info);
> +
> +	/* Wait for the INVLPGBs kicked off above to finish. */
> +	tlbsync();
> +}