On Mon, 2025-01-20 at 11:56 +0200, Nadav Amit wrote:
> 
> > @@ -1670,12 +1668,62 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> >  		local_irq_enable();
> >  	}
> >  
> > +	/*
> > +	 * If we issued (asynchronous) INVLPGB flushes, wait for them here.
> > +	 * The cpumask above contains only CPUs that were running tasks
> > +	 * not using broadcast TLB flushing.
> > +	 */
> > +	if (cpu_feature_enabled(X86_FEATURE_INVLPGB) && batch->used_invlpgb) {
> > +		tlbsync();
> > +		migrate_enable();
> 
> Maybe someone mentioned it before, but I would emphasize that I do not
> think that preventing migration for potentially long time is that
> great.
> 
> One alternative solution would be to set a bit on cpu_tlbstate, that
> when set, you'd issue a tlbsync on context switch.
> 
> (I can think about other solutions, but I think the one I just
> mentioned is the cleanest one).

It is clean, but I'm not convinced it is good enough.

We need to guarantee that the INVLPGBs have finished before we free
the pages. Running the TLBSYNC at the next context switch could mean
that the TLBSYNC won't run until after the pages have been freed.

In practice it is probably good enough, since the simplest
implementation of TLBSYNC would be to return only once all pending
(older) INVLPGBs have finished, but that is not architecturally
guaranteed.

We could send an IPI to remote CPUs in order for them to call
TLBSYNC, but is that really better?

-- 
All Rights Reversed.
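
For reference, a rough, untested sketch of the cpu_tlbstate approach
being discussed above. The pending_tlbsync field, the helper names,
and the exact hook point are made up for illustration; they are not
from the posted series, which only does tlbsync() + migrate_enable()
directly in arch_tlbbatch_flush() as quoted above.

	/*
	 * Illustrative sketch only.  Assumes a new boolean field,
	 * pending_tlbsync, added to struct tlb_state (cpu_tlbstate).
	 */

	static inline void defer_tlbsync(void)
	{
		/*
		 * Called right after issuing asynchronous INVLPGBs,
		 * instead of doing tlbsync() immediately.  Preemption
		 * must still be disabled here so the flag is recorded
		 * on the CPU that issued the INVLPGBs.
		 */
		this_cpu_write(cpu_tlbstate.pending_tlbsync, true);
	}

	static inline void finish_pending_tlbsync(void)
	{
		/*
		 * Called from the context switch path on the CPU that
		 * issued the INVLPGBs, so migration no longer needs to
		 * be disabled across the whole flush.
		 */
		if (this_cpu_read(cpu_tlbstate.pending_tlbsync)) {
			tlbsync();
			this_cpu_write(cpu_tlbstate.pending_tlbsync, false);
		}
	}

As argued above, the open question with this shape is that the pages
freed after arch_tlbbatch_flush() cannot wait for a context switch
that may happen arbitrarily late, so something would still have to
force the deferred TLBSYNC before the batched pages are freed.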