On 10/2/24 4:11 PM, Dave Hansen wrote:
About TLB flushing... The quick and dirty thing to do is just flush_tlb_all() after you remove the PTE from the host mm. That will surely work everywhere and it's as dirt simple as you get. Honestly, it might even be cheaper than the alternative.
I think this a good place to start from. My concern is that unnecessary flushes will potentially impact unrelated loads. Performance testing as things progress can help determine if a more involved approach is needed.
Also, I don't think PCIDs actually complicate the problem at all. We basically do remote mm TLB flushes using two mechanisms: 1. If the mm is loaded, use INVLPG and friends to zap the TLB 2. Bump mm->context.tlb_gen so that the next time it _gets_ loaded, the TLB is flushed. flush_tlb_func() really only cares about #1 since if the mm isn't loaded, it'll get flushed anyway at the next context switch. The alternatives I can think of: Make flush_tlb_mm_range(host_mm) work somehow. You'd need to somehow keep mm_cpumask(host_mm) up to date and also make do something to flush_tlb_func() to tell it that 'loaded_mm' isn't relevant and it should flush regardless. The other way is to use the msharefs's inode ->i_mmap to find all the VMAs mapping the file, and find all *their* mm's: for each vma in inode->i_mmap mm = vma->vm_mm flush_tlb_mm_range(<vma range here>) But that might be even worse than flush_tlb_all() because it might end up sending more than one IPI per CPU. You can fix _that_ by keeping a single cpumask that you build up: mask = 0 for each vma in inode->i_mmap mm = vma->vm_mm mask |= mm_cpumask(mm) flush_tlb_multi(mask, info); Unfortunately, 'info->mm' needs to be more than one mm, so you probably still need a new flush_tlb_func() flush type to tell it to ignore 'info->mm' and flush anyway.
What about something like arch_tlbbatch_flush() which sets info->mm to NULL and uses a batch cpumask? That seems perfect though there would be a bit more work needed to ensure things work on other architectures. flush_tlb_all() it is then, for now. :-)
After all that, I kinda like flush_tlb_all(). ;)