On Sun, Nov 29, 2020 at 12:16:26PM -0800, Andy Lutomirski wrote:
> On Sat, Nov 28, 2020 at 7:54 PM Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> >
> > On Sat, Nov 28, 2020 at 8:02 AM Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
> > >
> > > On big systems, the mm refcount can become highly contended when doing
> > > a lot of context switching with threaded applications (particularly
> > > switching between the idle thread and an application thread).
> > >
> > > Abandoning lazy tlb slows switching down quite a bit in the important
> > > user->idle->user cases, so instead implement a non-refcounted scheme
> > > that causes __mmdrop() to IPI all CPUs in the mm_cpumask and shoot down
> > > any remaining lazy ones.
> > >
> > > Shootdown IPIs are some concern, but they have not been observed to be
> > > a big problem with this scheme (the powerpc implementation generated
> > > 314 additional interrupts on a 144 CPU system during a kernel compile).
> > > There are a number of strategies that could be employed to reduce IPIs
> > > if they turn out to be a problem for some workload.
> >
> > I'm still wondering whether we can do even better.
>
> Hold on a sec.. __mmput() unmaps VMAs, frees pagetables, and flushes
> the TLB. On x86, this will shoot down all lazies as long as even a
> single pagetable was freed. (Or at least it will if we don't have a
> serious bug, but the code seems okay. We'll hit pmd_free_tlb, which
> sets tlb->freed_tables, which will trigger the IPI.) So, on
> architectures like x86, the shootdown approach should be free. The
> only way it ought to have any excess IPIs is if we have CPUs in
> mm_cpumask() that don't need an IPI to free pagetables, which could
> happen on paravirt.
>
> Can you try to figure out why you saw any increase in IPIs? It would
> be nice if we can make the new code unconditional.

Power doesn't do IPI-based TLBI.