Re: [patch 119/212] lazy tlb: shoot lazies, a non-refcounting lazy tlb option

Nicholas Piggin <npiggin@xxxxxxxxx> · Fri, 03 Sep 2021 10:48:03 +1000

Excerpts from Andy Lutomirski's message of September 3, 2021 8:28 am:
> 
> 
> On Thu, Sep 2, 2021, at 2:56 PM, Andrew Morton wrote:
>> From: Nicholas Piggin <npiggin@xxxxxxxxx>
>> Subject: lazy tlb: shoot lazies, a non-refcounting lazy tlb option
>> 
>> On big systems, the mm refcount can become highly contended when doing a
>> lot of context switching with threaded applications (particularly
>> switching between the idle thread and an application thread).
>> 
>> Abandoning lazy tlb slows switching down quite a bit in the important
>> user->idle->user cases, so instead implement a non-refcounted scheme that
>> causes __mmdrop() to IPI all CPUs in the mm_cpumask and shoot down any
>> remaining lazy ones.
>> 
>> Shootdown IPIs are some concern, but they have not been observed to be a
>> big problem with this scheme (the powerpc implementation generated 314
>> additional interrupts on a 144 CPU system during a kernel compile).  There
>> are a number of strategies that could be employed to reduce IPIs if they
>> turn out to be a problem for some workload.
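
For anyone skimming the thread, the scheme described above can be modelled
in userspace roughly as follows. This is only a toy sketch, not the code
from the series: every toy_* name is hypothetical, the "IPI" is a plain
function call, and all locking and memory ordering is left out. The point
it illustrates is that a lazy CPU takes no reference on the mm, and the
final drop walks the cpumask and switches any remaining lazy CPUs away.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for struct mm_struct; not the kernel's layout. */
struct toy_mm {
	int count;                  /* references held by real users only */
	bool cpumask[TOY_NR_CPUS];  /* CPUs that may still have this mm active */
};

static struct toy_mm toy_init_mm = { .count = 1 }; /* stand-in for init_mm */
static struct toy_mm *toy_active_mm[TOY_NR_CPUS];  /* per-CPU active mm */

/* user->idle switch: keep the user mm active, but take no reference. */
static void toy_enter_lazy(int cpu, struct toy_mm *mm)
{
	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	toy_active_mm[cpu] = mm;
	mm->cpumask[cpu] = true;
}

/* What the shootdown handler would do on the target CPU. */
static void toy_shoot_lazy(int cpu, struct toy_mm *mm)
{
	if (toy_active_mm[cpu] == mm)
		toy_active_mm[cpu] = &toy_init_mm;
	mm->cpumask[cpu] = false;
}

/*
 * Drop one reference. On the final drop, "IPI" every CPU still set in
 * the cpumask. Returns the number of simulated IPIs that were sent.
 */
static int toy_mmdrop(struct toy_mm *mm)
{
	int ipis = 0;

	assert(mm->count > 0);
	if (--mm->count > 0)
		return 0;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (mm->cpumask[cpu]) {
			toy_shoot_lazy(cpu, mm);
			ipis++;
		}
	}
	return ipis;
}
```

In this model, an mm with count 1 that is lazily active on two CPUs costs
two simulated IPIs at the final toy_mmdrop(), and nothing on the context
switch path itself.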
> 
> This pile is:
> 
> Nacked-by: Andy Lutomirski <luto@xxxxxxxxxx>
> 
> For reasons that have been discussed previously. My series is still in progress.  It’s moving slowly for two reasons.  First, I have limited time to work on it. Second, the existing mm refcounting is a giant pile of worms, and that needs fixing one way or another before we add yet more complexity. For example, has anyone noticed that kthread mms are refcounted using different rules than everything else?

It's been like a year with ~no progress. mm refcounting is not a pile of
worms; as you can see in my series, it's pretty simple. The _x86_ mm
refcounting is a huge pile of crap, but that is no reason to nack this
series.

> 
> Even if my modified refcounting scheme isn’t the eventual winner, the prerequisite cleanups are still prerequisites. I absolutely nack anything that adds yet more nonsensical complexity to the existing scheme, makes it substantially more fragile, and does not fix the underlying crap that makes speeding up responsibly such a mess.
> 
> Nick or anyone else, you’re welcome to finish up my series (and I can give pointers) or you can wait.

You or anyone else is welcome to rebase your series on top of mine.

Thanks,
Nick