Excerpts from Linus Torvalds's message of January 10, 2022 7:51 am:
> [ Ugh, I actually went back and looked at Nick's patches again, to
> just verify my memory, and they weren't as pretty as I thought they
> were ]
>
> On Sun, Jan 9, 2022 at 12:48 PM Linus Torvalds
> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> I'd much rather have a *much* smaller patch that says "on x86 and
>> powerpc, we don't need this overhead at all".
>
> For some reason I thought Nick's patch worked at "last mmput" time and
> the TLB flush IPIs that happen at that point anyway would then make
> sure any lazy TLB is cleaned up.
>
> But that's not actually what it does. It ties the
> MMU_LAZY_TLB_REFCOUNT to an explicit TLB shootdown triggered by the
> last mmdrop() instead. Because it really tied the whole logic to the
> mm_count logic (and made lazy tlb to not do mm_count) rather than the
> mm_users thing I mis-remembered it doing.

It does this because on powerpc with hash MMU, we can't use IPIs for
TLB shootdowns.

> So at least some of my arguments were based on me just mis-remembering
> what Nick's patch actually did (mainly because I mentally recreated
> the patch from "Nick did something like this" and what I thought would
> be the way to do it on x86).

With powerpc with the radix MMU using IPI based shootdowns, we can
actually do the switch-away-from-lazy on the final TLB flush and the
final broadcast shootdown thing becomes a no-op. I didn't post that
additional patch because it's powerpc-specific and I didn't want to
post more code so widely.

> So I guess I have to recant my arguments.
>
> I still think my "get rid of lazy at last mmput" model should work,
> and would be a perfect match for x86, but I can't really point to Nick
> having done that.
>
> So I was full of BS.
>
> Hmm. I'd love to try to actually create a patch that does that "Nick
> thing", but on last mmput() (ie when __mmput triggers). Because I
> think this is interesting. But then I look at my schedule for the
> upcoming week, and I go "I don't have a leg to stand on in this
> discussion, and I'm just all hot air".

I agree Andy's approach is very complicated and adds more overhead than
necessary for powerpc, which is why I don't want to use it. I'm still
not entirely sure what the big problem would be to convert x86 to use
it, I admit I haven't kept up with the exact details of its lazy tlb
mm handling recently though.

Thanks,
Nick