On Thu, 2017-09-21 at 15:09 +0200, Peter Zijlstra wrote:
> On Wed, Sep 20, 2017 at 06:13:50PM +0000, Mathieu Desnoyers wrote:
> >
> > > Also, can you elaborate on the PPC issue? PPC appears to track
> > > mm_cpumask more or less just like x86. Is the issue just that this
> > > tracking has no implied barriers? If so, how does TLB flush on ppc
> > > work? It really does seem impressive to me that an architecture can
> > > efficiently support munmap() but not an expedited private
> > > membarrier.
> >
> > I'll leave this question to the PPC experts :)
>
> IIRC PPC does not keep a tight mm_cpumask, it only sets bit, it never
> clears bits. The atomic op required to set bits does not imply any
> memory barrier on PPC.
>
> TLB invalidation is a TLBI instruction, it sends TLBI broadcast
> packets over the interconnect, it doesn't require IPIs like x86.

I believe this to be true for all SMP RISC systems ... it's certainly
true for PA-RISC as well. There are so many RISC coherency issues that
the CPUs pretty much have to have a private bus to broadcast and
interlock coherency operations. We have one system that locks up if
multiple CPUs have outstanding coherency operations on the private bus,
but that's only one annoying CPU (which we manage with a special lock
inside the PA-RISC mmu code).

James