On Wed, Sep 20, 2017 at 06:13:50PM +0000, Mathieu Desnoyers wrote:
> > Also, can you elaborate on the PPC issue?  PPC appears to track
> > mm_cpumask more or less just like x86.  Is the issue just that this
> > tracking has no implied barriers?  If so, how does TLB flush on ppc
> > work?  It really does seem impressive to me that an architecture can
> > efficiently support munmap() but not an expedited private membarrier.
>
> I'll leave this question to the PPC experts :)

IIRC PPC does not keep a tight mm_cpumask: it only sets bits, it never
clears them. The atomic op required to set bits does not imply any
memory barrier on PPC.

TLB invalidation is done with a TLBI instruction, which sends TLBI
broadcast packets over the interconnect; unlike x86, it doesn't
require IPIs.

The only optimization PPC does is that if the mm_cpumask has only a
single bit set, it uses a TLBI instruction without broadcast, which is
cheaper.
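
To make the policy above concrete, here is a rough userspace sketch of
it. This is not the kernel code: the names (flush_kind, mark_cpu_used,
pick_flush) are made up for illustration, the mask is a plain 64-bit
word instead of a cpumask, and the check that the single set bit belongs
to the calling CPU is my assumption about what "only a single bit set"
has to mean in practice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the flush policy; none of these names are kernel APIs. */
enum flush_kind {
	FLUSH_LOCAL,		/* invalidate without broadcast (cheaper) */
	FLUSH_BROADCAST		/* invalidate broadcast over the interconnect */
};

/*
 * Sticky mask: a bit is set when the mm runs on a CPU and is never
 * cleared afterwards. A plain OR is used here; like the atomic bit set
 * described above, it implies no memory barrier.
 */
void mark_cpu_used(uint64_t *mm_cpumask, unsigned int cpu)
{
	assert(cpu < 64);	/* the model only covers 64 CPUs */
	*mm_cpumask |= UINT64_C(1) << cpu;
}

/*
 * Pick the invalidation flavour. Assumption: the local form is only
 * chosen when the single set bit is the calling CPU's own bit.
 */
enum flush_kind pick_flush(uint64_t mm_cpumask, unsigned int this_cpu)
{
	assert(this_cpu < 64);
	if (mm_cpumask == (UINT64_C(1) << this_cpu))
		return FLUSH_LOCAL;
	return FLUSH_BROADCAST;
}
```

Because bits are never cleared, an mm that has ever run on a second CPU
stays on the broadcast path in this model, even if it is back to
running on a single CPU.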