Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
> On Wed, Sep 20, 2017 at 06:13:50PM +0000, Mathieu Desnoyers wrote:
>
>> > Also, can you elaborate on the PPC issue? PPC appears to track
>> > mm_cpumask more or less just like x86. Is the issue just that this
>> > tracking has no implied barriers? If so, how does TLB flush on ppc
>> > work? It really does seem impressive to me that an architecture can
>> > efficiently support munmap() but not an expedited private membarrier.
>>
>> I'll leave this question to the PPC experts :)
>
> IIRC PPC does not keep a tight mm_cpumask, it only sets bit, it never
> clears bits. The atomic op required to set bits does not imply any
> memory barrier on PPC.

Yep. We do have a full barrier now when we set a bit, but not if the bit
was already set.

> TLB invalidation is a TLBI instruction, it sends TLBI broadcast packets
> over the interconnect, it doesn't require IPIs like x86.

Yep.

> The only optimization PPC does is that if the mm_cpumask has only a
> single bit set, it uses a TLBI instruction without broadcast, which is
> cheaper.

Yep.

We would like to trim the mm_cpumask, but it's one of those hairy
optimisations we have never quite found time to do.

I've been away for two weeks so I've not been able to keep up with the
membarrier discussions. Will try and page it back in next week.

cheers