On Mon, Jul 24, 2017 at 04:38:06PM +1000, Michael Ellerman wrote:
> What I'm not entirely clear on is what the best trade off is in terms of
> overhead vs checks. The summary of behaviour between the fast and full
> versions you promised Ingo will help there I think.

That's something that's probably completely different for PPC than it is
for x86. Both because your primitive is LL/SC and thus the saturation
semantics we need a cmpxchg loop for are more natural in your case
anyway, and the fact that your LL/SC is horrendously slow in any case.

Also, I still haven't seen an actual benchmark where our cmpxchg loop
actually regresses anything, just a lot of yelling about potential
regressions :/
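
For anyone following along who hasn't looked at the code, here is a rough
userspace sketch of what "saturation semantics via a cmpxchg loop" means.
It uses C11 atomics and not the kernel primitives; the name
sat_inc_not_zero() and the choice of UINT_MAX as the saturation value are
assumptions made for illustration, and this is not the actual refcount_t
implementation.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not the kernel API): increment the counter unless
 * it is zero, and pin it at UINT_MAX instead of letting it wrap.
 */
static bool sat_inc_not_zero(atomic_uint *r)
{
	unsigned int old = atomic_load_explicit(r, memory_order_relaxed);

	for (;;) {
		if (old == 0)
			return false;	/* already dropped to zero, do not resurrect */
		if (old == UINT_MAX)
			return true;	/* saturated, leave the value pinned */

		/* On failure, 'old' is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak_explicit(r, &old, old + 1,
							  memory_order_relaxed,
							  memory_order_relaxed))
			return true;
	}
}
```

On an LL/SC machine the same zero and saturation checks can sit between
the load-reserve and the store-conditional, so no separate compare step is
needed; on x86 a plain LOCK-prefixed add cannot express them, hence the
loop.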