Re: [PATCH v6 0/2] x86: Implement fast refcount overflow protection

Michael Ellerman <mpe@xxxxxxxxxxxxxx> · Mon, 24 Jul 2017 22:09:32 +1000

Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Mon, Jul 24, 2017 at 04:38:06PM +1000, Michael Ellerman wrote:
>
>> What I'm not entirely clear on is what the best trade off is in terms of
>> overhead vs checks. The summary of behaviour between the fast and full
>> versions you promised Ingo will help there I think.
>
> That's something that's probably completely different for PPC than it is
> for x86.

Yeah definitely. I guess I see the x86 version as a lower bound on the
semantics we'd need to implement and still claim to implement the
refcount stuff.

> Both because your primitive is LL/SC and thus the saturation
> semantics we need a cmpxchg loop for are more natural in your case

Yay!

> anyway, and the fact that your LL/SC is horrendously slow in any case.

Boo :/

Just kidding. I suspect you're right that we can probably pack a
reasonable number of checks into the body of the LL/SC loop and not
notice the cost.
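
To make the cmpxchg-loop saturation semantics concrete, here is a rough
userspace sketch using C11 atomics. It is not the kernel code: the
sketch_* names are made up for illustration, and saturating at UINT_MAX
with relaxed ordering is an assumption of the sketch. A hand-written
LL/SC version could presumably fold the same two checks in between the
load-reserve and the store-conditional.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a refcount type (not kernel API). */
struct sketch_refcount {
	atomic_uint refs;
};

/*
 * Take a reference unless the count is 0 (object already released).
 * A count that has reached UINT_MAX is treated as saturated and is
 * left alone, so it can never wrap back to 0 (assumed policy).
 * Returns true if the caller may use the object.
 */
static bool sketch_refcount_inc_not_zero(struct sketch_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);
	unsigned int next;

	do {
		if (old == 0)
			return false;	/* too late, object is gone */
		if (old == UINT_MAX)
			return true;	/* saturated: leak instead of overflowing */
		next = old + 1;
		/* On failure, old is reloaded with the current value and we retry. */
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &old, next,
							memory_order_relaxed,
							memory_order_relaxed));

	return true;
}
```

The point is just that both checks sit inside the retry loop, so they
are re-evaluated against the fresh value every time the exchange fails.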

> Also, I still haven't seen an actual benchmark where our cmpxchg loop
> actually regresses anything, just a lot of yelling about potential
> regressions :/

Heh yeah. I have looked at the code it generates on PPC and it's not
sleek, though I guess that's not a benchmark, is it :)

cheers