On Thu, Aug 19, 2021 at 04:06:45PM +0200, Peter Zijlstra wrote:
> >
> >  * We can implement (1) by checking if we hit zero (ZF=1)
> >  * We can implement (2) by checking if the new value is < 0 (SF=1).
> >    We then need to catch the case where the old value was < 0 but the
> >    new value is 0. I think this is (SF=0 && OF=1).
> >
> > So maybe the second check is actually SF != OF? I could benefit from some
> > x86 expertise here, but hopefully you get the idea.
>
> Right, so the first condition is ZF=1, we hit zero.
> The second condition is SF=1, the result is negative.
>
> I'm not sure we need OF, if we hit that condition we've already lost.
> But it's easy enough to add I suppose.

If we can skip the OF... we can do something like this:

static inline bool refcount_dec_and_test(refcount_t *r)
{
	asm_volatile_goto (LOCK_PREFIX "decl %[var]\n\t"
			   "jz %l[cc_zero]\n\t"
			   "jns 1f\n\t"
			   "ud1 %[var], %%ebx\n\t"
			   "1:"
			   : : [var] "m" (r->refs.counter)
			   : "memory" : cc_zero);
	return false;

cc_zero:
	smp_acquire__after_ctrl_dep();
	return true;
}

where we encode the whole refcount_warn_saturate() thing into UD1. The
first argument is @r and the second argument the REFCOUNT_* thing
encoded in register space.

It would mean adding something 'clever' to the #UD handler that decodes
the trapping instruction and extracts these arguments, but this is the
smallest I could get it.
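
For anyone following along without the x86 flag semantics in their head,
the control flow of the asm above can be sketched in portable C11 roughly
as follows. This is only a userspace illustration under assumptions: the
sketch_* names are made up, it uses <stdatomic.h> rather than the kernel's
atomic_t, and the warning helper merely counts calls instead of saturating
the counter the way refcount_warn_saturate() does.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for refcount_t; not the kernel type. */
struct sketch_refcount {
	atomic_int refs;
};

/* Counts how often the "went negative" path was taken (illustration only). */
static int sketch_saturate_warnings;

/* Hypothetical stand-in for the slow path that the UD1 trap would reach. */
static void sketch_warn_saturate(struct sketch_refcount *r)
{
	(void)r;
	sketch_saturate_warnings++;
}

static bool sketch_dec_and_test(struct sketch_refcount *r)
{
	/* fetch_sub returns the old value; old - 1 is what DECL leaves behind. */
	int old = atomic_fetch_sub_explicit(&r->refs, 1, memory_order_release);
	/* Subtract in unsigned arithmetic to avoid signed-overflow UB. */
	int val = (int)((unsigned int)old - 1u);

	if (val == 0) {
		/* ZF=1: we hit zero, corresponds to the "jz cc_zero" branch. */
		atomic_thread_fence(memory_order_acquire);
		return true;
	}

	if (val < 0) {
		/* SF=1: result is negative, the "jns 1f" branch is not taken. */
		sketch_warn_saturate(r);
	}

	return false;
}
```

Starting from a count of 2, the first call returns false, the second
returns true, and a third (buggy) call returns false and bumps the warning
counter. Like the asm, this does not look at OF.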