On Thu, Aug 19, 2021 at 8:21 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote: > > If we can skip the OF... we can do something like this: Honestly, I think a lot of the refcount code is questionable. It was absolutely written with no care for performance AT ALL. I'm not sure it helps to then add arch-specific code for it without thinking it through a _lot_ first. It might be better to just have a "atomic_t with overflow handling" in general, exactly because the refcount_t was designed and written without any regard for code that cares about performance. > static inline bool refcount_dec_and_test(refcount_t *r) > { > asm_volatile_goto (LOCK_PREFIX "decl %[var]\n\t" > "jz %l[cc_zero]\n\t" > "jns 1f\n\t" I think you can use "jl" for the bad case. You want to fail if the old value was negative, or if the old value was less than what you subtracted. That's literally the "less than" operator. I think it's better to handle that case out-of-line than play games with UD, though - this is going to be the rare case, the likelihood that we get the fixup wrong is just too big. Once it's out-of-line it's not as critical any more, even if it does add to the size of the code. So if we do this, I think it should be something like static inline __must_check bool refcount_dec_and_test(refcount_t *r) { asm_volatile_goto (LOCK_PREFIX "decl %[var]\n\t" "jz %l[cc_zero]\n\t" "jl %l[cc_error]" : : [var] "m" (r->refs.counter) : "memory" : cc_zero, cc_error); return false; cc_zero: return true; cc_error: refcount_warn_saturate(r, REFCOUNT_SUB_UAF); return false; } and we can discuss whether we could improve on the refcount_warn_saturate() separately. But see above: maybe just make this a separate "careful atomic_t", with the option to panic-on-overflow. So then we could get rid of refcount_warn_saturate() enmtirely above, and instead just have a (compile-time option) BUG() case, with the non-careful version just being our existing atomic_dec_and_test. Giving people who want a smaller kernel the option to do so, while giving people who want checking the option too. Linus