On Mon, Dec 06, 2021 at 04:13:00PM -0800, Linus Torvalds wrote:
> Fix it to not unnecessarily use expensive compare-and-exchange loops,
> when you can safely just race a bit, safe in the knowledge that you're
> not going to race 2**31 times.
>
> IOW, I think that "try_get_page()" function is basically the *much*
> superior version of what is currently a broken "refcount_inc()".
>
> And yes, it does warn about that overflow case that you claim only
> refcount_t does. And does so without the broken semantics that
> refcount has.

I think you should perhaps look at the current code again. Those cmpxchg
loops are gone. The only cmpxchg loop that's still there is for
add_not_zero(), and that is the exact same loop we have for
atomic_inc_not_zero(v) := atomic_add_unless(v,1,0) :=
atomic_fetch_add_unless(v,1,0) != 0.

The rest is exactly that LOCK XADD and assume you don't race 2^31 bits
worth.

Now, it could be GCC generates atrociously bad code simply because it
doesn't know it can use the flags from the XADD, in which case we can
look at doing an arch asm implementation.
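
To make the distinction above concrete, here is a rough userspace sketch in C11 atomics of the two shapes being discussed: an increment-unless-zero, which needs a compare-and-exchange loop, and a plain increment, which is a single fetch-and-add followed by a check of the old value. This is not the kernel's code. The names demo_inc_not_zero(), demo_inc(), demo_usage() and DEMO_SATURATED are made up for illustration, the saturation value is an assumption, and the real implementation warns and differs in detail.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical saturation value for this sketch: far from both 0 and overflow. */
#define DEMO_SATURATED (INT_MIN / 2)

/* Increment unless zero: the one case that genuinely needs a cmpxchg loop. */
static bool demo_inc_not_zero(atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* already dead, do not resurrect */
		if (old < 0 || old == INT_MAX) {
			/* Sketch-only simplification: pin the counter and leak. */
			atomic_store_explicit(v, DEMO_SATURATED, memory_order_relaxed);
			return true;
		}
		/* On failure 'old' is refreshed with the current value; retry. */
	} while (!atomic_compare_exchange_weak_explicit(v, &old, old + 1,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}

/* Plain increment: one fetch-and-add (LOCK XADD on x86), no loop. */
static bool demo_inc(atomic_int *v)
{
	int old = atomic_fetch_add_explicit(v, 1, memory_order_relaxed);

	/* old == 0 is an increment from zero; old < 0 or INT_MAX is overflow. */
	if (old <= 0 || old == INT_MAX) {
		atomic_store_explicit(v, DEMO_SATURATED, memory_order_relaxed);
		return false;	/* a real implementation would warn here */
	}
	return true;
}

/* Small usage walk-through of both helpers. */
static void demo_usage(void)
{
	atomic_int ref = 1;
	atomic_int dead = 0;

	assert(demo_inc(&ref));			/* 1 -> 2 */
	assert(demo_inc_not_zero(&ref));	/* 2 -> 3 */
	assert(atomic_load(&ref) == 3);

	assert(!demo_inc_not_zero(&dead));	/* stays at 0 */
	assert(atomic_load(&dead) == 0);
}
```

The post-check in demo_inc() races with other increments, which is the point: the counter would have to be raced on the order of 2^31 times before the check could be skipped past, so a loop buys nothing there.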