On Wed, Jan 15, 2025 at 12:13:34PM +0100, Peter Zijlstra wrote:
> Notably, it means refcount_t is entirely unsuitable for anything
> SLAB_TYPESAFE_BY_RCU, since they all will need secondary validation
> conditions after the refcount succeeds.
>
> And this is probably fine, but let me ponder this all a little more.

Even though SLAB_TYPESAFE_BY_RCU is relatively rare, I think we'd better
fix this; these things are hard enough as they are.

Will, others, wdyt?

---
Subject: refcount: Strengthen inc_not_zero()

For speculative lookups, where a successful inc_not_zero() pins the
object but we still need to double check that the object acquired is
indeed the one we set out to acquire, this validation needs to happen
*after* the increment.

Notably, SLAB_TYPESAFE_BY_RCU is one such example.

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
---
 include/linux/refcount.h | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/linux/refcount.h b/include/linux/refcount.h
index 35f039ecb272..340e7ffa445e 100644
--- a/include/linux/refcount.h
+++ b/include/linux/refcount.h
@@ -69,9 +69,10 @@
  * its the lock acquire, for RCU/lockless data structures its the dependent
  * load.
  *
- * Do note that inc_not_zero() provides a control dependency which will order
- * future stores against the inc, this ensures we'll never modify the object
- * if we did not in fact acquire a reference.
+ * Do note that inc_not_zero() does provide acquire order, which will order
+ * future load and stores against the inc, this ensures all subsequent accesses
+ * are from this object and not anything previously occupying this memory --
+ * consider SLAB_TYPESAFE_BY_RCU.
  *
  * The decrements will provide release order, such that all the prior loads and
  * stores will be issued before, it also provides a control dependency, which
@@ -144,7 +145,7 @@ bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
 	do {
 		if (!old)
 			break;
-	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
+	} while (!atomic_try_cmpxchg_acquire(&r->refs, &old, old + i));
 
 	if (oldp)
 		*oldp = old;
@@ -225,9 +226,9 @@ static inline __must_check bool __refcount_inc_not_zero(refcount_t *r, int *oldp
  * Similar to atomic_inc_not_zero(), but will saturate at REFCOUNT_SATURATED
  * and WARN.
  *
- * Provides no memory ordering, it is assumed the caller has guaranteed the
- * object memory to be stable (RCU, etc.). It does provide a control dependency
- * and thereby orders future stores. See the comment on top.
+ * Provides acquire ordering, such that subsequent accesses are after the
+ * increment. This is important for the cases where secondary validation is
+ * required, eg. SLAB_TYPESAFE_BY_RCU.
  *
  * Return: true if the increment was successful, false otherwise
  */
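
For the record, the kind of lookup this is meant to support looks
roughly like the sketch below. This is only an illustration; struct obj,
obj_find_rcu() and obj_put() are made-up names, not anything from this
patch or from refcount.h:

	struct obj {
		refcount_t	ref;
		unsigned long	key;
	};

	/* Speculative lookup against a SLAB_TYPESAFE_BY_RCU cache. */
	struct obj *obj_lookup(unsigned long key)
	{
		struct obj *o;

		rcu_read_lock();
		o = obj_find_rcu(key);		/* hypothetical lockless index lookup */
		if (o && refcount_inc_not_zero(&o->ref)) {
			/*
			 * The slab can free and reuse this memory for a new
			 * object of the same type at any time before the
			 * increment, so a successful inc_not_zero() only pins
			 * whatever currently occupies it. The acquire order
			 * added by this patch ensures the load of o->key below
			 * happens after the increment, so we validate this
			 * object and not something that previously occupied
			 * the memory.
			 */
			if (o->key != key) {
				obj_put(o);	/* hypothetical: drop ref, free at zero */
				o = NULL;
			}
		} else {
			o = NULL;
		}
		rcu_read_unlock();

		return o;
	}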