On Fri, Jan 17, 2025 at 03:41:36PM +0000, Will Deacon wrote:
> On Wed, Jan 15, 2025 at 05:00:11PM +0100, Peter Zijlstra wrote:
> > On Wed, Jan 15, 2025 at 12:13:34PM +0100, Peter Zijlstra wrote:
> >
> > > Notably, it means refcount_t is entirely unsuitable for anything
> > > SLAB_TYPESAFE_BY_RCU, since they all will need secondary validation
> > > conditions after the refcount succeeds.
> > >
> > > And this is probably fine, but let me ponder this all a little more.
> >
> > Even though SLAB_TYPESAFE_BY_RCU is relatively rare, I think we'd better
> > fix this, these things are hard enough as they are.
> >
> > Will, others, wdyt?
>
> We should also update the Documentation (atomic_t.txt and
> refcount-vs-atomic.rst) if we strengthen this.
>
> > ---
> > Subject: refcount: Strengthen inc_not_zero()
> >
> > For speculative lookups, where a successful inc_not_zero() pins the
> > object but we still need to double check that the object acquired is
> > indeed the one we set out to acquire, this validation needs to happen
> > *after* the increment.
> >
> > Notably, SLAB_TYPESAFE_BY_RCU is one such example.
> >
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> > ---
> >  include/linux/refcount.h | 15 ++++++++-------
> >  1 file changed, 8 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/linux/refcount.h b/include/linux/refcount.h
> > index 35f039ecb272..340e7ffa445e 100644
> > --- a/include/linux/refcount.h
> > +++ b/include/linux/refcount.h
> > @@ -69,9 +69,10 @@
> >   * its the lock acquire, for RCU/lockless data structures its the dependent
> >   * load.
> >   *
> > - * Do note that inc_not_zero() provides a control dependency which will order
> > - * future stores against the inc, this ensures we'll never modify the object
> > - * if we did not in fact acquire a reference.
> > + * Do note that inc_not_zero() does provide acquire order, which will order
> > + * future load and stores against the inc, this ensures all subsequent accesses
> > + * are from this object and not anything previously occupying this memory --
> > + * consider SLAB_TYPESAFE_BY_RCU.
> >   *
> >   * The decrements will provide release order, such that all the prior loads and
> >   * stores will be issued before, it also provides a control dependency, which
> > @@ -144,7 +145,7 @@ bool __refcount_add_not_zero(int i, refcount_t *r, int *oldp)
> >  	do {
> >  		if (!old)
> >  			break;
> > -	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &old, old + i));
> > +	} while (!atomic_try_cmpxchg_acquire(&r->refs, &old, old + i));
>
> Hmm, do the later memory accesses need to be ordered against the store
> part of the increment or just the read? If it's the former, then I don't
> think that _acquire is sufficient -- accesses can still get in-between
> the read and write parts of the RmW.

I dug some more into this at the end of last week. For the
SLAB_TYPESAFE_BY_RCU case, where we're racing inc_not_zero() with
dec_and_test(), I think using _acquire() above is correct, as the later
references can only move up into the critical section if we successfully
obtained a reference.

However, if we're going to make the barriers implicit in the refcount
operations here, then I think we also need to do something on the
producer side for when the object is re-initialised after being
destroyed and allocated again. I think that would necessitate release
ordering for refcount_set() so that whatever allows the consumer to
validate the object (e.g. a sequence number) is published *before* the
refcount.

Will
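
For reference, here is a minimal sketch of the pattern being discussed.
The struct obj type, obj_cache, the obj_lookup() helper and the ->key
field are hypothetical and only for illustration; the producer side uses
atomic_set_release() on the underlying counter purely as a stand-in for
the release-ordered refcount_set() suggested above, which doesn't exist
today.

#include <linux/refcount.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct obj {
	refcount_t		ref;
	unsigned long		key;	/* what the consumer re-validates */
	struct hlist_node	node;
};

static struct kmem_cache *obj_cache;	/* created with SLAB_TYPESAFE_BY_RCU */

struct obj *obj_lookup(unsigned long key);	/* hypothetical lookup, defined elsewhere */

static void obj_put(struct obj *o)
{
	if (refcount_dec_and_test(&o->ref))
		kmem_cache_free(obj_cache, o);
}

/* Consumer: speculative lookup under RCU. */
static struct obj *obj_get(unsigned long key)
{
	struct obj *o;

	rcu_read_lock();
	o = obj_lookup(key);
	if (o && refcount_inc_not_zero(&o->ref)) {
		/*
		 * With acquire ordering on the successful increment, this
		 * re-check sees the fields of the object we actually pinned,
		 * not stale values from a previous occupant of the memory.
		 */
		if (o->key != key) {
			obj_put(o);	/* raced with free + reuse */
			o = NULL;
		}
	} else {
		o = NULL;
	}
	rcu_read_unlock();
	return o;
}

/* Producer: (re)initialisation after allocation from the typesafe cache. */
static struct obj *obj_alloc(unsigned long key)
{
	struct obj *o = kmem_cache_alloc(obj_cache, GFP_KERNEL);

	if (!o)
		return NULL;

	o->key = key;
	/*
	 * The release ensures ->key is published before a concurrent
	 * inc_not_zero() can observe a non-zero count; a plain
	 * refcount_set() would not order these stores.
	 */
	atomic_set_release(&o->ref.refs, 1);
	return o;
}

Without acquire ordering on the increment, the load of ->key in the
re-check could be satisfied early and wrongly validate an object whose
memory has since been freed and reused; without the release on the
producer side, a concurrent inc_not_zero() could succeed before ->key
is visible.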