On Thu, Jul 06, 2023 at 10:28:49AM +0000, Alan Huang wrote: > The objects are allocated with SLAB_TYPESAFE_BY_RCU, and there is > n->next = first within hlist_add_head_rcu() before rcu_assign_pointer(), > which modifies obj->obj_node.next. There may be readers holding the > reference of obj in lockless_lookup, and when updater modifies ->next, > readers can see the change immediately because of SLAB_TYPESAFE_BY_RCU. > > There are two memory ordering required in the insertion algorithm, > we need to make sure obj->key is updated before obj->obj_node.next > and obj->refcnt, atomic_set_release is not enough to provide the > required memory barrier. > > Signed-off-by: Alan Huang <mmpgouride@xxxxxxxxx> > --- > Documentation/RCU/rculist_nulls.rst | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst > index 21e40fcc08de..fa3729dc7e74 100644 > --- a/Documentation/RCU/rculist_nulls.rst > +++ b/Documentation/RCU/rculist_nulls.rst > @@ -64,7 +64,7 @@ but a version with an additional memory barrier (smp_rmb()) > { > struct hlist_node *node, *next; > for (pos = rcu_dereference((head)->first); > - pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) && > + pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) && This one looks good, though the READ_ONCE() becoming rcu_dereference() would allow the smp_rmb() to be dropped, correct? > ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; }); > pos = rcu_dereference(next)) > if (obj->key == key) > @@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain. > obj = kmem_cache_alloc(...); > lock_chain(); // typically a spin_lock() > obj->key = key; > - atomic_set_release(&obj->refcnt, 1); // key before refcnt > + /* > + * We need to make sure obj->key is updated before obj->obj_node.next > + * and obj->refcnt. > + */ > + smp_wmb(); > + atomic_set(&obj->refcnt, 1); ...but what is smp_wmb() doing that the combination of atomic_set_release() and hlist_add_head_rcu() was not already doing? What am I missing? Thanx, Paul > hlist_add_head_rcu(&obj->obj_node, list); > unlock_chain(); // typically a spin_unlock() > > -- > 2.34.1 >