The objects are allocated with SLAB_TYPESAFE_BY_RCU, and hlist_add_head_rcu()
performs the plain store n->next = first before its rcu_assign_pointer(),
which modifies obj->obj_node.next. A reader in lockless_lookup() may already
hold a reference to obj, and because of SLAB_TYPESAFE_BY_RCU it can observe
the updater's store to ->next immediately. Two memory orderings are therefore
required in the insertion algorithm: obj->key must be updated before both
obj->obj_node.next and obj->refcnt, and atomic_set_release() is not enough to
provide the required memory barrier.

Signed-off-by: Alan Huang <mmpgouride@xxxxxxxxx>
---
Changelog:
v1 -> v2: Use _ONCE to protect obj->key.

 Documentation/RCU/rculist_nulls.rst | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst
index 21e40fcc08de..2a9f5a63d334 100644
--- a/Documentation/RCU/rculist_nulls.rst
+++ b/Documentation/RCU/rculist_nulls.rst
@@ -47,7 +47,7 @@ objects, which is having below type.
      * reuse these object before the RCU grace period, we
      * must check key after getting the reference on object
      */
-    if (obj->key != key) { // not the object we expected
+    if (READ_ONCE(obj->key) != key) { // not the object we expected
       put_ref(obj);
       rcu_read_unlock();
       goto begin;
@@ -64,10 +64,10 @@ but a version with an additional memory barrier (smp_rmb())
   {
   struct hlist_node *node, *next;
   for (pos = rcu_dereference((head)->first);
-       pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
+       pos && ({ next = READ_ONCE(pos->next); smp_rmb(); prefetch(next); 1; }) &&
        ({ obj = hlist_entry(pos, typeof(*obj), obj_node); 1; });
        pos = rcu_dereference(next))
-      if (obj->key == key)
+      if (READ_ONCE(obj->key) == key)
        return obj;
   return NULL;
   }
@@ -111,8 +111,13 @@ detect the fact that it missed following items in original chain.
   */
   obj = kmem_cache_alloc(...);
   lock_chain(); // typically a spin_lock()
-  obj->key = key;
-  atomic_set_release(&obj->refcnt, 1); // key before refcnt
+  WRITE_ONCE(obj->key, key);
+  /*
+   * We need to make sure obj->key is updated before obj->obj_node.next
+   * and obj->refcnt.
+   */
+  smp_wmb();
+  atomic_set(&obj->refcnt, 1);
   hlist_add_head_rcu(&obj->obj_node, list);
   unlock_chain(); // typically a spin_unlock()
 
@@ -165,12 +170,12 @@ Note that using hlist_nulls means the type of 'obj_node' field of
 begin:
   rcu_read_lock();
   hlist_nulls_for_each_entry_rcu(obj, node, head, obj_node) {
-    if (obj->key == key) {
+    if (READ_ONCE(obj->key) == key) {
       if (!try_get_ref(obj)) { // might fail for free objects
         rcu_read_unlock();
         goto begin;
       }
-      if (obj->key != key) { // not the object we expected
+      if (READ_ONCE(obj->key) != key) { // not the object we expected
         put_ref(obj);
         rcu_read_unlock();
         goto begin;
@@ -206,7 +211,7 @@ hlist_add_head_rcu().
   */
   obj = kmem_cache_alloc(cachep);
   lock_chain(); // typically a spin_lock()
-  obj->key = key;
+  WRITE_ONCE(obj->key, key);
   atomic_set_release(&obj->refcnt, 1); // key before refcnt
   /*
    * insert obj in RCU way (readers might be traversing chain)
--
2.34.1
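
As an illustrative sketch (using the documentation's placeholder pseudocode,
e.g. obj, key, lock_chain() and list, rather than a buildable kernel example),
the ordering problem and the fix can be summarized as follows:

  /*
   * Why atomic_set_release() alone is insufficient:
   *
   *   obj->key = key;
   *   atomic_set_release(&obj->refcnt, 1);      // orders key before refcnt only
   *   hlist_add_head_rcu(&obj->obj_node, list);
   *       // does n->next = first (a plain store) before rcu_assign_pointer();
   *       // the release store does not order obj->key before this later
   *       // ->next store, so a reader that already holds a reference to the
   *       // recycled SLAB_TYPESAFE_BY_RCU object can observe the new ->next
   *       // while still seeing a stale key.
   *
   * Hence the key is published with an explicit write barrier instead:
   */
  WRITE_ONCE(obj->key, key);
  smp_wmb();                   /* order key before ->refcnt and ->obj_node.next */
  atomic_set(&obj->refcnt, 1);
  hlist_add_head_rcu(&obj->obj_node, list);

The smp_wmb() pairs with the smp_rmb() in lockless_lookup(), and in the
hlist_nulls variant the key is re-read with READ_ONCE() after try_get_ref()
to catch objects that were freed and reused concurrently.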