> 2023年6月18日 02:07,Matthew Wilcox <willy@xxxxxxxxxxxxx> 写道: > > On Sun, Jun 18, 2023 at 01:55:10AM +0800, Alan Huang wrote: >> >>> 2023年6月18日 01:23,Matthew Wilcox <willy@xxxxxxxxxxxxx> 写道: >>> >>> On Sat, Jun 17, 2023 at 02:53:46PM +0000, Alan Huang wrote: >>>> There are two memory ordering required in the insertion algorithm, >>>> we need to make sure obj->key is updated before obj->obj_node.next >>>> and obj->refcnt, atomic_set_release is not enough to provide the >>>> required memory barrier. >>>> >>>> Signed-off-by: Alan Huang <mmpgouride@xxxxxxxxx> >>>> --- >>>> Documentation/RCU/rculist_nulls.rst | 9 +++++++-- >>>> 1 file changed, 7 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/Documentation/RCU/rculist_nulls.rst b/Documentation/RCU/rculist_nulls.rst >>>> index e06ed40bb6..77244adbdf 100644 >>>> --- a/Documentation/RCU/rculist_nulls.rst >>>> +++ b/Documentation/RCU/rculist_nulls.rst >>>> @@ -98,7 +98,7 @@ Quoting Corey Minyard:: >>>> ---------------------- >>>> >>>> We need to make sure a reader cannot read the new 'obj->obj_node.next' value >>>> -and previous value of 'obj->key'. Otherwise, an item could be deleted >>>> +and previous value of 'obj->key' at the same time. Otherwise, an item could be deleted >>> >>> "at the same time" doesn't make a lot of sense to me. CPUs don't do >>> anything "at the same time". I think the way this is worded now is >>> fine; I tried coming up with a few variants of this, but none are as >>> clear and succinct as what is there now. >>> >>>> from a chain, and inserted into another chain. If new chain was empty >>>> before the move, 'next' pointer is NULL, and lockless reader can not >>>> detect the fact that it missed following items in original chain. >>>> @@ -112,7 +112,12 @@ detect the fact that it missed following items in original chain. >>>> obj = kmem_cache_alloc(...); >>>> lock_chain(); // typically a spin_lock() >>>> obj->key = key; >>>> - atomic_set_release(&obj->refcnt, 1); // key before refcnt >>>> + /* >>>> + * we need to make sure obj->key is updated before obj->obj_node.next >>>> + * and obj->refcnt >>>> + */ >>>> + smp_wmb(); >>>> + atomic_set(&obj->refcnt, 1); >>> >>> Perhaps this could be a little clearer for those of us who aren't as >>> deep into the memory model as others ... are you saying that the >>> atomic_set_release() would only order references to obj->refcount and >>> would not order accesses to obj->key? Because reading this: >>> >>> The use of ACQUIRE and RELEASE operations generally precludes the need >>> for other sorts of memory barrier. In addition, a RELEASE+ACQUIRE pair is >>> -not- guaranteed to act as a full memory barrier. However, after an >>> ACQUIRE on a given variable, all memory accesses preceding any prior >>> RELEASE on that same variable are guaranteed to be visible. In other >>> words, within a given variable's critical section, all accesses of all >>> previous critical sections for that variable are guaranteed to have >>> completed. >>> >>> makes me think that this example is fine; anybody doing a load-acquire >>> on obj->refcount will see the update to obj->key that happened before >>> the store-release to obj->refcount. >> >> Two memory ordering required here, atomic_set_release only provides one of them (the one you mentioned) >> >> The objects are allocated with SLAB_TYPESAFE_BY_RCU, and there is: >> >> n->next = first; >> >> in hlist_add_head_rcu, which modifies obj->obj_node.next. > > But isn't that the bug? ie this assignment should be WRITE_ONCE() or > smp_store_release() or rcu_assign_pointer() or whichever of these > similar functions is actually appropriate? Without SLAB_TYPESAFE_BY_RCU, it’s fine, because it only modifies local data, which has not been published. But with SLAB_TYPESAFE_BY_RCU, it might be a bug, but I’m not very sure. > >> So, we must make sure obj->key is updated before obj->obj_node.next, without smp_wmb(), we can read >> the new 'obj->obj_node.next’ value and previous value of 'obj->key’ at the same time, and in this case, we >> can not detect the movement of the object. >> >> The following link might be helpful: >> >> https://patchwork.ozlabs.org/project/netdev/patch/491C282A.5050802@xxxxxxxxxxxxx/ >> >> >>> >>> I am not an expert and would welcome the opportunity to learn more here.