On Wed, 08 Nov 2017, Boqun Feng wrote:
Or worse:

 *   CPU0                                CPU1
 * dlock_list_add()                      dlock_lists_empty()
 *                                         smp_mb__before_atomic();
 *                                         [L] atomic_read(used_lists)
 *   [S] atomic_inc(used_lists);
 *   smp_mb__after_atomic();

, in which case dlock_lists_empty() misses an increment of used_lists.
That said, this may be fine for the usage of dlock_list, but I think the
comments are confusing or wrong. The reason is that you cannot use barriers
to order two memory ops on different CPUs, and usually we add comments
like:
So I think that case is OK, as CPU1 legitimately reads a 0; the problem would
be if we miss the inc because we are doing spin_lock().
    [S] ...             [S] ...
    <barrier>           <barrier>
    [L] ...             [L] ...
That is true.
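As a generic example of that pairing (made-up variables, nothing to do with
dlock_list): each CPU stores to its own variable, issues a full barrier, then
loads the other variable; the two smp_mb()s pair with each other and forbid
the outcome where both CPUs miss the other's store.

#include <linux/compiler.h>
#include <asm/barrier.h>

static int x, y;
static int r0, r1;

static void cpu0(void)
{
	WRITE_ONCE(x, 1);	/* [S] */
	smp_mb();		/* pairs with smp_mb() in cpu1() */
	r0 = READ_ONCE(y);	/* [L] */
}

static void cpu1(void)
{
	WRITE_ONCE(y, 1);	/* [S] */
	smp_mb();		/* pairs with smp_mb() in cpu0() */
	r1 = READ_ONCE(x);	/* [L] */
}

/* r0 == 0 && r1 == 0 is forbidden after both run. */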
Davidlohr, could you try to improve the comments by finding both memory ops preceding and following the barriers? Maybe you will find those barriers are not necessary in the end.
Ok, so for dlock_list_add() the first barrier is there so that the atomic_inc()
is not done inside the critical region, i.e. after head->lock is taken. The
other barriers that follow I put in so that the respective atomic op is done
before the list_add(); however, thinking about it, I don't think they are
really needed.

Regarding dlock_lists_empty(), the smp_mb__before_atomic() is mainly there for
pairing reasons, but at this point I don't see a respective counterpart per
the above diagram, so I'm unclear.
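For reference, a rough sketch of where those barriers sit; the struct layout
and signatures below are guesses for illustration only, not copied from the
patch:

#include <linux/atomic.h>
#include <linux/list.h>
#include <linux/spinlock.h>

struct dlock_list_head {
	struct list_head list;
	spinlock_t lock;
};

struct dlock_list_heads {
	struct dlock_list_head *heads;
	atomic_t used_lists;
};

struct dlock_list_node {
	struct list_head list;
	struct dlock_list_head *head;
};

static bool dlock_lists_empty(struct dlock_list_heads *dlist)
{
	/* pairing under discussion; counterpart unclear */
	smp_mb__before_atomic();
	return !atomic_read(&dlist->used_lists);
}

static void dlock_list_add(struct dlock_list_node *node,
			   struct dlock_list_head *head,
			   struct dlock_list_heads *dlist)
{
	/* bump used_lists before taking head->lock ... */
	atomic_inc(&dlist->used_lists);
	/* ... and order the inc before the list_add() below */
	smp_mb__after_atomic();

	spin_lock(&head->lock);
	list_add(&node->list, &head->list);
	spin_unlock(&head->lock);
}

Thanks,
Davidlohr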