On Tue, 2017-11-07 at 13:57 -0500, Waiman Long wrote: > On 11/07/2017 12:59 PM, Andreas Dilger wrote: > > > > On Nov 7, 2017, at 4:59 AM, Jan Kara <jack@xxxxxxx> wrote: > > > > > > On Mon 06-11-17 10:47:08, Davidlohr Bueso wrote: > > > > > > > > + /* > > > > + * Serialize dlist->used_lists such that a 0->1 > > > > transition is not > > > > + * missed by another thread checking if any of the > > > > dlock lists are > > > > + * used. > > > > + * > > > > + * CPU0 CPU1 > > > > + * > > > > dlock_list_add() dlock_lists_empty() > > > > + * [S] atomic_inc(used_lists); > > > > + * smp_mb__after_atomic(); > > > > + * smp_mb__be > > > > fore_atomic(); > > > > + * [L] > > > > atomic_read(used_lists) > > > > + * list_add() > > > > + */ > > > > + smp_mb__before_atomic(); > > > > + return !atomic_read(&dlist->used_lists); > > Just a general kernel programming question here - I thought the > > whole point of atomics is that they are, well, atomic across all > > CPUs so there is no need for a memory barrier? If there is a need > > for a memory barrier for each atomic access (assuming it isn't > > accessed under another lock, which would make the use of atomic > > types pointless, IMHO) then I'd think there is a lot of code in the > > kernel that isn't doing this properly. > > > > What am I missing here? > > Atomic update and memory barrier are 2 different things. Atomic > update means other CPUs see either the value before or after the > update. They won't see anything in between. For a counter, it means > we won't miss any counts. However, not all atomic operations give an > ordering guarantee. The atomic_read() and atomic_inc() are examples > that do not provide memory ordering guarantee. See > Documentation/memory-barriers.txt for more information about it. > > A CPU can perform atomic operations 1 & 2 in program order, but other > CPUs may see operation 2 first before operation 1. Here memory > barrier can be used to guarantee that other CPUs see the memory > updates in certain order. There's an omission here which I think Andreas may have been referring to: atomic_inc/dec operations *are* strongly ordered with respect to each other, so if two CPUs are simultaneously executing atomic_inc, the order in which they execute isn't guaranteed, but it is guaranteed that the losing atomic_inc will not begin until the winning one is completed, so after both are done the value will have +2. So although atomic_read and atomic_inc have no ordering guarantee at all (the point of the barrier above), if you're looking at the return values of atomic_inc/dec you don't need a barrier because regardless of which order the CPUs go in, they'll see distinct values (we use this for reference counting). James