On 8/30/22 12:57, Jason Gunthorpe wrote: > I don't like the use of smb_mb very much, I deliberately choose the > more modern language of release/acquire because it makes it a lot > clearer what barriers are doing.. > > So, if we dig into it, using what I said above, the atomic refcount is: > > gup_pte_range() > try_grab_folio() > try_get_folio() > folio_ref_try_add_rcu() > folio_ref_add_unless() > page_ref_add_unless() > atomic_add_unless() > > So that wants to be an acquire > > The pairing release is in the page table code that does the put_page, > it wants to be an atomic_dec_return() as a release. Thanks for making that a lot clearer, at least for me anyway! > > Now, we go and look at Documentation/atomic_t.txt to try to understand > what are the ordering semantics of the atomics we are using and become > dazed-confused like me: > > ORDERING (go read memory-barriers.txt first) > -------- > > - RMW operations that have a return value are fully ordered; > > - RMW operations that are conditional are unordered on FAILURE, > otherwise the above rules apply. > > Fully ordered primitives are ordered against everything prior and everything > subsequent. Therefore a fully ordered primitive is like having an smp_mb() > before and an smp_mb() after the primitive. > > So, I take that to mean that both atomic_add_unless() and > atomic_dec_return() are "fully ordered" and "fully ordered" is a super > set of acquire/release. > > Thus, we already have the necessary barriers integrated into the > atomic being used. As long as we continue to sort-of-accidentally use atomic_add_unless(), which returns a value, instead of atomic_add(), which does not. :) Likewise on the put_page() side: we are depending on the somewhat accidental (from the perspective of memory barriers) use of atomics that return values. Maybe it would be good to add a little note at each site, to that effect? > > The smb_mb_after_atomic stuff is to be used with atomics that don't > return values, there are some examples in the doc > > Jason thanks, -- John Hubbard NVIDIA