On Fri, Jul 17, 2020 at 06:02:47PM -0700, Eric Biggers wrote:
> On Fri, Jul 17, 2020 at 01:51:38PM -0400, Alan Stern wrote:
> > On Fri, Jul 17, 2020 at 06:47:50PM +0100, Matthew Wilcox wrote:
> > > On Thu, Jul 16, 2020 at 09:44:27PM -0700, Eric Biggers wrote:
> > ...
> > > > +		/* on success, pairs with smp_load_acquire() above and below */
> > > > +		if (cmpxchg_release(&foo, NULL, p) != NULL) {
> > >
> > > Why do we have cmpxchg_release() anyway?  Under what circumstances is
> > > cmpxchg() useful _without_ having release semantics?
> >
> > To answer just the last question: cmpxchg() is useful for lock
> > acquisition, in which case it needs to have acquire semantics rather
> > than release semantics.
> >
>
> To clarify, there are 4 versions of cmpxchg:
>
> 	cmpxchg(): does ACQUIRE and RELEASE (on success)
> 	cmpxchg_acquire(): does ACQUIRE only (on success)
> 	cmpxchg_release(): does RELEASE only (on success)
> 	cmpxchg_relaxed(): no barriers
>
> The problem here is that here we need RELEASE on success and ACQUIRE on failure.
> But no version guarantees any barrier on failure.

Why not?  Do CPU designers not do load-linked-with-acquire-semantics?
Or is it our fault for not using the appropriate instruction?

> So as far as I can tell, the best we can do is use cmpxchg_release() (or
> cmpxchg() which would be stronger but unnecessary), followed by a separate
> ACQUIRE on failure.

OK, but that detail needs to be hidden behind a higher level primitive,
not exposed to device driver writers.
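
To make the "higher level primitive" idea concrete, here is a minimal
sketch of what such a helper might look like. It is a userspace C11
analog using <stdatomic.h>, not kernel code and not a proposed kernel
API. The names publish_once(), struct foo_obj and the global foo are
hypothetical and exist only for illustration. It asks for acq_rel
ordering on success (stronger than the release-only ordering discussed
above) so that the failure ordering is never stronger than the success
ordering, which older C11 text and some compilers expect.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical lazily-initialized object. */
struct foo_obj {
	int value;
};

/* Hypothetical shared pointer, initially NULL (plays the role of 'foo'). */
static _Atomic(struct foo_obj *) foo;

/*
 * Hypothetical helper: try to install 'candidate' in '*slot'.
 *
 * Returns the pointer that ends up published: 'candidate' if this caller
 * won the race, otherwise the object another caller installed earlier.
 * The caller never has to think about which barrier goes where.
 */
static struct foo_obj *publish_once(_Atomic(struct foo_obj *) *slot,
				    struct foo_obj *candidate)
{
	struct foo_obj *expected = NULL;

	assert(candidate != NULL);

	/*
	 * Success: acq_rel includes the RELEASE that publishes *candidate.
	 * Failure: ACQUIRE, so the winner's initialization is visible
	 * before we use the pointer that was loaded into 'expected'.
	 */
	if (atomic_compare_exchange_strong_explicit(slot, &expected, candidate,
						    memory_order_acq_rel,
						    memory_order_acquire))
		return candidate;

	return expected;
}
```

A caller would allocate and initialize a candidate, call
publish_once(&foo, candidate), and free the candidate if a different
pointer comes back. The ordering detail stays inside the helper, which
is the property being asked for above.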