On Fri, Jul 17, 2020 at 05:58:57PM -0700, Eric Biggers wrote:
> On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > > +There are also cases in which the smp_load_acquire() can be replaced by
> > > +the more lightweight READ_ONCE(). (smp_store_release() is still
> > > +required.) Specifically, if all initialized memory is transitively
> > > +reachable from the pointer itself, then there is no control dependency
> >
> > I don't quite understand what "transitively reachable from the pointer
> > itself" means? Does that describe the situation where all the objects
> > reachable through the object that the global struct foo pointer points
> > at are /only/ reachable via that global pointer?
> >
>
> The intent is that "transitively reachable" means that all initialized memory
> can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.
>
> It could also be the case that allocating the object initializes some global or
> static data, which isn't reachable in that way. Access to that data would then
> be a control dependency, which a data dependency barrier wouldn't work for.
>
> It's possible I misunderstood something. (Note the next paragraph does say that
> using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
> tell whether it's correct.) Suggestions of what to write here are appreciated.

Perhaps something like this:

	Specifically, if the only way to reach the initialized memory
	involves dereferencing the pointer itself then READ_ONCE() is
	sufficient. This is because there will be an address dependency
	between reading the pointer and accessing the memory, which will
	ensure proper ordering. But if some of the initialized memory
	is reachable some other way (for example, if it is global or
	static data) then there need not be an address dependency,
	merely a control dependency (checking whether the pointer is
	non-NULL). Control dependencies do not always ensure ordering
	-- certainly not for reads, and depending on the compiler,
	possibly not for some writes -- and therefore a load-acquire is
	necessary.

Perhaps this is more wordy than you want, but it does get the important
ideas across.

Alan Stern
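
For illustration only, here is a rough user-space sketch of the two cases
described above. It is not kernel code and makes several assumptions: C11
atomics stand in for the kernel primitives (memory_order_release for
smp_store_release(), memory_order_acquire for smp_load_acquire(), and
memory_order_consume as the nearest analogue of READ_ONCE() followed by a
dependent dereference), and struct foo, foo_ptr, foo_global_state and the
foo_*() functions are hypothetical names invented for the example. Note that
current compilers generally implement memory_order_consume as an acquire load.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object published through a global pointer. */
struct foo {
	int a;
};

static _Atomic(struct foo *) foo_ptr;	/* the published pointer */
static int foo_global_state;		/* initialized, but NOT reachable via foo_ptr */

/* Publisher: initialize everything first, then publish with a release store. */
int foo_init(void)
{
	struct foo *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->a = 42;		/* reachable only by dereferencing the pointer */
	foo_global_state = 7;	/* reachable some other way (static data) */
	atomic_store_explicit(&foo_ptr, p, memory_order_release);
	return 0;
}

/*
 * Case 1: the reader touches only memory reached through the pointer.
 * The load of p->a carries an address dependency on the load of foo_ptr,
 * so a dependency-ordered load (consume here, standing in for READ_ONCE())
 * is enough.
 */
int foo_read_field(void)
{
	struct foo *p = atomic_load_explicit(&foo_ptr, memory_order_consume);

	return p ? p->a : -1;
}

/*
 * Case 2: the reader touches static data after checking the pointer.
 * Only a control dependency (the non-NULL check) links the two loads,
 * which does not order reads, so a load-acquire is needed.
 */
int foo_read_global(void)
{
	struct foo *p = atomic_load_explicit(&foo_ptr, memory_order_acquire);

	return p ? foo_global_state : -1;
}
```

Run in a single thread, this only shows which ordering each reader asks for.
The distinction matters when foo_init() and the readers run concurrently on
different CPUs.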