On Fri, Jul 17, 2020 at 09:25:55PM -0400, Alan Stern wrote:
> On Fri, Jul 17, 2020 at 05:58:57PM -0700, Eric Biggers wrote:
> > On Fri, Jul 17, 2020 at 01:53:40PM -0700, Darrick J. Wong wrote:
> > > > +There are also cases in which the smp_load_acquire() can be replaced by
> > > > +the more lightweight READ_ONCE().  (smp_store_release() is still
> > > > +required.)  Specifically, if all initialized memory is transitively
> > > > +reachable from the pointer itself, then there is no control dependency
> > >
> > > I don't quite understand what "transitively reachable from the pointer
> > > itself" means?  Does that describe the situation where all the objects
> > > reachable through the object that the global struct foo pointer points
> > > at are /only/ reachable via that global pointer?
> > >
> >
> > The intent is that "transitively reachable" means that all initialized memory
> > can be reached by dereferencing the pointer in some way, e.g. p->a->b[5]->c.
> >
> > It could also be the case that allocating the object initializes some global or
> > static data, which isn't reachable in that way.  Access to that data would then
> > be a control dependency, which a data dependency barrier wouldn't work for.
> >
> > It's possible I misunderstood something.  (Note the next paragraph does say that
> > using READ_ONCE() is discouraged, exactly for this reason -- it can be hard to
> > tell whether it's correct.)  Suggestions of what to write here are appreciated.
>
> Perhaps something like this:
>
> 	Specifically, if the only way to reach the initialized memory
> 	involves dereferencing the pointer itself then READ_ONCE() is
> 	sufficient.  This is because there will be an address dependency
> 	between reading the pointer and accessing the memory, which will
> 	ensure proper ordering.  But if some of the initialized memory
> 	is reachable some other way (for example, if it is global or
> 	static data) then there need not be an address dependency,
> 	merely a control dependency (checking whether the pointer is
> 	non-NULL).  Control dependencies do not always ensure ordering
> 	-- certainly not for reads, and depending on the compiler,
> 	possibly not for some writes -- and therefore a load-acquire is
> 	necessary.
>
> Perhaps this is more wordy than you want, but it does get the important
> ideas across.

I don't think we should worry about wordsmithing this.  We should just say
"Use the init_pointer_once API" and then people who want to worry about
optimising the implementation of that API never have to talk to the people
who want to use that API.
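
For anyone following along, here is a rough userspace sketch of the two
cases in Alan's text.  It uses C11 atomics as a stand-in for the kernel
primitives, which is an assumption of this sketch and not something from
the patch: memory_order_release plays the role of smp_store_release(),
memory_order_acquire that of smp_load_acquire(), and memory_order_consume
approximates READ_ONCE() plus an address dependency (current compilers
simply promote consume to acquire, so this shows the structure rather than
any performance difference).  struct foo, foo_global, publish_foo() and the
two reader functions are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

struct foo {
	int a;
	int *b;		/* reachable only by dereferencing the pointer */
};

/* Initialized together with the object, but NOT reachable through it. */
static int foo_global;

/* NULL until the object has been published. */
static _Atomic(struct foo *) foo_ptr;

/*
 * Writer.  Assumption of this sketch: called at most once, by one thread.
 * Returns 0 on success, -1 on allocation failure.
 */
int publish_foo(void)
{
	struct foo *p;

	assert(atomic_load_explicit(&foo_ptr, memory_order_relaxed) == NULL);

	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->b = malloc(sizeof(*p->b));
	if (!p->b) {
		free(p);
		return -1;
	}
	p->a = 1;
	*p->b = 2;
	foo_global = 3;

	/* Release store: required in both cases discussed in the thread. */
	atomic_store_explicit(&foo_ptr, p, memory_order_release);
	return 0;
}

/*
 * Reader 1: everything it touches is reached through p, so each access
 * carries an address dependency on the pointer load.  This is the case
 * where the dependency-ordered (READ_ONCE()-style) load is sufficient.
 */
int sum_via_pointer(void)
{
	struct foo *p = atomic_load_explicit(&foo_ptr, memory_order_consume);

	return p ? p->a + *p->b : -1;
}

/*
 * Reader 2: also reads foo_global, which is ordered after the pointer
 * load only by the "p != NULL" check, i.e. a control dependency.  That
 * does not order reads, so a load-acquire is needed here.
 */
int sum_with_global(void)
{
	struct foo *p = atomic_load_explicit(&foo_ptr, memory_order_acquire);

	return p ? p->a + *p->b + foo_global : -1;
}
```

Deciding which of the two readers a given call site resembles is the
judgment call this thread is about, which is the argument for keeping that
decision inside one helper.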