On Fri, Jul 26, 2019 at 09:51:35AM -0300, Mauro Carvalho Chehab wrote:

[snip]

> +| until the assignment to ``gp``, by which time both fields are fully   |
> +| initialized. So reordering the assignments to ``p->a`` and ``p->b``   |
> +| cannot possibly cause any problems.                                   |
> ++-----------------------------------------------------------------------+
> +
> +It is tempting to assume that the reader need not do anything special to
> +control its accesses to the RCU-protected data, as shown in
> +``do_something_gp_buggy()`` below:
> +
> +   ::
> +
> +       1 bool do_something_gp_buggy(void)
> +       2 {
> +       3   rcu_read_lock();
> +       4   p = gp;  /* OPTIMIZATIONS GALORE!!! */
> +       5   if (p) {
> +       6     do_something(p->a, p->b);
> +       7     rcu_read_unlock();
> +       8     return true;
> +       9   }
> +      10   rcu_read_unlock();
> +      11   return false;
> +      12 }
> +
> +However, this temptation must be resisted because there are a
> +surprisingly large number of ways that the compiler (to say nothing of
> +`DEC Alpha CPUs <https://h71000.www7.hp.com/wizard/wiz_2637.html>`__)
> +can trip this code up. For but one example, if the compiler were short
> +of registers, it might choose to refetch from ``gp`` rather than keeping
> +a separate copy in ``p`` as follows:
> +
> +   ::
> +
> +       1 bool do_something_gp_buggy_optimized(void)
> +       2 {
> +       3   rcu_read_lock();
> +       4   if (gp) { /* OPTIMIZATIONS GALORE!!! */
> +       5     do_something(gp->a, gp->b);
> +       6     rcu_read_unlock();
> +       7     return true;
> +       8   }
> +       9   rcu_read_unlock();
> +      10   return false;
> +      11 }
> +
> +If this function ran concurrently with a series of updates that replaced
> +the current structure with a new one, the fetches of ``gp->a`` and
> +``gp->b`` might well come from two different structures, which could
> +cause serious confusion. To prevent this (and much else besides),
> +``do_something_gp()`` uses ``rcu_dereference()`` to fetch from ``gp``:
> +
> +   ::
> +
> +       1 bool do_something_gp(void)
> +       2 {
> +       3   rcu_read_lock();
> +       4   p = rcu_dereference(gp);
> +       5   if (p) {
> +       6     do_something(p->a, p->b);
> +       7     rcu_read_unlock();
> +       8     return true;
> +       9   }
> +      10   rcu_read_unlock();
> +      11   return false;
> +      12 }
> +
> +The ``rcu_dereference()`` uses volatile casts and (for DEC Alpha) memory
> +barriers in the Linux kernel. Should a `high-quality implementation of
> +C11 ``memory_order_consume``
> +[PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__
> +ever appear, then ``rcu_dereference()`` could be implemented as a
> +``memory_order_consume`` load. Regardless of the exact implementation, a
> +pointer fetched by ``rcu_dereference()`` may not be used outside of the
> +outermost RCU read-side critical section containing that
> +``rcu_dereference()``, unless protection of the corresponding data
> +element has been passed from RCU to some other synchronization
> +mechanism, most commonly locking or `reference
> +counting <https://www.kernel.org/doc/Documentation/RCU/rcuref.txt>`__.

In the make htmldocs output, this paragraph renders very poorly for me.
I get something like this in the browser:

  The rcu_dereference() uses volatile casts and (for DEC Alpha) memory
  barriers in the Linux kernel. Should a high-quality implementation of
  C11 ``memory_order_consume`
  [PDF] <http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf>`__
  ever appear, then rcu_dereference() could be implemented as a
  memory_order_consume load.

Is there a syntax issue here?

One more piece of feedback: the image under "RCU read-side critical
section that started before the current grace period:" should probably
be blown up a bit.
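
For anyone following the quoted paragraph on volatile casts, here is a
minimal userspace sketch of the idea. It is not the kernel's
implementation: load_once(), struct foo, the do_something() stub and
do_something_gp_sketch() are hypothetical names used only for
illustration, __typeof__ assumes GCC or Clang, and the sketch omits both
the DEC Alpha barrier and the rcu_read_lock()/rcu_read_unlock() pair
that real readers need.

```c
#include <stdbool.h>
#include <stddef.h>

struct foo {
	int a;
	int b;
};

static struct foo *gp;	/* stand-in for the RCU-protected pointer */

/*
 * Hypothetical helper: read x through a volatile-qualified lvalue so
 * that the compiler must perform the load exactly once and may not
 * refetch it later.  Assumes the GCC/Clang __typeof__ extension.
 * No memory barrier is issued, so this is not sufficient on DEC Alpha.
 */
#define load_once(x) (*(volatile __typeof__(x) *)&(x))

static int last_a, last_b;	/* records what do_something() was given */

static void do_something(int a, int b)
{
	last_a = a;
	last_b = b;
}

static bool do_something_gp_sketch(void)
{
	struct foo *p = load_once(gp);	/* single fetch of gp */

	if (p) {
		/* Both fields are read through the same snapshot p. */
		do_something(p->a, p->b);
		return true;
	}
	return false;
}
```

Because both field accesses go through the local p, a concurrent
update of gp cannot make ->a and ->b come from two different
structures, which is the failure shown in
do_something_gp_buggy_optimized() above.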