On Mon, Apr 08, 2019 at 09:05:34AM -0400, Mathieu Desnoyers wrote:
> ----- On Apr 7, 2019, at 10:27 PM, paulmck paulmck@xxxxxxxxxxxxx wrote:
>
> > On Sun, Apr 07, 2019 at 09:07:18PM +0000, Joel Fernandes wrote:
> >> On Sun, Apr 07, 2019 at 04:41:36PM -0400, Mathieu Desnoyers wrote:
> >> >
> >> > ----- On Apr 7, 2019, at 3:32 PM, Joel Fernandes, Google joel@xxxxxxxxxxxxxxxxx
> >> > wrote:
> >> >
> >> > > On Sun, Apr 07, 2019 at 03:26:16PM -0400, Mathieu Desnoyers wrote:
> >> > >> ----- On Apr 7, 2019, at 9:59 AM, paulmck paulmck@xxxxxxxxxxxxx wrote:
> >> > >>
> >> > >> > On Sun, Apr 07, 2019 at 06:39:41AM -0700, Paul E. McKenney wrote:
> >> > >> >> On Sat, Apr 06, 2019 at 07:06:13PM -0400, Joel Fernandes wrote:
> >> > >> >
> >> > >> > [ . . . ]
> >> > >> >
> >> > >> >> > > diff --git a/include/asm-generic/vmlinux.lds.h
> >> > >> >> > > b/include/asm-generic/vmlinux.lds.h
> >> > >> >> > > index f8f6f04c4453..c2d919a1566e 100644
> >> > >> >> > > --- a/include/asm-generic/vmlinux.lds.h
> >> > >> >> > > +++ b/include/asm-generic/vmlinux.lds.h
> >> > >> >> > > @@ -338,6 +338,10 @@
> >> > >> >> > > KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
> >> > >> >> > > __stop___tracepoints_ptrs = .; \
> >> > >> >> > > *(__tracepoints_strings)/* Tracepoints: strings */ \
> >> > >> >> > > + . = ALIGN(8); \
> >> > >> >> > > + __start___srcu_struct = .; \
> >> > >> >> > > + *(___srcu_struct_ptrs) \
> >> > >> >> > > + __end___srcu_struct = .; \
> >> > >> >> > > } \
> >> > >> >> >
> >> > >> >> > This vmlinux linker modification is not needed. I tested without it and srcu
> >> > >> >> > torture works fine with rcutorture built as a module. Putting further prints
> >> > >> >> > in kernel/module.c verified that the kernel is able to find the srcu structs
> >> > >> >> > just fine. You could squash the below patch into this one or apply it on top
> >> > >> >> > of the dev branch.
> >> > >> >>
> >> > >> >> Good point, given that otherwise FORTRAN named common blocks would not
> >> > >> >> work.
> >> > >> >>
> >> > >> >> But isn't one advantage of leaving that stuff in the RO_DATA_SECTION()
> >> > >> >> macro that it can be mapped read-only? Or am I suffering from excessive
> >> > >> >> optimism?
> >> > >> >
> >> > >> > And to answer the other question, in the case where I am suffering from
> >> > >> > excessive optimism, it should be a separate commit. Please see below
> >> > >> > for the updated original commit thus far.
> >> > >> >
> >> > >> > And may I have your Tested-by?
> >> > >>
> >> > >> Just to confirm: does the cleanup performed in the modules going
> >> > >> notifier end up acting as a barrier first before freeing the memory ?
> >> > >> If not, is it explicitly stated that a barrier must be issued before
> >> > >> module unload ?
> >> > >>
> >> > >
> >> > > You mean rcu_barrier? It is mentioned in the documentation that this is the
> >> > > responsibility of the module writer to prevent delays for all modules.
> >> >
> >> > It's a srcu barrier yes. Considering it would be a barrier specific to the
> >> > srcu domain within that module, I don't see how it would cause delays for
> >> > "all" modules if we implicitly issue the barrier on module unload. What
> >> > am I missing ?
> >>
> >> Yes you are right. I thought of this after I just sent my email. I think it
> >> makes sense for srcu case to do and could avoid a class of bugs.
> >
> > If there are call_srcu() callbacks outstanding, the module writer still
> > needs the srcu_barrier() because otherwise callbacks arrive after
> > the module text has gone, which will disappoint the CPU when it
> > tries fetching instructions that are no longer mapped. If there are
> > no call_srcu() callbacks from that module, then there is no need for
> > srcu_barrier() either way.
> >
> > So if an srcu_barrier() is needed, the module developer needs to
> > supply it.
>
> When you say "callbacks arrive after the module text has gone",
> I think you assume that free_module() is invoked before the
> MODULE_STATE_GOING notifiers are called. But it's done in the
> opposite order: going notifiers are called first, and then
> free_module() is invoked.
>
> So AFAIU it would be safe to issue the srcu_barrier() from the module
> going notifier.
>
> Or am I missing something ?

We do seem to be talking past each other. ;-)

This has nothing to do with the order of events at module-unload time.

So please let me try again.

If a given srcu_struct in a module never has call_srcu() invoked, there
is no need to invoke srcu_barrier() at any time, whether at module-unload
time or not. Adding srcu_barrier() in this case adds overhead and latency
for no good reason.

If a given srcu_struct in a module does have at least one call_srcu()
invoked, it is already that module's responsibility to make sure that
the code sticks around long enough for the callback to be invoked.

This means that correct SRCU users that invoke call_srcu() already
have srcu_barrier() at module-unload time. Incorrect SRCU users, with
reasonable probability, now get a WARN_ON() at module-unload time, with
the per-CPU state getting leaked. Before this change, they would (also
with reasonable probability) instead get an instruction-fetch fault when
the SRCU callback was invoked after the completion of the module unload.
Furthermore, in all cases where they would previously have gotten the
instruction-fetch fault, they now get the WARN_ON(), like this:

	if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
		return; /* Forgot srcu_barrier(), so just leak it! */

So this change already represents an improvement in usability.

							Thanx, Paul
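
The three cases described above (no call_srcu() at all, call_srcu() followed by srcu_barrier(), and call_srcu() with the barrier forgotten) can be sketched as a small userspace model. This is only an illustration, not kernel code: struct toy_srcu and every toy_*() function are hypothetical names invented for this sketch, the "barrier" simply drains a counter, and a malloc() buffer stands in for the per-CPU state.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a module's srcu_struct (not the kernel type). */
struct toy_srcu {
	int n_cbs;      /* callbacks queued but not yet invoked */
	int n_invoked;  /* callbacks that have been invoked */
	void *percpu;   /* stand-in for per-CPU state; NULL once freed */
};

/* Allocate the stand-in state; returns 0 on success, -1 on failure. */
static int toy_init(struct toy_srcu *sp)
{
	sp->n_cbs = 0;
	sp->n_invoked = 0;
	sp->percpu = malloc(64);
	return sp->percpu ? 0 : -1;
}

/* Models call_srcu(): a callback is queued for later invocation. */
static void toy_call_srcu(struct toy_srcu *sp)
{
	sp->n_cbs++;
}

/* Models srcu_barrier(): returns only after all queued callbacks have run. */
static void toy_srcu_barrier(struct toy_srcu *sp)
{
	sp->n_invoked += sp->n_cbs;
	sp->n_cbs = 0;
}

/*
 * Models the unload-time check quoted above: if callbacks are still
 * pending, warn and leak the state rather than free memory that a
 * late callback might touch. Returns true if the state was freed.
 */
static bool toy_cleanup(struct toy_srcu *sp)
{
	if (sp->n_cbs) {
		fprintf(stderr, "WARN: %d callback(s) pending, leaking state\n",
			sp->n_cbs);
		return false; /* Forgot the barrier, so just leak it. */
	}
	free(sp->percpu);
	sp->percpu = NULL;
	return true;
}
```

In this model, a user that never calls toy_call_srcu() can call toy_cleanup() directly, a user that queues callbacks must call toy_srcu_barrier() first, and a user that forgets the barrier gets the warning and a leak instead of a use-after-free.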