On Mon, Jan 25, 2021 at 07:49:04AM -0800, Dave Hansen wrote: > Haitao managed to create another splat over the weekend. It was, indeed: > > > WARNING: CPU: 3 PID: 7620 at kernel/rcu/srcutree.c:374 cleanup_srcu_struct+0xed/0x100 > > which is: > > > if (WARN_ON(!srcu_get_delay(ssp))) > > return; /* Just leak it! */ > > That check means that there is an outstanding "expedited" grace period. > The fact that it's expedited is not important. This: > > https://lwn.net/Articles/202847/ > > describes the reasoning behind the warning: > > If the struct srcu_struct is dynamically allocated, then > cleanup_srcu_struct() must be called before it is freed ... the > caller must take care to ensure that all SRCU read-side critical > sections have completed (and that no more will commence) before > calling cleanup_srcu_struct(). > > synchronize_srcu() will (obviously) wait for the grace period to > complete. Calling it will shut up the warning for sure, most of the time. > > The required sequence of events is in here: > > > https://lore.kernel.org/lkml/1492472726-3841-4-git-send-email-paulmck@xxxxxxxxxxxxxxxxxx/ I'll add "Link:" for this to the final fix. It's a good reference. > I suspect that the mmu notifier's synchronize_srcu() is run in parallel > very close to when cleanup_srcu_struct() is called. This violates the > "prevent any further calls to synchronize_srcu" rule. > > So, while I suspect that adding a synchronize_srcu() is *part* of the > correct solution, I'm still not convinced that the > sgx_mmu_notifier_release() code is correct. What Sean suggested in private discussion, i.e. using kref_get() in the MMU notifier AFAIK should be enough to do the necessary serialization. /Jarkko