Hi--

On 8/7/20 10:07 AM, Joel Fernandes (Google) wrote:
> RCU's hotplug design will help understand the requirements an RCU
> implementation needs to fullfill, such as dead-lock avoidance.
>
> The rcu_barrier() section of the "Hotplug CPU" section already talks
> about deadlocks, however the description of what else can deadlock other
> than rcu_barrier is rather incomplete.
>
> This commit therefore continues the section by describing how RCU's
> design handles CPU hotplug in a deadlock-free way.
>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> ---
>  .../RCU/Design/Requirements/Requirements.rst | 22 +++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
> index 16c64a2eff93..0a4148b9f743 100644
> --- a/Documentation/RCU/Design/Requirements/Requirements.rst
> +++ b/Documentation/RCU/Design/Requirements/Requirements.rst
> @@ -1940,6 +1940,28 @@ deadlock. Furthermore, ``rcu_barrier()`` blocks CPU-hotplug operations
>  during its execution, which results in another type of deadlock when
>  invoked from a CPU-hotplug notifier.
>
> +Also, RCU's implementation avoids serious deadlocks which could occur due to
> +interaction between hotplug, timers and grace period processing. It does so by
> +maintaining its own bookkeeping of every CPU's hotplug state, independent of
> +the various CPU masks and by reporting quiescent states at explicit points. It
> +may come across as a surprise, but the force quiescent state loop (FQS) does
> +not report quiescent states for offline CPUs and is not required to.
> +
> +For an offline CPU, the quiescent state will be reported in either of:
> +1. During CPU offlining, using RCU's hotplug notifier (``rcu_report_dead()``).

note, uses (), which is good: ()

> +2. During grace period initialization (``rcu_gp_init``) if it detected a race

add for consistency & readability: rcu_gp_init()

> +   with CPU offlining, or a race with a task unblocking on a node which
> +   previously had all of its CPUs offlined.
> +
> +The CPU onlining path (``rcu_cpu_starting``) does not need to a report

ditto: rcu_cpu_starting()

> +quiescent state for an offline CPU in fact it would trigger a warning if a

Missing something; maybe like so:
  for an offline CPU; in fact

> +quiescent state was not already reported for that CPU.
> +
> +During the checking/modification of RCU's hotplug bookkeeping, the
> +corresponding CPU's leaf node lock is held. This avoids race conditions between
> +RCU's hotplug notifier hooks, grace period initialization code and the FQS loop
> +which can concurrently refer to or modify the bookkeeping.
> +
>  Scheduler and RCU
>  ~~~~~~~~~~~~~~~~~
>
>

cheers.
--
~Randy