Re: SRCU: kworker hung in synchronize_srcu

Neeraj Upadhyay <neeraj.iitr10@xxxxxxxxx> · Wed, 4 Oct 2023 00:16:13 +0530

On Tue, Oct 3, 2023 at 5:52 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Mon, Oct 02, 2023 at 11:09:39PM +0200, Frederic Weisbecker wrote:
> > > >         spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
> > > >         WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
> > > >         WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
> > > > @@ -1245,7 +1243,18 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> > > >         rcu_segcblist_advance(&sdp->srcu_cblist,
> > > >                               rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > > >         s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
> > > > -       (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> > > > +       /*
> > > > +        * Acceleration might fail if the preceding call to
> > > > +        * rcu_segcblist_advance() also failed due to a prior grace
> > > > +        * period seen incomplete before rcu_seq_snap(). If so then a new
> > > > +        * call to advance will see the completed grace period and fix
> > > > +        * the situation.
> > > > +        */
> > > > +       if (!rcu_segcblist_accelerate(&sdp->srcu_cblist, s)) {
> > >
> > > Can we also add the check below? Here, old and new are the rcu_seq_current()
> > > values used in the two calls to rcu_segcblist_advance().
> > >
> > > WARN_ON_ONCE(!(rcu_seq_completed_gp(old, new) && rcu_seq_new_gp(old, new)));
> >
> > Very good point! "new" should be exactly one and a half grace periods away from
> > "old". I will add that.
> >
> > Cooking proper patches now.
>
> Actually, the simpler fix below should do. rcu_seq_snap() can be called before
> rcu_segcblist_advance() after all. The only side effect is that callback
> advancing then happens _after_ the full barrier in rcu_seq_snap(). I don't see
> an obvious problem with that, as that barrier only cares about:
>
> 1) Ordering accesses of the update side before call_srcu() so they don't bleed
> 2) See all the accesses prior to the grace period of the current gp_num
>
> The only things callback advancing needs to be ordered against are covered by
> snp locking.
>

Nice! Your analysis looks good to me!
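
To make the ordering argument concrete, here is a rough userspace model of the
sequence arithmetic. It is only a sketch: the model_* names are hypothetical,
the memory barrier and wraparound handling are left out, and "two stale
segments means acceleration fails" is a simplified reading of
rcu_segcblist_accelerate(), not the real list code.

```c
#include <stdbool.h>

/*
 * Same encoding as the kernel's rcu_seq helpers: the low two bits hold the
 * grace-period state, the upper bits count grace periods.
 */
#define SEQ_CTR_SHIFT   2
#define SEQ_STATE_MASK  ((1UL << SEQ_CTR_SHIFT) - 1)

/*
 * Model of rcu_seq_snap(): the counter value at which a full grace period
 * will have elapsed after this read (the smp_mb() is omitted here).
 */
static unsigned long model_seq_snap(unsigned long seq)
{
	return (seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* Model of rcu_seq_done(), ignoring counter wraparound. */
static bool model_seq_done(unsigned long cur, unsigned long s)
{
	return cur >= s;
}

/*
 * Count how many of the two pending segment tags are neither completed at
 * "cur" (so advancing leaves them in place) nor equal to the new snapshot
 * "s" (so the new callbacks cannot share their segment). Assumption: when
 * both are stale, there is no segment left to accelerate into.
 */
static int model_stale_segments(unsigned long wait_tag,
				unsigned long ready_tag,
				unsigned long cur, unsigned long s)
{
	int n = 0;

	if (!model_seq_done(cur, wait_tag) && wait_tag < s)
		n++;
	if (!model_seq_done(cur, ready_tag) && ready_tag < s)
		n++;
	return n;
}
```

Take pending tags 8 and 12 while the counter moves from 5 to 9 between the two
reads. With the old order (current read at 5, snapshot taken at 9) both tags
are stale. With the snapshot taken first, the later read of the current value
always sees the older tag as completed, so at most one stale tag remains.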

> I will still remove the accelerations elsewhere, and the advancing in
> srcu_gp_start(), in further patches. I'll also avoid advancing and
> acceleration in srcu_gp_start_if_needed() if there is no callback to queue.
>
> The point is also that this simple fix alone can be easily backported and
> the rest can come as cleanups.
>

Sounds good!

>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 5602042856b1..8b09fb37dbf3 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1244,10 +1244,10 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
>         spin_lock_irqsave_sdp_contention(sdp, &flags);
>         if (rhp)
>                 rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> +       s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);

We might want to add a comment here explaining why this specific ordering of
the two srcu_gp_seq reads is required.

Thanks
Neeraj

>         rcu_segcblist_advance(&sdp->srcu_cblist,
>                               rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> -       s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
> -       (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> +       WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s) && rhp);
>         if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
>                 sdp->srcu_gp_seq_needed = s;
>                 needgp = true;