Re: SRCU: kworker hung in synchronize_srcu

On Mon, Oct 2, 2023 at 4:10 AM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Mon, Oct 02, 2023 at 12:32:41AM +0200, Frederic Weisbecker wrote:
> > On Sun, Oct 01, 2023 at 07:57:14AM +0530, Neeraj upadhyay wrote:
> > >
> > > But "more" only checks for CBs in the DONE tail. The callbacks which have
> > > just been accelerated are not advanced to the DONE tail.
> > >
> > > That said, I am still trying to figure out which callbacks are actually
> > > pointed to by the NEXT tail. Given that __call_srcu() already does an
> > > advance and accelerate, all enqueued callbacks would be in either the
> > > WAIT tail (the callbacks for the current active GP) or the NEXT_READY
> > > tail (the callbacks for the next GP after the current one completes).
> > > So they should already have a GP number assigned, and
> > > srcu_invoke_callbacks() won't accelerate them. The only case I can
> > > think of is if the current GP completes after we sample
> > > rcu_seq_current(&ssp->srcu_gp_seq) for rcu_segcblist_advance() (so the
> > > WAIT tail CBs are not moved to the DONE tail) and a new GP is started
> > > before we take the snapshot ('s') of the next GP for
> > > rcu_segcblist_accelerate(). Then the GP number 's' is greater than the
> > > GP number of NEXT_READY_TAIL, and the new callbacks will be put in the
> > > NEXT tail. Not sure if my understanding is correct here. Thoughts?
> > >
> > > __call_srcu()
> > >         rcu_segcblist_advance(&sdp->srcu_cblist,
> > >                               rcu_seq_current(&ssp->srcu_gp_seq));
> > >         s = rcu_seq_snap(&ssp->srcu_gp_seq);
> > >         (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> >
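
To make the window described above concrete, here is a minimal user-space
sketch (not kernel code). It is an assumption-laden model: struct seg_model
and every model_* name are hypothetical, segments are represented as
callback counts rather than tail pointers, and the two helpers only
paraphrase my reading of rcu_seq_snap() and rcu_segcblist_accelerate(). The
sequence values (1, 5, 4, 8) are illustrative. The starting state assumes
the preceding advance moved nothing because it sampled the old GP.

```c
#include <assert.h>
#include <stdbool.h>
#include <limits.h>

/* Two low-order "state" bits in a grace-period sequence number. */
#define SEQ_CTR_SHIFT  2
#define SEQ_STATE_MASK ((1UL << SEQ_CTR_SHIFT) - 1)

/* Wrap-safe "a < b", in the style of the kernel's ULONG_CMP_LT(). */
#define SEQ_LT(a, b) (ULONG_MAX / 2 < (a) - (b))

enum { DONE_TAIL, WAIT_TAIL, NEXT_READY_TAIL, NEXT_TAIL, NSEGS };

/* Hypothetical model: per-segment callback counts, not tail pointers. */
struct seg_model {
	int len[NSEGS];
	unsigned long gp_seq[NSEGS]; /* used for WAIT and NEXT_READY only */
};

/* Sequence value by which a full grace period will have elapsed. */
static unsigned long model_seq_snap(unsigned long gp_seq)
{
	return (gp_seq + 2 * SEQ_STATE_MASK + 1) & ~SEQ_STATE_MASK;
}

/* True if every segment after seg is empty. */
static bool model_rest_empty(const struct seg_model *m, int seg)
{
	for (int i = seg + 1; i < NSEGS; i++)
		if (m->len[i] > 0)
			return false;
	return true;
}

/* Try to assign seq to the callbacks that have no GP number yet. */
static bool model_accelerate(struct seg_model *m, unsigned long seq)
{
	int i, j;

	if (model_rest_empty(m, DONE_TAIL))
		return false;
	/* Find the newest non-empty segment waiting for an older GP. */
	for (i = NEXT_READY_TAIL; i > DONE_TAIL; i--)
		if (m->len[i] > 0 && SEQ_LT(m->gp_seq[i], seq))
			break;
	/* No room left: that segment is NEXT_READY, so NEXT stays as-is. */
	if (model_rest_empty(m, i) || ++i >= NEXT_TAIL)
		return false;
	/* Merge all later segments into segment i and tag them with seq. */
	for (j = i + 1; j <= NEXT_TAIL; j++) {
		m->len[i] += m->len[j];
		m->len[j] = 0;
	}
	for (; i < NEXT_TAIL; i++)
		m->gp_seq[i] = seq;
	return true;
}

/* Callbacks left in NEXT after accelerating with snap(gp_seq_at_snap). */
static int model_next_len_after_accel(unsigned long gp_seq_at_snap)
{
	struct seg_model m = {
		.len    = { 0, 1, 1, 1 }, /* one CB each in WAIT, NEXT_READY, NEXT */
		.gp_seq = { 0, 4, 8, 0 }, /* WAIT waits for 4, NEXT_READY for 8 */
	};

	(void)model_accelerate(&m, model_seq_snap(gp_seq_at_snap));
	return m.len[NEXT_TAIL];
}

/* Usage: compare the ordinary case with the racy one. */
static void model_self_check(void)
{
	/* Snapshot taken while the GP at seq 1 is still running. */
	assert(model_next_len_after_accel(1) == 0);
	/* That GP ended and the next one started (seq 5) before the snapshot. */
	assert(model_next_len_after_accel(5) == 1);
}
```

In this model the second case leaves the new callback in NEXT without a GP
number, which matches the scenario quoted above. Whether this is the only
way to reach that state in the real code is not something the sketch shows.
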
> > Good point! This looks plausible.
> >
> > Does the (buggy) acceleration in srcu_invoke_callbacks() exist solely
> > to handle that case? Because then the acceleration could be moved before
> > the advance in callback handling, something like:
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index 20d7a238d675..af9d8af1d321 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -1245,6 +1245,11 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> >       rcu_segcblist_advance(&sdp->srcu_cblist,
> >                             rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> >       s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
> > +     /*
> > +      * Acceleration might fail if the preceding call to
> > +      * rcu_segcblist_advance() also failed due to a prior incomplete grace
> > +      * period. This should be later fixed in srcu_invoke_callbacks().
> > +      */
> >       (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> >       if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> >               sdp->srcu_gp_seq_needed = s;
> > @@ -1692,6 +1697,13 @@ static void srcu_invoke_callbacks(struct work_struct *work)
> >       ssp = sdp->ssp;
> >       rcu_cblist_init(&ready_cbs);
> >       spin_lock_irq_rcu_node(sdp);
> > +     /*
> > +      * Acceleration might have failed in srcu_gp_start_if_needed() if
> > +      * the preceding call to rcu_segcblist_advance() also failed due to
> > +      * a prior incomplete grace period.
> > +      */
> > +     (void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
> > +                                    sdp->srcu_gp_seq_needed);
> >       rcu_segcblist_advance(&sdp->srcu_cblist,
> >                             rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> >       if (sdp->srcu_cblist_invoking ||
> > @@ -1720,8 +1732,6 @@ static void srcu_invoke_callbacks(struct work_struct *work)
> >        */
> >       spin_lock_irq_rcu_node(sdp);
> >       rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
> > -     (void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
> > -                                    rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
> >       sdp->srcu_cblist_invoking = false;
> >       more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
> >       spin_unlock_irq_rcu_node(sdp);
>
> And if this works, can we then remove the self-requeue in srcu_invoke_callbacks()?
> If queued several times before it actually fires, it will catch the latest
> grace period's end. And if queued while the callback runs, it will re-run.
>

This makes sense, but I am not sure whether the non-wq context that
link [1] mentions still needs it.

> Also, why do we have sdp->srcu_invoke_callbacks? Is that workqueue re-entrant?
>
I think you mean sdp->srcu_cblist_invoking?

There was a prior discussion on this [1], where Paul mentions the
non-wq context.



Thanks
Neeraj

[1] https://lkml.org/lkml/2020/11/19/1065

> Thanks.



