Re: [PATCH] rcu: Delay the RCU-selftests during boot.

On 2022-03-02 20:36:50 [-0800], Paul E. McKenney wrote:
> > To simply move the test from rcu_init_tasks_generic() to after
> > do_pre_smp_initcalls(). If we can't move rcu_init_tasks_generic() after
> > do_pre_smp_initcalls() or at least the test part because we need working
> > synchronize_rcu() in early_initcall() then I need to move the RT
> > requirements before. Simple ;)
> 
> As long as RT confines itself to configurations that do not need a
> working synchronize_rcu() in the intervening code, yes, simple.  ;-)

;)

> > The requirements are:
> > 
> > --- a/init/main.c
> > +++ b/init/main.c
> > @@ -1598,6 +1601,9 @@ static noinline void __init kernel_init_freeable(void)
> >  
> >         init_mm_internals();
> >  
> > +       spawn_ksoftirqd();
> > +       irq_work_init_threads();
> > +
> >         rcu_init_tasks_generic();
> >         do_pre_smp_initcalls();
> >         lockup_detector_init();
> > 
> > spawn_ksoftirqd() is what I mentioned. What I just figured out is
> > irq_work_init_threads() due to
> > 	call_rcu_tasks_iw_wakeup()
> > 
> > I can't move this to hard-IRQ context because it invokes wake_up() which
> > acquires sleeping locks. If you say that rtp->cbs_wq has only one waiter
> > and something like rcuwait_wait_event() / rcuwait_wake_up() would work
> > as well then call_rcu_tasks_iw_wakeup() could be lifted to hard-IRQ
> > context and we need to worry only about spawn_ksoftirqd() :)
> 
> OK, I was expecting that the swait_event_timeout_exclusive() call from
> synchronize_rcu_expedited_wait_once() would be the problem.  Are you
> saying that this swait_event_timeout_exclusive() works fine?  

swait_event_timeout_exclusive() uses schedule_timeout() which uses a
timer_list timer and this one requires ksoftirqd to work.

> Or are you
> instead saying that the call_rcu_tasks_iw_wakeup() issues cause trouble
> before that swait_event_timeout_exclusive() gets a chance to cause its
> own trouble?

Both are needed:
- the ksoftirqd thread, to handle timer_list timers.
- the irq_work thread, for irq_work that is not run in hard-IRQ context.

What I observe during boot:
| [    0.184838] cblist_init_generic: Setting adjustable number of callback queues.
| [    0.184853] cblist_init_generic: Setting shift to 2 and lim to 1.
| [    0.188116] irq_work_single() wake_up_klogd_work_func+0x0/0x70 26
| [    0.188861] cblist_init_generic: Setting shift to 2 and lim to 1.
| [    0.189671] Running RCU-tasks wait API self tests
| [    0.190254] irq_work_single() call_rcu_tasks_iw_wakeup+0x0/0x20 22
| [    0.292569] expire_timers() process_timeout+0x0/0x10
| [    0.292655] irq_work_single() call_rcu_tasks_iw_wakeup+0x0/0x20 22
| [    0.295082] irq_work_single() call_rcu_tasks_iw_wakeup+0x0/0x20 22
| [    0.296481] irq_work_single() call_rcu_tasks_iw_wakeup+0x0/0x20 22
| [    0.304685] rcu: Hierarchical SRCU implementation.
| [    0.311415] Callback from call_rcu_tasks_trace() invoked.
| [    0.344100] smp: Bringing up secondary CPUs ...

1x schedule_timeout() and 4x call_rcu_tasks_iw_wakeup(). Both are
crucial.
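
To make the dependency concrete, here is a small userspace model of the
ordering problem. It is not kernel code: every name in it (boot_model,
model_*) is made up for illustration. It assumes only what the trace above
shows, namely that the self test needs one timer expiry and four irq_work
wakeups, and that on RT each of them is handled by a thread that has to
exist first.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct boot_model {
	bool ksoftirqd_spawned;   /* handles timer_list timers on RT */
	bool irq_work_thread;     /* handles non-hard-IRQ irq_work on RT */
	int timers_pending;
	int irq_work_pending;
};

/* Queue what the trace shows: 1x schedule_timeout(), 4x iw_wakeup. */
static void model_queue_selftest(struct boot_model *m)
{
	m->timers_pending += 1;
	m->irq_work_pending += 4;
}

/* Process whatever the currently spawned threads are able to handle. */
static void model_run_threads(struct boot_model *m)
{
	if (m->ksoftirqd_spawned)
		m->timers_pending = 0;
	if (m->irq_work_thread)
		m->irq_work_pending = 0;
}

/* The self test can finish only once nothing is left pending. */
static bool model_selftest_done(const struct boot_model *m)
{
	return m->timers_pending == 0 && m->irq_work_pending == 0;
}
```

In this model, spawning only one of the two threads still leaves work
pending, so the self test cannot finish until both exist.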

> Either way, it sounds like that irq_work_queue(&rtpcp->rtp_irq_work) in
> call_rcu_tasks_generic() needs some adjustment to work in RT.  This should
> be doable.  Given this, and given that the corresponding diagnostic
> function rcu_tasks_verify_self_tests() is a late_initcall() function,
> you don't need to move the call to rcu_init_tasks_generic(), correct?

#1 ksoftirqd must be spawned first in order to get timer_list timers to
   work. I'm going to do that; this should not be a problem.

#2 call_rcu_tasks_iw_wakeup. Here we have the following options:
   - Don't use, delay, …

   - if you can guarantee that there is only _one_ waiter
     => Replace rcu_tasks::cbs_wq with rcuwait. Then let this irq_work
        run in hard-IRQ context.

   - if you can't guarantee that there is only _one_ waiter
     => spawn the irq-work thread early.

As for #2, I managed to trigger the wakeup via tracing (and stumbled
into a bug unrelated to this) and I see only one waiter. That doesn't
mean I did it right, or that there can't be a second waiter.
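
Why the "only one waiter" guarantee matters can be shown with a userspace
toy model of a single-slot wait primitive. This is a sketch, not the kernel's
rcuwait API: the model_* names are hypothetical, and it assumes a design
where the wait object has room for exactly one waiter, which is what the
condition above implies.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a single-slot waiter; not the kernel API. */
struct model_task {
	bool woken;
};

struct model_rcuwait {
	struct model_task *task;   /* room for exactly one waiter */
};

/* Register as the waiter. A second caller silently replaces the first. */
static void model_prepare_to_wait(struct model_rcuwait *w, struct model_task *t)
{
	t->woken = false;
	w->task = t;
}

/* Wake the registered waiter, if any. No lock is taken on this path. */
static bool model_wake_up(struct model_rcuwait *w)
{
	struct model_task *t = w->task;

	if (!t)
		return false;
	t->woken = true;
	return true;
}
```

With one waiter the wakeup path is a pointer read and a wake, with no lock
to sleep on. If a second waiter registers, the model wakes only the last
one and the first is never woken, which is the failure the guarantee has
to rule out.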

> Back over to you so that I can learn what I am still missing.  ;-)

Hope that helps.

> 							Thanx, Paul

Sebastian



