Re: slow sync rcu_tasks_trace

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 09, 2020 at 05:10:53PM +0200, Jiri Olsa wrote:
> On Wed, Sep 09, 2020 at 04:38:58AM -0700, Paul E. McKenney wrote:
> > On Tue, Sep 08, 2020 at 07:34:20PM -0700, Alexei Starovoitov wrote:
> > > Hi Paul,
> > > 
> > > Looks like sync rcu_tasks_trace got slower or we simply didn't notice
> > > it earlier.
> > > 
> > > In selftests/bpf try:
> > > time ./test_progs -t trampoline_count
> > > #101 trampoline_count:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > 
> > > real    1m17.082s
> > > user    0m0.145s
> > > sys    0m1.369s
> > > 
> > > But with the following hack:
> > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > > index 7dd523a7e32d..c417b817ec5d 100644
> > > --- a/kernel/bpf/trampoline.c
> > > +++ b/kernel/bpf/trampoline.c
> > > @@ -217,7 +217,7 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
> > >          * programs finish executing.
> > >          * Wait for these two grace periods together.
> > >          */
> > > -       synchronize_rcu_mult(call_rcu_tasks, call_rcu_tasks_trace);
> > > +//     synchronize_rcu_mult(call_rcu_tasks, call_rcu_tasks_trace);
> > > 
> > > I see:
> > > time ./test_progs -t trampoline_count
> > > #101 trampoline_count:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > 
> > > real    0m1.588s
> > > user    0m0.131s
> > > sys    0m1.342s
> > > 
> > > It takes an extra minute to do 40 sync rcu_tasks_trace calls.
> > > It means that every sync takes more than a second.
> > > That feels excessive.
> > > 
> > > Doing:
> > > -       synchronize_rcu_mult(call_rcu_tasks, call_rcu_tasks_trace);
> > > +       synchronize_rcu();
> > > is also fast:
> > > time ./test_progs -t trampoline_count
> > > #101 trampoline_count:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > 
> > > real    0m2.089s
> > > user    0m0.139s
> > > sys    0m1.282s
> > > 
> > > sync rcu_tasks() is fast too:
> > > -       synchronize_rcu_mult(call_rcu_tasks, call_rcu_tasks_trace);
> > > +       synchronize_rcu_tasks();
> > > time ./test_progs -t trampoline_count
> > > #101 trampoline_count:OK
> > > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> > > 
> > > real    0m2.209s
> > > user    0m0.117s
> > > sys    0m1.344s
> > > 
> > > so it's really something going on with sync rcu_tasks_trace.
> > > Could you please take a look?
> > 
> > I am guessing that your .config has CONFIG_TASKS_TRACE_RCU_READ_MB=n.
> > If I am wrong, please try CONFIG_TASKS_TRACE_RCU_READ_MB=y.
> 
> hi,
> I noticed the slowdown as well, and adding CONFIG_TASKS_TRACE_RCU_READ_MB=y
> speeds it up for me

Thank you for testing this!  This will most likley also degrade read-side
performance beyond what is reasonable.  So could you please also try
the kernel boot parameter called out below?

Nevertheless, the fact that this fixes things does mean that a solution
exists.  Now to close in on it.  ;-)

(For example, it might be necessary to provide per-flavor tasklist
scan backoffs and/or it might be necessary to adjust the default for
rcupdate.rcu_task_ipi_delay=50.)

							Thanx, Paul

> thanks,
> jirka
> 
> > 
> > Otherwise (or alternatively), could you please try booting with
> > rcupdate.rcu_task_ipi_delay=50?  The default value is 500, or half a
> > second on a HZ=1000 system, which on a busy system could easily result
> > in the grace-period delays that you are seeing.  The value of this
> > kernel boot parameter does interact with the tasklist-scan backoffs,
> > so its effect will not likely be linear.
> > 
> > Do either of those approaches help?
> > 
> > 							Thanx, Paul
> > 
> 



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux