On Wed, Sep 09, 2020 at 04:38:58AM -0700, Paul E. McKenney wrote:
> On Tue, Sep 08, 2020 at 07:34:20PM -0700, Alexei Starovoitov wrote:
> > Hi Paul,
> >
> > Looks like sync rcu_tasks_trace got slower or we simply didn't notice
> > it earlier.
> >
> > In selftests/bpf try:
> > time ./test_progs -t trampoline_count
> > #101 trampoline_count:OK
> > Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
> >
> > real 1m17.082s
> > user 0m0.145s
> > sys 0m1.369s
> >
> > so it's really something going on with sync rcu_tasks_trace.
> > Could you please take a look?
>
> I am guessing that your .config has CONFIG_TASKS_TRACE_RCU_READ_MB=n.
> If I am wrong, please try CONFIG_TASKS_TRACE_RCU_READ_MB=y.

I've added
CONFIG_RCU_EXPERT=y
CONFIG_TASKS_TRACE_RCU_READ_MB=y

and it helped:

time ./test_progs -t trampoline_count
#101 trampoline_count:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

real 0m8.924s
user 0m0.138s
sys 0m1.408s

But this is still bad. It's 4 times slower vs rcu_tasks and isn't really
usable for bpf, since it adds memory barriers exactly where we need them
removed.

In the default configuration rcu_tasks_trace is 40! times slower than
rcu_tasks. This huge difference in sync times concerns me a lot.
If bpf has to use memory barriers in rcu_read_lock_trace and still be
4 times slower than rcu_tasks in the best case then there is not much
point in rcu_tasks_trace. Converting everything to srcu would be better,
but I really hope you can find a solution to this tasks_trace issue.

> Otherwise (or alternatively), could you please try booting with
> rcupdate.rcu_task_ipi_delay=50? The default value is 500, or half a
> second on a HZ=1000 system, which on a busy system could easily result
> in the grace-period delays that you are seeing. The value of this
> kernel boot parameter does interact with the tasklist-scan backoffs,
> so its effect will not likely be linear.

The tests were run on a freshly booted VM with 4 cpus. The VM is idle.
The host is idle too.
Adding rcupdate.rcu_task_ipi_delay=50 boot param sort-of helped:

time ./test_progs -t trampoline_count
#101 trampoline_count:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

real 0m25.890s
user 0m0.124s
sys 0m1.507s

It is still awful.

From "perf report" there is little time spent in the kernel. The kernel
is waiting on something. I thought in theory rcu_tasks_trace should
have been faster on the update side vs rcu_tasks?
Could it be a bug somewhere, such as a missing wakeup? It doesn't feel
like it works as intended.
Whatever it is, please try to reproduce it to remove me as a middle man.
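
For anyone comparing the three runs above side by side, here is a small sketch that converts the reported "real" times to seconds and prints each one relative to the default configuration. The dictionary labels and the helper name real_seconds are made up for illustration; only the time stamps come from the runs quoted in this mail. No rcu_tasks baseline was posted, so none is included.

```python
import re


def real_seconds(stamp: str) -> float:
    """Convert a `time` stamp such as '1m17.082s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


# "real" times copied from the runs above. The keys are illustrative
# labels (an assumption of this sketch), not kernel identifiers.
runs = {
    "default": "1m17.082s",
    "READ_MB=y": "0m8.924s",
    "ipi_delay=50": "0m25.890s",
}

seconds = {label: real_seconds(stamp) for label, stamp in runs.items()}
baseline = seconds["default"]

for label, secs in seconds.items():
    # Ratio > 1 means the run finished faster than the default config.
    print(f"{label:>12}: {secs:7.3f}s  {baseline / secs:5.2f}x vs default")
```

The ratios only describe these three single measurements on one idle 4-cpu VM, so they should be read as rough indicators rather than benchmarks.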