On Mon, Oct 15, 2018 at 09:05:22AM +0300, Nikolay Borisov wrote: > On 15.10.2018 05:47, Joel Fernandes wrote: > > On Sun, Oct 14, 2018 at 07:33:28PM -0700, Paul E. McKenney wrote: > >> On Sun, Oct 14, 2018 at 07:13:49PM -0700, Joel Fernandes wrote: > >>> On Sun, Oct 14, 2018 at 07:08:27PM -0700, Joel Fernandes wrote: > >>>> On Sun, Oct 14, 2018 at 04:17:31PM -0700, Paul E. McKenney wrote: > >>>>> On Sun, Oct 14, 2018 at 02:29:55PM -0700, Joel Fernandes (Google) wrote: > >>>>>> The Requirements.html document says "Disabling Preemption Does Not Block > >>>>>> Grace Periods". However this is no longer true with the RCU > >>>>>> consolidation. Lets remove the obsolete (non-)requirement entirely. > >>>>>> > >>>>>> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> > >>>>> > >>>>> Good catch, queued, thank you! > >>>> > >>>> Thanks! By the way after I sent the patch, I also tried Oleg's experiment to > >>>> confirm that this is indeed obsolete. :) > >>>> > >>>> One thing interesting came up when I tried synchronize_rcu_expedited() > >>>> instead of synchronize_rcu() in Oleg's experiment, I still saw a multiple > >>>> millisecond delay between when the rcu read section completely and the > >>>> synchronize_rcu_expedited returns: > >>>> > >>>> For example, with synchronize_rcu_expedited, the 'SPIN done' and the 'SYNC > >>>> done' are about 3 millisecond apart: > >>>> [ 77.599142] SPIN start > >>>> [ 77.601595] SYNC start > >>>> [ 82.604950] SPIN done! > >>>> [ 82.607836] SYNC done! > >>>> I saw anywhere from 2-6 milliseconds. > >>>> > >>>> The reason I bring this up is according to Requirements.html: In some cases, > >>>> the multi-millisecond synchronize_rcu() latencies are unacceptable. In these > >>>> cases, synchronize_rcu_expedited() may be used instead,.. so either I messed > >>>> something up in the experiment, or I need to update this part of the document ;-) > >> > >> In normal testing, 2-6 milliseconds is indeed excessive. Could you please > >> point me at Oleg's experiment? Also, what CONFIG_PREEMPT setting were > >> you using? (My guess is CONFIG_PREEMPT=y.) > > > > The CONFIG_PREEMPT config I am using is CONFIG_PREEMPT=y. > > > >>> So I realized I'm running in Qemu so it could also be a scheduling delay of > >>> the vcpu thread. So apologies about the noise if the experiment works fine > >>> for you. > >> > >> I used rcuperf, which might not be doing the same thing as Oleg's > >> experiment. > > > > The experiment is mentioned at: > > https://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg912055.html > > > > If you apply the below diff, it applies cleanly on rcu/dev. And then run: > > taskset 2 perl -e 'syscall 157, 666, 5000' & > > taskset 1 perl -e 'syscall 157, 777' > > > > diff --git a/kernel/sys.c b/kernel/sys.c > > index cf5c67533ff1..b654b7566ca3 100644 > > --- a/kernel/sys.c > > +++ b/kernel/sys.c > > @@ -2261,6 +2261,9 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which, > > return -EINVAL; > > } > > > > +#include <linux/delay.h> > > + > > + > > SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, > > unsigned long, arg4, unsigned long, arg5) > > { > > @@ -2274,6 +2277,19 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3, > > > > error = 0; > > switch (option) { > > + case 666: > > + preempt_disable(); > > + pr_crit("SPIN start\n"); > > + while (arg2--) > > + mdelay(1); OK, this is the problem. When you spin in the kernel for several milliseconds with preemption disabled, the consolidated grace period is -required- to wait for this preemption-disabled reader to complete, whether expedited or not. So, expected behavior. ;-) In any case, please don't spin for milliseconds with preemption disabled. The real-time guys are unlikely to be happy with you if you do this! > > + pr_crit("SPIN done!\n"); > > + preempt_enable(); > > + break; > > + case 777: > > + pr_crit("SYNC start\n"); > > + synchronize_rcu(); > > + pr_crit("SYNC done!\n"); > > But you are using the console printing infrastructure which is rather > heavyweight. Try replacing pr_* calls with trace_printk so that you > write to the lock-free ring buffer, this will reduce the noise from the > heavy console printing infrastructure. And this might be a problem as well. Thanx, Paul > > + break; > > case PR_SET_PDEATHSIG: > > if (!valid_signal(arg2)) { > > error = -EINVAL; > > >