> On Jan 22, 2020, at 12:07 AM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: > > On Tue, Jan 21, 2020 at 11:16:06PM -0500, Qian Cai wrote: >> >> >>> On Jan 21, 2020, at 3:46 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: >>> >>> On Tue, Jan 21, 2020 at 02:09:05PM -0500, Qian Cai wrote: >>>>> On Jan 21, 2020, at 11:15 AM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: >>>>> On Tue, Jan 21, 2020 at 09:37:13AM -0500, Qian Cai wrote: >>>>>>> On Jan 21, 2020, at 9:19 AM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote: >>>>>>> >>>>>>> One approach would be to boot with rcupdate.rcu_cpu_stall_timeout=300, >>>>>>> which would allow more time. >>>>>> >>>>>> It works for me if once that warning triggered, give a bit information about adjusting the parameter when debugging options are on to suppress the warning due to expected long boot. >>>>> >>>>> Indeed. 300 seconds as shown above is currently the maximum, but >>>>> please let me know if it needs to be increased. This module parameter >>>>> is writable after boot via sysfs, so maybe that could be part of the >>>>> workaround. >>>>> >>>>>>> Longer term, I could suppress this warning during boot when >>>>>>> CONFIG_EFI_PGT_DUMP=y, but that sounds quite specific. Alternatively, >>>>>>> I could provide a Kconfig option that suppressed this during boot >>>>>>> that was selected by whatever long-running boot-time Kconfig option >>>>>>> needed it. Yet another approach would be for long-running operations >>>>>>> like efi_dump_pagetable() to suppress stalls on entry and re-enable them >>>>>>> upon exit. >>>>>>> >>>>>>> Thoughts? >>>>>> >>>>>> None of the options sounds particularly better for me because there could come up with other options may trigger this, memtest comes in mind, for example. Then, it is a bit of pain to maintain of unknown. >>>>> >>>>> I was afraid of that. ;-) >>>>> >>>>> Could you please send me the full dmesg up to that point? No promises, >>>>> but it might well be that I can make some broad-spectrum adjustment >>>>> within RCU. Only one way to find out… >>>> >>>> https://cailca.github.io/files/dmesg.txt >>> >>> Interesting. >>> >>> Does the following (very lightly tested) patch help? >> >> Yes, it works fine. > > Very good, thank you! May I apply your Tested-by? Just one thing minor, Applying: Boot warning at rcu_check_gp_start_stall() .git/rebase-apply/patch:66: space before tab in indent. READ_ONCE(rnp_root->gp_seq_needed)) || warning: 1 line adds whitespace errors. Otherwise, Tested-by: Qian Cai <cai@xxxxxx> > > Thanx, Paul > >>> ------------------------------------------------------------------------ >>> >>> commit fb21277f8f1c5cc40a8d41da2db4b0c499459821 >>> Author: Paul E. McKenney <paulmck@xxxxxxxxxx> >>> Date: Tue Jan 21 12:30:22 2020 -0800 >>> >>> rcu: Don't flag non-starting GPs before GP kthread is running >>> >>> Currently rcu_check_gp_start_stall() complains if a grace period takes >>> too long to start, where "too long" is roughly one RCU CPU stall-warning >>> interval. This has worked well, but there are some debugging Kconfig >>> options (such as CONFIG_EFI_PGT_DUMP=y) that can make booting take a >>> very long time, so much so that the stall-warning interval has expired >>> before RCU's grace-period kthread has even been spawned. >>> >>> This commit therefore resets the rcu_state.gp_req_activity and >>> rcu_state.gp_activity timestamps just before the grace-period kthread >>> is spawned, and modifies the checks and adds ordering to ensure that >>> if rcu_check_gp_start_stall() sees that the grace-period kthread >>> has been spawned, that it will also see the resets applied to the >>> rcu_state.gp_req_activity and rcu_state.gp_activity timestamps. >>> >>> Reported-by: Qian Cai <cai@xxxxxx> >>> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> >>> >>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c >>> index 04718bc..d9d619d 100644 >>> --- a/kernel/rcu/tree.c >>> +++ b/kernel/rcu/tree.c >>> @@ -1209,7 +1209,7 @@ static bool rcu_start_this_gp(struct rcu_node *rnp_start, struct rcu_data *rdp, >>> trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("Startedroot")); >>> WRITE_ONCE(rcu_state.gp_flags, rcu_state.gp_flags | RCU_GP_FLAG_INIT); >>> WRITE_ONCE(rcu_state.gp_req_activity, jiffies); >>> - if (!rcu_state.gp_kthread) { >>> + if (!READ_ONCE(rcu_state.gp_kthread)) { >>> trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread")); >>> goto unlock_out; >>> } >>> @@ -1259,10 +1259,10 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp) >>> */ >>> static void rcu_gp_kthread_wake(void) >>> { >>> - if ((current == rcu_state.gp_kthread && >>> + if ((current == READ_ONCE(rcu_state.gp_kthread) && >>> !in_irq() && !in_serving_softirq()) || >>> !READ_ONCE(rcu_state.gp_flags) || >>> - !rcu_state.gp_kthread) >>> + !READ_ONCE(rcu_state.gp_kthread)) >>> return; >>> WRITE_ONCE(rcu_state.gp_wake_time, jiffies); >>> WRITE_ONCE(rcu_state.gp_wake_seq, READ_ONCE(rcu_state.gp_seq)); >>> @@ -3619,7 +3619,10 @@ static int __init rcu_spawn_gp_kthread(void) >>> } >>> rnp = rcu_get_root(); >>> raw_spin_lock_irqsave_rcu_node(rnp, flags); >>> - rcu_state.gp_kthread = t; >>> + WRITE_ONCE(rcu_state.gp_activity, jiffies); >>> + WRITE_ONCE(rcu_state.gp_req_activity, jiffies); >>> + // Reset .gp_activity and .gp_req_activity before setting .gp_kthread. >>> + smp_store_release(&rcu_state.gp_kthread, t); /* ^^^ */ >>> raw_spin_unlock_irqrestore_rcu_node(rnp, flags); >>> wake_up_process(t); >>> rcu_spawn_nocb_kthreads(); >>> diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h >>> index 476458c..75f6e9f 100644 >>> --- a/kernel/rcu/tree_stall.h >>> +++ b/kernel/rcu/tree_stall.h >>> @@ -578,6 +578,7 @@ void show_rcu_gp_kthreads(void) >>> unsigned long jw; >>> struct rcu_data *rdp; >>> struct rcu_node *rnp; >>> + struct task_struct *t = READ_ONCE(rcu_state.gp_kthread); >>> >>> j = jiffies; >>> ja = j - READ_ONCE(rcu_state.gp_activity); >>> @@ -585,8 +586,7 @@ void show_rcu_gp_kthreads(void) >>> jw = j - READ_ONCE(rcu_state.gp_wake_time); >>> pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n", >>> rcu_state.name, gp_state_getname(rcu_state.gp_state), >>> - rcu_state.gp_state, >>> - rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1ffffL, >>> + rcu_state.gp_state, t ? t->state : 0x1ffffL, >>> ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq), >>> (long)READ_ONCE(rcu_state.gp_seq), >>> (long)READ_ONCE(rcu_get_root()->gp_seq_needed), >>> @@ -633,7 +633,8 @@ static void rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp, >>> >>> if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() || >>> ULONG_CMP_GE(READ_ONCE(rnp_root->gp_seq), >>> - READ_ONCE(rnp_root->gp_seq_needed))) >>> + READ_ONCE(rnp_root->gp_seq_needed)) || >>> + !smp_load_acquire(&rcu_state.gp_kthread)) >>> return; >>> j = jiffies; /* Expensive access, and in common case don't get here. */ >>> if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||