Hi Paul, On Sat, Jan 25, 2020 at 10:42:56AM -0000, tip-bot2 for Paul E. McKenney wrote: > The following commit has been merged into the core/rcu branch of tip: > > Commit-ID: df1e849ae4559544ff00ff5052eefe2479750539 > Gitweb: https://git.kernel.org/tip/df1e849ae4559544ff00ff5052eefe2479750539 > Author: Paul E. McKenney <paulmck@xxxxxxxxxx> > AuthorDate: Wed, 27 Nov 2019 16:36:45 -08:00 > Committer: Paul E. McKenney <paulmck@xxxxxxxxxx> > CommitterDate: Mon, 09 Dec 2019 12:32:59 -08:00 > > rcu: Enable tick for nohz_full CPUs slow to provide expedited QS > > An expedited grace period can be stalled by a nohz_full CPU looping > in kernel context. This possibility is currently handled by some > carefully crafted checks in rcu_read_unlock_special() that enlist help > from ksoftirqd when permitted by the scheduler. However, it is exactly > these checks that require the scheduler avoid holding any of its rq or > pi locks across rcu_read_unlock() without also having held them across > the entire RCU read-side critical section. > > It would therefore be very nice if expedited grace periods could > handle nohz_full CPUs looping in kernel context without such checks. > This commit therefore adds code to the expedited grace period's wait > and cleanup code that forces the scheduler-clock interrupt on for CPUs > that fail to quickly supply a quiescent state. "Quickly" is currently > a hard-coded single-jiffy delay. > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx> > --- > include/linux/tick.h | 5 +++- > kernel/rcu/tree.h | 1 +- > kernel/rcu/tree_exp.h | 52 ++++++++++++++++++++++++++++++++++++------ > 3 files changed, 50 insertions(+), 8 deletions(-) ... > @@ -460,22 +489,31 @@ static void synchronize_rcu_expedited_wait(void) > unsigned long jiffies_start; > unsigned long mask; > int ndetected; > + struct rcu_data *rdp; > struct rcu_node *rnp; > struct rcu_node *rnp_root = rcu_get_root(); > - int ret; > > trace_rcu_exp_grace_period(rcu_state.name, rcu_exp_gp_seq_endval(), TPS("startwait")); > jiffies_stall = rcu_jiffies_till_stall_check(); > jiffies_start = jiffies; > + if (IS_ENABLED(CONFIG_NO_HZ_FULL)) { > + if (synchronize_rcu_expedited_wait_once(1)) > + return; > + rcu_for_each_leaf_node(rnp) { > + for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) { > + rdp = per_cpu_ptr(&rcu_data, cpu); > + if (rdp->rcu_forced_tick_exp) > + continue; > + rdp->rcu_forced_tick_exp = true; > + tick_dep_set_cpu(cpu, TICK_DEP_BIT_RCU_EXP); > + } > + } > + WARN_ON_ONCE(1); I'm hitting this on a big AMD box, CONFIG_NO_HZ_FULL=y. Kernel is Linus master + tip/master from today. ... [ 23.094781] BTRFS info (device sdb2): disk space caching is enabled [ 23.094784] BTRFS info (device sdb2): has skinny extents [ 23.139134] BTRFS info (device sdb2): enabling ssd optimizations [ 23.395434] ------------[ cut here ]------------ [ 23.402616] WARNING: CPU: 6 PID: 1921 at kernel/rcu/tree_exp.h:511 rcu_exp_wait_wake+0x11f/0x740 [ 23.402616] Modules linked in: btrfs(E) libcrc32c(E) xor(E) hid_generic(E) usbhid(E) raid6_pq(E) sd_mod(E) crc32c_intel(E) ast(E) i2c_algo_bit(E) drm_vram_helper(E) drm_ttm_helper(E) ttm(E) ahci(E) drm_kms_helper(E) syscopyarea(E) libahci(E) sysfillrect(E) xhci_pci(E) sysimgblt(E) fb_sys_fops(E) xhci_hcd(E) libata(E) drm(E) usbcore(E) wmi(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) [ 23.402630] CPU: 6 PID: 1921 Comm: kworker/6:2 Tainted: G E 5.5.0-rc7+ #1 [ 23.402632] Workqueue: rcu_gp wait_rcu_exp_gp [ 23.402634] RIP: 0010:rcu_exp_wait_wake+0x11f/0x740 [ 23.402636] Code: 3b 45 6c 7e be 48 63 05 8b 30 50 01 49 81 c5 40 02 00 00 48 8d 04 c0 48 c1 e0 06 48 05 80 b2 67 b5 49 39 c5 0f 82 72 ff ff ff <0f> 0b 49 c7 c7 00 fd 02 00 bb 01 00 00 00 4c 89 e7 e8 ab e7 ff ff [ 23.402637] RSP: 0018:ffffbde38a8f3e50 EFLAGS: 00010246 [ 23.402637] RAX: ffffffffb567d8c0 RBX: ffffffffb567d6c0 RCX: 0000000000000040 [ 23.402638] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb567d6c0 [ 23.402638] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000001 [ 23.402639] R10: ffffbde3802cfdc0 R11: 0000000000000271 R12: 0000000000003a98 [ 23.402640] R13: ffffffffb567d8c0 R14: 00000000000000f0 R15: 000000000002fd00 [ 23.402640] FS: 0000000000000000(0000) GS:ffffa016cd380000(0000) knlGS:0000000000000000 [ 23.402641] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.402641] CR2: 00007fc9957c5180 CR3: 0000800bb3e0a000 CR4: 0000000000340ee0 [ 23.402642] Call Trace: [ 23.402647] ? sync_rcu_exp_select_cpus+0x219/0x3e0 [ 23.402650] process_one_work+0x20b/0x400 [ 23.402652] worker_thread+0x2d/0x3f0 [ 23.402653] ? process_one_work+0x400/0x400 [ 23.402655] kthread+0x10d/0x130 [ 23.402656] ? kthread_park+0x90/0x90 [ 23.402660] ret_from_fork+0x27/0x50 [ 23.402663] ---[ end trace 85fb288edc35e984 ]--- [ 24.322452] BTRFS info (device sdb2): disk space caching is enabled Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette