Re: Request for backport fd6bc19d7676 to 4.14 and 4.19 branch

Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> · Thu, 23 Sep 2021 09:52:47 +0200

On Thu, Aug 19, 2021 at 12:28:40AM +0000, David Chen wrote:
> 
> 
> > -----Original Message-----
> > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > Sent: Tuesday, August 17, 2021 11:55 PM
> > To: David Chen <david.chen@xxxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx; Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>; neeraju@xxxxxxxxxxxxxx
> > Subject: Re: Request for backport fd6bc19d7676 to 4.14 and 4.19 branch
> > 
> > On Tue, Aug 17, 2021 at 06:47:45PM +0000, David Chen wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > Sent: Monday, August 16, 2021 11:16 PM
> > > > To: David Chen <david.chen@xxxxxxxxxxx>
> > > > Cc: stable@xxxxxxxxxxxxxxx; Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>; neeraju@xxxxxxxxxxxxxx
> > > > Subject: Re: Request for backport fd6bc19d7676 to 4.14 and 4.19 branch
> > > >
> > > > On Mon, Aug 16, 2021 at 10:02:28PM +0000, David Chen wrote:
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > > Sent: Monday, August 16, 2021 12:31 PM
> > > > > > To: David Chen <david.chen@xxxxxxxxxxx>
> > > > > > Cc: stable@xxxxxxxxxxxxxxx; Paul E. McKenney
> > > > > > <paulmck@xxxxxxxxxxxxxxxxxx>; neeraju@xxxxxxxxxxxxxx
> > > > > > Subject: Re: Request for backport fd6bc19d7676 to 4.14 and 4.19 branch
> > > > > >
> > > > > > On Mon, Aug 16, 2021 at 07:19:34PM +0000, David Chen wrote:
> > > > > > > Hi Greg,
> > > > > > >
> > > > > > > We recently hit a hung task timeout issue in synchronize_rcu_expedited
> > > > > > > on the 4.14 branch.
> > > > > > > The issue seems to be identical to the one described in `fd6bc19d7676
> > > > > > > rcu: Fix missed wakeup of exp_wq waiters`. Can we backport it to the
> > > > > > > 4.14 and 4.19 branches?
> > > > > > > The patch doesn't apply cleanly, but it should be trivial to resolve,
> > > > > > > just do this
> > > > > > >
> > > > > > > -		wake_up_all(&rnp->exp_wq[rcu_seq_ctr(rsp->expedited_sequence) & 0x3]);
> > > > > > > +		wake_up_all(&rnp->exp_wq[rcu_seq_ctr(s) & 0x3]);
> > > > > > >
> > > > > > > I don't know if we should do it for 4.9, because the handling of the
> > > > > > > sequence number is a bit different.
> > > > > >
> > > > > > Please provide a working backport, me hand-editing patches does not scale,
> > > > > > and this way you get the proper credit for backporting it (after testing it).
> > > > >
> > > > > Sure, appended at the end.
> > > > >
> > > > > >
> > > > > > You have tested this, right?
> > > > >
> > > > > I don't have a good repro for the original issue, so I only ran rcutorture and
> > > > > some basic workload tests to see if anything obvious went wrong.
> > > >
> > > > Ideally you would be able to also hit this without the patch on the
> > > > older kernels, is this the case?
> > > >
> > > So far we've only seen this once. I was able to figure out the issue from the vmcore,
> > > but I haven't been able to reproduce this. I think the nature of the bug makes it
> > > very difficult to hit. It requires a race with synchronize_rcu_expedited but once
> > > the thread hangs, you can't call it again, because it might rescue the hung thread.
> > 
> > I would like a bit more verification that this is really needed, and
> > some acks from the developers/maintainers involved, before accepting
> > this change.
> > 
> https://lkml.org/lkml/2019/11/18/184
> From the original discussion, Neeraj said they hit the issue on 4.9, 4.14 and 4.19 as well.
> I also tried running with the "WARN_ON(s_low != exp_low);" mentioned above without
> the fix, and forced a schedule before "mutex_lock(&rsp->exp_wake_mutex);" to simulate
> random latency from running on a VM. I was able to trigger the warning.
> 
> [  162.760480] WARNING: CPU: 2 PID: 1129 at kernel/rcu/tree_exp.h:549 rcu_exp_wait_wake+0x4a5/0x6c0
> [  162.760482] Modules linked in: rcutorture torture nls_utf8 isofs nf_log_ipv6 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables nf_log_ipv4 nf_log_common xt_LOG xt_limit ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c iptable_filter sunrpc crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ttm aesni_intel crypto_simd drm_kms_helper drm sg joydev syscopyarea sysfillrect virtio_balloon sysimgblt fb_sys_fops i2c_piix4 input_leds pcspkr qemu_fw_cfg loop binfmt_misc ip_tables ext4 mbcache jbd2 sd_mod sr_mod cdrom ata_generic pata_acpi virtio_net virtio_scsi ata_piix virtio_pci serio_raw libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log sha3_generic authenc cmac wp512 twofish_generic twofish_x86_64 twofish_common
> [  162.760509]  tea sha512_ssse3 sha512_generic sha256_ssse3 sha1_ssse3 serpent_avx2 serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic seed salsa20_generic rmd320 rmd256 rmd160 rmd128 michael_mic md4 khazad fcrypt dm_crypt dm_mod dax des_generic deflate cts crc32c_intel ccm cast6_avx_x86_64 cast6_generic cast_common camellia_generic ablk_helper cryptd xts lrw glue_helper blowfish_generic blowfish_common arc4 ansi_cprng fuse [last unloaded: rcu_kprobe]
> [  162.760524] CPU: 2 PID: 1129 Comm: kworker/2:3 Tainted: G        W  O    4.14.243-1.nutanix.20210810.test.el7.x86_64 #1
> [  162.760524] Hardware name: Nutanix AHV, BIOS 1.11.0-2.el7 04/01/2014
> [  162.760525] Workqueue: events wait_rcu_exp_gp
> [  162.760526] task: ffffa083e92745c0 task.stack: ffffb29442cb8000
> [  162.760527] RIP: 0010:rcu_exp_wait_wake+0x4a5/0x6c0
> [  162.760527] RSP: 0018:ffffb29442cbbde8 EFLAGS: 00010206
> [  162.760528] RAX: 0000000000000000 RBX: ffffffff932b43c0 RCX: 0000000000000000
> [  162.760529] RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
> [  162.760529] RBP: ffffb29442cbbe58 R08: ffffffff932b43c0 R09: ffffb29442cbbd70
> [  162.760530] R10: ffffb29442cbbba0 R11: 000000000000011b R12: ffffffff932b2440
> [  162.760531] R13: 000000000000157c R14: ffffffff932b4240 R15: 0000000000000003
> [  162.760531] FS:  0000000000000000(0000) GS:ffffa083efa80000(0000) knlGS:0000000000000000
> [  162.760532] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  162.760533] CR2: 00007f6d6d5160c8 CR3: 000000002320a001 CR4: 00000000001606e0
> [  162.760535] Call Trace:
> [  162.760537]  ? cpu_needs_another_gp+0x70/0x70
> [  162.760538]  wait_rcu_exp_gp+0x2b/0x30
> [  162.760539]  process_one_work+0x18f/0x3c0
> [  162.760540]  worker_thread+0x35/0x3c0
> [  162.760541]  kthread+0x128/0x140
> [  162.760542]  ? process_one_work+0x3c0/0x3c0
> [  162.760543]  ? __kthread_cancel_work+0x50/0x50
> [  162.760544]  ret_from_fork+0x35/0x40
> [  162.760545] Code: 4c 24 30 49 8b 94 24 10 13 04 00 48 c7 c7 d0 d7 05 93 0f 95 c0 48 2b 75 a8 44 0f be 80 d8 d2 05 93 e8 99 2f 70 00 e9 ae fe ff ff <0f> 0b e9 ec fc ff ff 65 8b 05 2d 40 f1 6d 89 c0 48 0f a3 05 d3
> [  162.760570] ---[ end trace 2cc2ddd257a55220 ]---
> 
> The triggered warning means that the waker skipped the slot it was supposed to call
> wake_up_all on, which could result in a missed wakeup.
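
The slot mismatch described above can be illustrated with a minimal
user-space sketch. It is not kernel code: the helper names are hypothetical,
and the shift of 2 is an assumption modelled on RCU_SEQ_CTR_SHIFT in
kernel/rcu/rcu.h. It only shows how indexing exp_wq[] by the current global
sequence, rather than by the snapshot s the waiters used, can select a
different slot once the sequence has advanced.

```c
#include <stdio.h>

/* Assumption: the low 2 bits of the sequence number hold grace-period
 * state, mirroring RCU_SEQ_CTR_SHIFT in kernel/rcu/rcu.h. */
#define SEQ_CTR_SHIFT 2

/* Counter portion of a sequence number (stand-in for rcu_seq_ctr()). */
static unsigned long seq_ctr(unsigned long s)
{
	return s >> SEQ_CTR_SHIFT;
}

/* Which of the four exp_wq[] slots a given sequence value maps to. */
static unsigned int exp_wq_slot(unsigned long s)
{
	return seq_ctr(s) & 0x3;
}

/* Pre-fix behaviour: the waker indexes by the current global sequence. */
static unsigned int buggy_wake_slot(unsigned long s, unsigned long current_seq)
{
	(void)s;
	return exp_wq_slot(current_seq);
}

/* Post-fix behaviour: the waker indexes by the snapshot s. */
static unsigned int fixed_wake_slot(unsigned long s, unsigned long current_seq)
{
	(void)current_seq;
	return exp_wq_slot(s);
}

/* Print the slot the waiters sleep on and the slot each waker variant wakes. */
static void report(unsigned long s, unsigned long current_seq)
{
	printf("waiters: %u, buggy waker: %u, fixed waker: %u\n",
	       exp_wq_slot(s),
	       buggy_wake_slot(s, current_seq),
	       fixed_wake_slot(s, current_seq));
}
```

Calling report(8, 12) models a waker that was delayed until the next
expedited grace period had bumped the global sequence: the waiters sleep on
slot 2, the pre-fix waker wakes slot 3, and the post-fix waker wakes slot 2.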

Ok, now queued up, thanks.

greg k-h