On 18/07/2021 22:59, Paul E. McKenney wrote: > On Sun, Jul 18, 2021 at 11:03:51PM +0200, Oleksandr Natalenko wrote: >> + stable@xxxxxxxxxxxxxxx >> >> On neděle 18. července 2021 23:01:24 CEST Oleksandr Natalenko wrote: >>> Hello. >>> >>> On sobota 17. července 2021 22:22:08 CEST Chris Clayton wrote: >>>> I checked the output from dmesg yesterday and found the following warning: >>>> >>>> [Fri Jul 16 09:15:29 2021] ------------[ cut here ]------------ >>>> [Fri Jul 16 09:15:29 2021] WARNING: CPU: 11 PID: 2701 at >>>> kernel/rcu/tree_plugin.h:359 rcu_note_context_switch+0x37/0x3d0 [Fri Jul >>>> 16 > > I am not seeing a warning at line 359 of either v5.13.2 or v5.12.7. > Mmm, in the 5.13.2 tarball downloaded from https://cdn.kernel.org/pub/linux/kernel/v5.x/ I see: 350 */ 351 void rcu_note_context_switch(bool preempt) 352 { 353 struct task_struct *t = current; 354 struct rcu_data *rdp = this_cpu_ptr(&rcu_data); 355 struct rcu_node *rnp; 356 357 trace_rcu_utilization(TPS("Start context switch")); 358 lockdep_assert_irqs_disabled(); 359 WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0); 360 if (rcu_preempt_depth() > 0 && 361 !t->rcu_read_unlock_special.b.blocked) { 362 >>>> 09:15:29 2021] Modules linked in: uas hidp rfcomm bnep xt_MASQUERADE >>>> iptable_nat nf_nat xt_LOG nf_log_syslog xt_limit xt_multiport xt_conntrack >>>> iptable_filter btusb btintel wmi_bmof uvcvideo videobuf2_vmalloc >>>> videobuf2_memops videobuf2_v4l2 videobuf2_common coretemp hwmon >>>> snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek >>>> snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg >>>> snd_hda_codec snd_hwdep snd_hda_core i2c_i801 i2c_smbus iwlmvm mac80211 >>>> iwlwifi i915 mei_me mei cfg80211 intel_lpss_pci intel_lpss wmi >>>> nf_conntrack_ftp xt_helper nf_conntrack nf_defrag_ipv4 tun >>>> [Fri Jul 16 09:15:29 2021] CPU: 11 PID: 2701 Comm: lpqd Not tainted 5.13.2 >>>> #1 [Fri Jul 16 09:15:29 2021] Hardware name: Notebook >>>> >>>> NP50DE_DB /NP50DE_DB , BIOS 1.07.04 02/17/2020 >>>> >>>> [Fri Jul 16 09:15:29 2021] RIP: 0010:rcu_note_context_switch+0x37/0x3d0 >>>> [Fri Jul 16 09:15:29 2021] Code: 02 00 e8 ec a0 6c 00 89 c0 65 4c 8b 2c 25 >>>> 00 6d 01 00 48 03 1c c5 80 56 e1 b6 40 84 ed 75 0d 41 8b 95 04 03 00 00 85 >>>> d2 7e 02 <0f> 0b 65 48 8b 04 25 00 6d 01 00 8b 80 04 03 00 00 85 c0 7e 0a >>>> 41 [Fri Jul 16 09:15:29 2021] RSP: 0000:ffffb5d483837c70 EFLAGS: 00010002 >>>> [Fri Jul 16 09:15:29 2021] RAX: 000000000000000b RBX: ffff9b77806e1d80 >>>> RCX: >>>> 0000000000000100 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000001 RSI: >>>> ffffffffb6d82ead RDI: ffffffffb6da5e4e [Fri Jul 16 09:15:29 2021] RBP: >>>> 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [Fri Jul 16 >>>> 09:15:29 2021] R10: 000000067bce4fff R11: 0000000000000000 R12: >>>> ffff9b77806e1100 [Fri Jul 16 09:15:29 2021] R13: ffff9b734a833a00 R14: >>>> ffff9b734a833a00 R15: 0000000000000000 [Fri Jul 16 09:15:29 2021] FS: >>>> 00007fccbfc5fe40(0000) GS:ffff9b77806c0000(0000) knlGS:0000000000000000 >>>> [Fri Jul 16 09:15:29 2021] CS: 0010 DS: 0000 ES: 0000 CR0: >>>> 0000000080050033 [Fri Jul 16 09:15:29 2021] CR2: 00007fccc2db7290 CR3: >>>> 00000003fb0b8002 CR4: 00000000007706e0 [Fri Jul 16 09:15:29 2021] PKRU: >>>> 55555554 >>>> [Fri Jul 16 09:15:29 2021] Call Trace: >>>> [Fri Jul 16 09:15:29 2021] __schedule+0x86/0x810 >>>> [Fri Jul 16 09:15:29 2021] schedule+0x40/0xe0 >>>> [Fri Jul 16 09:15:29 2021] io_schedule+0x3d/0x60 >>>> [Fri Jul 16 09:15:29 2021] wait_on_page_bit_common+0x129/0x390 >>>> [Fri Jul 16 09:15:29 2021] ? __filemap_set_wb_err+0x10/0x10 >>>> [Fri Jul 16 09:15:29 2021] __lock_page_or_retry+0x13f/0x1d0 >>>> [Fri Jul 16 09:15:29 2021] do_swap_page+0x335/0x5b0 >>>> [Fri Jul 16 09:15:29 2021] __handle_mm_fault+0x444/0xb20 >>>> [Fri Jul 16 09:15:29 2021] handle_mm_fault+0x5c/0x170 >>>> [Fri Jul 16 09:15:29 2021] ? find_vma+0x5b/0x70 >>>> [Fri Jul 16 09:15:29 2021] exc_page_fault+0x1ab/0x610 >>>> [Fri Jul 16 09:15:29 2021] ? fpregs_assert_state_consistent+0x19/0x40 >>>> [Fri Jul 16 09:15:29 2021] ? asm_exc_page_fault+0x8/0x30 >>>> [Fri Jul 16 09:15:29 2021] asm_exc_page_fault+0x1e/0x30 >>>> [Fri Jul 16 09:15:29 2021] RIP: 0033:0x7fccc2d3c520 >>>> [Fri Jul 16 09:15:29 2021] Code: 68 4c 00 00 00 e9 20 fb ff ff ff 25 7a ad >>>> 07 00 68 4d 00 00 00 e9 10 fb ff ff ff 25 72 ad 07 00 68 4e 00 00 00 e9 00 >>>> fb ff ff <ff> 25 6a ad 07 00 68 4f 00 00 00 e9 f0 fa ff ff ff 25 62 ad 07 >>>> 00 [Fri Jul 16 09:15:29 2021] RSP: 002b:00007ffebd529048 EFLAGS: 00010293 >>>> [Fri Jul 16 09:15:29 2021] RAX: 0000000000000001 RBX: 00007fccc46e2890 >>>> RCX: >>>> 0000000000000010 [Fri Jul 16 09:15:29 2021] RDX: 0000000000000010 RSI: >>>> 0000000000000000 RDI: 00007fccc46e2890 [Fri Jul 16 09:15:29 2021] RBP: >>>> 000056264f1dd4a0 R08: 000056264f21aba0 R09: 000056264f1f58a0 [Fri Jul 16 >>>> 09:15:29 2021] R10: 0000000000000007 R11: 0000000000000246 R12: >>>> 000056264f21ac00 [Fri Jul 16 09:15:29 2021] R13: 000056264f1e0a30 R14: >>>> 00007ffebd529080 R15: 00000000000dd87b [Fri Jul 16 09:15:29 2021] ---[ end >>>> trace c8b06e067d8b0fc2 ]--- >>>> >>>> At the time the warning was issued I was creating a (weekly) backup of my >>>> linux system (home-brewed based on the guidance from Linux From Scratch). >>>> My backup routine is completed by copying the archive files (created with >>>> dar) and a directory that contains about 7000 source and binary rpm files >>>> to an external USB drive. I didn't spot the warning until later in the >>>> day, >>>> so I'm not sure exactly where I was in my backup process. >>>> >>>> I haven't seen this warning before. Consequently, I don;t know how easy >>>> (or >>>> otherwise) it is to reproduce. >>>> >>>> Let me know if I can provide any additional diagnostics, but please cc me >>>> as I'm not subscribed. >>> >>> Confirming the same for me with v5.13.2, and cross-referencing another >>> report [1] against v5.12.17. >>> >>> Also Cc'ing relevant people on this. >>> >>> Thanks. >>> >>> [1] >>> https://lore.kernel.org/lkml/CAK2bqVK0Q9YcpakE7_Rc6nr-E4e2GnMOgi5jJj=_Eh_1k >>> EHLHA@xxxxxxxxxxxxxx/ > > But this one does show this warning in v5.12.17: > > WARN_ON_ONCE(!preempt && rcu_preempt_depth() > 0); > > This is in rcu_note_context_switch(), and could be caused by something > like a schedule() within an RCU read-side critical section. This would > of course be RCU-usage bugs, given that you are not permitted to block > within an RCU read-side critical section. > > I suggest checking the functions in the stack trace to see where the > rcu_read_lock() is hiding. CONFIG_PROVE_LOCKING might also be helpful. > > Thanx, Paul >