Re: mdadm resync causes stable system to crash every 2 or 3 hours

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 7, 2021 at 5:18 AM Roman Mamedov <rm@xxxxxxxxxxx> wrote:
>
> On Tue, 7 Sep 2021 12:52:01 +0500
> Roman Mamedov <rm@xxxxxxxxxxx> wrote:
>
> > On Mon, 6 Sep 2021 20:44:31 -0400
> > Ryan Patterson <ryan.goat@xxxxxxxxx> wrote:
> >
> > > My file server is usually very stable.  The past week I had two mdadm
> > > arrays that required recync operations.
> > > * newly created raid6 array (14 x 16TB seagate exos)
> > > * existing raid 6 array, after a reboot resync on hot spare (14 x 4TB
> > > seagate barracuda)
> > >
> > > During both resync operations (they ran one at a time) the system
> > > would routinely experience a major error and require a hard reboot,
> > > every two or three hours.  I saw several errors, such as:
> > > * kernel watchdog soft lockups [md127_raid6:364]
> > > * general protection faults (I have a few saved with the full exception stack)
> > > * exceptions in iommu routines (again I have the full error with
> > > exception stack saved)
> > > * full system lockup
> >
> > So in other words the server is very stable, unless asked to do full-speed
> > reads from all disks at the same time.
> >
> > I'd suggest to check or improve cooling on the HBA cards, and then try a
> > different PSU.
>
> Also the motherboard chipset cooling, since that's a lot of PCI-E traffic.
> Maybe the CPU cooling as well, or at least check the CPU temperatures during
> this load.
>
> And since you have full logs and backtraces, there's no point in waiting to
> post those, just go ahead. Maybe they will point to something other than
> suspect hardware, or at least to which part of hardware to suspect.
>
> --
> With respect,
> Roman

Thanks for the suggestions.  Hardware overheating might be my problem.
I have several (loud) case fans blowing away.  But the HBA cards and
mobo southbridge are only passively cooled.  Maybe I could mount fans
on each cards' headsink.  I'll investigate.

The power supply is not an off the shelf job.  So I don't convientanly
have a replacement to try.  I might have to bite the bullet and buy a
second.

I forgot to put in my original note that I ran memtest86 on this
machine for four full cycles with no faults found.  Also nothing is
overclocked.

Here are some of the errors I recorded.  Maybe somebody can see a
pattern in them...

[Thu Sep  2 22:05:25 2021] md: resync of RAID array md127
[Thu Sep  2 23:58:58 2021] perf: interrupt took too long (2570 >
2500), lowering kernel.perf_event_max_sample_rate to 77750
[Fri Sep  3 02:14:38 2021] general protection fault, probably for
non-canonical address 0x1000000000000: 0000 [#1] SMP PTI
[Fri Sep  3 02:14:38 2021] CPU: 3 PID: 371 Comm: md127_raid6 Not
tainted 5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 02:14:38 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 02:14:38 2021] RIP: 0010:bio_associate_blkg+0x17/0x70
[Fri Sep  3 02:14:38 2021] Code: dc fe ff ff 48 89 c3 e9 05 ff ff ff
0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 8b 47 48 48 85 c0
74 17 48 85 ff 74 3d <48> 8b 70 28 e8 d0 fc ff ff 48 83 c4 08 e9 b7 c1
cc ff 48 89 3c 24
[Fri Sep  3 02:14:38 2021] RSP: 0018:ffffbde280747b58 EFLAGS: 00010286
[Fri Sep  3 02:14:38 2021] RAX: 0001000000000000 RBX: ffff95f8334a8000
RCX: 0000000000000000
[Fri Sep  3 02:14:38 2021] RDX: 0000000000000001 RSI: 00000000000000c0
RDI: ffff95f8334a8c60
[Fri Sep  3 02:14:38 2021] RBP: ffff95f700dfb400 R08: 0000000000000001
R09: 00000000fffffffd
[Fri Sep  3 02:14:38 2021] R10: 0000000000000000 R11: ffff95f8334a8c60
R12: 0000000000000b80
[Fri Sep  3 02:14:38 2021] R13: 0000000000000000 R14: ffff95f741a09000
R15: ffff95f8334a8000
[Fri Sep  3 02:14:38 2021] FS:  0000000000000000(0000)
GS:ffff95fe1f2c0000(0000) knlGS:0000000000000000
[Fri Sep  3 02:14:38 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Sep  3 02:14:38 2021] CR2: 00007f3cbf016514 CR3: 00000005c740a001
CR4: 00000000001706e0
[Fri Sep  3 02:14:38 2021] Call Trace:
[Fri Sep  3 02:14:38 2021]  ops_run_io+0x43d/0xd20 [raid456]
[Fri Sep  3 02:14:38 2021]  ? irq_init_percpu_irqstack+0x80/0x100
[Fri Sep  3 02:14:38 2021]  handle_stripe+0xd9e/0x1fd0 [raid456]
[Fri Sep  3 02:14:38 2021]
handle_active_stripes.constprop.0+0x3af/0x590 [raid456]
[Fri Sep  3 02:14:38 2021]  raid5d+0x375/0x5a0 [raid456]
[Fri Sep  3 02:14:38 2021]  md_thread+0x94/0x160 [md_mod]
[Fri Sep  3 02:14:38 2021]  ? add_wait_queue_exclusive+0x70/0x70
[Fri Sep  3 02:14:38 2021]  ? md_write_inc+0x50/0x50 [md_mod]
[Fri Sep  3 02:14:38 2021]  kthread+0x11b/0x140
[Fri Sep  3 02:14:38 2021]  ? __kthread_bind_mask+0x60/0x60
[Fri Sep  3 02:14:38 2021]  ret_from_fork+0x22/0x30
[Fri Sep  3 02:14:38 2021] Modules linked in: ipt_REJECT
nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables nfnetlink
rfkill nls_ascii nls_cp437 vfat fat snd_hda_codec_realtek
snd_hda_codec_generic mei_hdcp intel_rapl_msr snd_hda_codec_hdmi led
trig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel
soundwire_generic_allocation snd_soc_core snd_compress
soundwire_cadence snd_hda_codec snd_hda_core snd_hwdep soundwire_bus
snd_pcm intel_rapl_common x86_pkg_temp_thermal intel_powerclamp
coretemp
 kvm_intel iTCO_wdt snd_timer kvm sg mei_me mei snd intel_pmc_bxt
soundcore irqbypass iTCO_vendor_support at24 watchdog evdev pcspkr
efi_pstore rapl intel_cstate intel_uncore sunrpc msr fuse configfs
efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache
 jbd2 btrfs blake2b_generic dm_crypt dm_mod raid10 raid1 raid0
multipath linear raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_pq libcrc32c crc32c_generic md_mod sd_mod
t10_pi crc_t10dif crct10dif_generic hid_generic usbhid
hid
[Fri Sep  3 02:14:38 2021]  i915 crct10dif_pclmul crct10dif_common
crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel i2c_algo_bit
drm_kms_helper libaes mxm_wmi crypto_simd ahci libahci mpt3sas
i2c_i801 cryptd xhci_pci glue_helper i2c_smbus cec xh
ci_hcd libata drm raid_class ehci_pci scsi_transport_sas lpc_ich
ehci_hcd e1000e scsi_mod usbcore ptp pps_core usb_common fan wmi video
button
[Fri Sep  3 02:14:38 2021] ---[ end trace 5cc209277d97aabd ]---
[Fri Sep  3 02:14:38 2021] RIP: 0010:bio_associate_blkg+0x17/0x70
[Fri Sep  3 02:14:38 2021] Code: dc fe ff ff 48 89 c3 e9 05 ff ff ff
0f 1f 80 00 00 00 00 0f 1f 44 00 00 48 83 ec 08 48 8b 47 48 48 85 c0
74 17 48 85 ff 74 3d <48> 8b 70 28 e8 d0 fc ff ff 48 83 c4 08 e9 b7 c1
cc ff 48 89 3c 24
[Fri Sep  3 02:14:38 2021] RSP: 0018:ffffbde280747b58 EFLAGS: 00010286
[Fri Sep  3 02:14:38 2021] RAX: 0001000000000000 RBX: ffff95f8334a8000
RCX: 0000000000000000
[Fri Sep  3 02:14:38 2021] RDX: 0000000000000001 RSI: 00000000000000c0
RDI: ffff95f8334a8c60
[Fri Sep  3 02:14:38 2021] RBP: ffff95f700dfb400 R08: 0000000000000001
R09: 00000000fffffffd
[Fri Sep  3 02:14:38 2021] R10: 0000000000000000 R11: ffff95f8334a8c60
R12: 0000000000000b80
[Fri Sep  3 02:14:38 2021] R13: 0000000000000000 R14: ffff95f741a09000
R15: ffff95f8334a8000
[Fri Sep  3 02:14:38 2021] FS:  0000000000000000(0000)
GS:ffff95fe1f2c0000(0000) knlGS:0000000000000000
[Fri Sep  3 02:14:38 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Sep  3 02:14:38 2021] CR2: 00007f3cbf016514 CR3: 000000010828e003
CR4: 00000000001706e0
[Fri Sep  3 02:14:38 2021] ------------[ cut here ]------------
[Fri Sep  3 02:14:38 2021] WARNING: CPU: 3 PID: 371 at
kernel/exit.c:725 do_exit+0x47/0xaa0
[Fri Sep  3 02:14:38 2021] Modules linked in: ipt_REJECT
nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables nfnetlink
rfkill nls_ascii nls_cp437 vfat fat snd_hda_codec_realtek
snd_hda_codec_generic mei_hdcp intel_rapl_msr snd_hda_codec_hdmi led
trig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel
soundwire_generic_allocation snd_soc_core snd_compress
soundwire_cadence snd_hda_codec snd_hda_core snd_hwdep soundwire_bus
snd_pcm intel_rapl_common x86_pkg_temp_thermal intel_powerclamp
coretemp
 kvm_intel iTCO_wdt snd_timer kvm sg mei_me mei snd intel_pmc_bxt
soundcore irqbypass iTCO_vendor_support at24 watchdog evdev pcspkr
efi_pstore rapl intel_cstate intel_uncore sunrpc msr fuse configfs
efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache
 jbd2 btrfs blake2b_generic dm_crypt dm_mod raid10 raid1 raid0
multipath linear raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_pq libcrc32c crc32c_generic md_mod sd_mod
t10_pi crc_t10dif crct10dif_generic hid_generic usbhid
hid
[Fri Sep  3 02:14:38 2021]  i915 crct10dif_pclmul crct10dif_common
crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel i2c_algo_bit
drm_kms_helper libaes
mxm_wmi crypto_simd ahci libahci mpt3sas i2c_i801 cryptd xhci_pci
glue_helper i2c_smbus cec xh
ci_hcd libata drm raid_class ehci_pci scsi_transport_sas lpc_ich
ehci_hcd e1000e scsi_mod usbcore ptp pps_core usb_common fan wmi video
button
[Fri Sep  3 02:14:38 2021] CPU: 3 PID: 371 Comm: md127_raid6 Tainted:
G      D           5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 02:14:38 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 02:14:38 2021] RIP: 0010:do_exit+0x47/0xaa0
[Fri Sep  3 02:14:38 2021] Code: ec 40 65 48 8b 04 25 28 00 00 00 48
89 44 24 38 31 c0 48 8b 83 b8 0b 00 00 48 85 c0 74 0e 48 8b 10 48 39
d0 0f 84 64 04 00 00 <
0f> 0b 65 8b 0d d0 b5 38 68 89 c8 25 00 ff ff 00 89 44 24 0c 0f 85
[Fri Sep  3 02:14:38 2021] RSP: 0018:ffffbde280747ee0 EFLAGS: 00010216
[Fri Sep  3 02:14:38 2021] RAX: ffffbde280747e50 RBX: ffff95f7b8619780
RCX: ffff95fe1f2d8a08
[Fri Sep  3 02:14:38 2021] RDX: ffff95f769981fc8 RSI: 0000000000000027
RDI: 000000000000000b
[Fri Sep  3 02:14:38 2021] RBP: 000000000000000b R08: 0000000000000000
R09: ffffbde280747798
[Fri Sep  3 02:14:38 2021] R10: ffffbde280747790 R11: ffffffff992cb3e8
R12: 000000000000000b
[Fri Sep  3 02:14:38 2021] R13: 0000000000000000 R14: ffff95f7b8619780
R15: 0000000000000000
[Fri Sep  3 02:14:38 2021] FS:  0000000000000000(0000)
GS:ffff95fe1f2c0000(0000) knlGS:0000000000000000
[Fri Sep  3 02:14:38 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Sep  3 02:14:38 2021] CR2: 00007f3cbf016514 CR3: 000000010828e003
CR4: 00000000001706e0
[Fri Sep  3 02:14:38 2021] Call Trace:
[Fri Sep  3 02:14:38 2021]  ? md_write_inc+0x50/0x50 [md_mod]
[Fri Sep  3 02:14:38 2021]  ? kthread+0x11b/0x140
[Fri Sep  3 02:14:38 2021]  rewind_stack_do_exit+0x17/0x20
[Fri Sep  3 02:14:38 2021] RIP: 0000:0x0
[Fri Sep  3 02:14:38 2021] Code: Unable to access opcode bytes at RIP
0xffffffffffffffd6.
[Fri Sep  3 02:14:38 2021] RSP: 0000:0000000000000000 EFLAGS: 00000000
ORIG_RAX: 0000000000000000
[Fri Sep  3 02:14:38 2021] RAX: 0000000000000000 RBX: 0000000000000000
RCX: 0000000000000000
[Fri Sep  3 02:14:38 2021] RDX: 0000000000000000 RSI: 0000000000000000
RDI: 0000000000000000
[Fri Sep  3 02:14:38 2021] RBP: 0000000000000000 R08: 0000000000000000
R09: 0000000000000000
[Fri Sep  3 02:14:38 2021] R10: 0000000000000000 R11: 0000000000000000
R12: 0000000000000000
[Fri Sep  3 02:14:38 2021] R13: 0000000000000000 R14: 0000000000000000
R15: 0000000000000000
[Fri Sep  3 02:14:38 2021] ---[ end trace 5cc209277d97aabe ]---

Message from syslogd@wxyz at Sep  3 07:30:26 ...
 kernel:[ 3191.941392] watchdog: BUG: soft lockup - CPU#2 stuck for
22s! [md127_raid6:364]

Message from syslogd@wxyz at Sep  3 07:30:26 ...
 kernel:[ 3191.941392] watchdog: BUG: soft lockup - CPU#2 stuck for
22s! [md127_raid6:364]

[Fri Sep  3 09:05:56 2021] ------------[ cut here ]------------
[Fri Sep  3 09:05:56 2021] WARNING: CPU: 7 PID: 284 at
drivers/iommu/intel/iommu.c:2440 __domain_mapping.cold+0x4e/0x55
[Fri Sep  3 09:05:56 2021] Modules linked in: xfs ipt_REJECT
nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables nfnetlink
rfkill nls_ascii nls_cp437 v
fat fat intel_rapl_msr mei_hdcp snd_hda_codec_realtek
snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel
snd_intel_dspcfg soundwire_intel inte
l_rapl_common soundwire_generic_allocation snd_soc_core snd_compress
x86_pkg_temp_thermal intel_powerclamp soundwire_cadence snd_hda_codec
snd_hda_core coretemp
 snd_hwdep soundwire_bus snd_pcm snd_timer kvm_intel snd iTCO_wdt
intel_pmc_bxt iTCO_vendor_support watchdog soundcore kvm sg mei_me mei
at24 irqbypass rapl int
el_cstate intel_uncore pcspkr evdev efi_pstore sunrpc msr fuse
configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2
btrfs blake2b_generic dm_cry
pt dm_mod raid10 raid1 raid0 multipath linear raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor
raid6_pq libcrc32c crc32c_generic md_mod sd
_mod t10_pi crc_t10dif crct10dif_generic hid_generic usbhid
[Fri Sep  3 09:05:56 2021]  hid i915 crct10dif_pclmul crct10dif_common
crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel xhci_pci
ahci libahci i2c_algo_
bit mxm_wmi xhci_hcd drm_kms_helper libaes crypto_simd libata cryptd
glue_helper mpt3sas cec ehci_pci ehci_hcd drm i2c_i801 i2c_smbus
raid_class usbcore scsi_tr
ansport_sas e1000e lpc_ich scsi_mod ptp pps_core usb_common fan wmi video button
[Fri Sep  3 09:05:56 2021] CPU: 7 PID: 284 Comm: kworker/7:1H Tainted:
G        W         5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 09:05:56 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 09:05:56 2021] Workqueue: kblockd blk_mq_run_work_fn
[Fri Sep  3 09:05:56 2021] RIP: 0010:__domain_mapping.cold+0x4e/0x55
[Fri Sep  3 09:05:56 2021] Code: 68 d6 fd ff 8b 05 bd 97 f2 00 4c 8b
0c 24 4c 8b 5c 24 08 4c 8b 54 24 10 85 c0 4c 8b 44 24 18 74 09 83 e8
01 89 05 9d 97 f2 00 <
0f> 0b e9 4e 29 d6 ff 4c 89 ea 4c 89 f6 48 c7 c7 70 dd b4 b0 e8 29
[Fri Sep  3 09:05:56 2021] RSP: 0018:ffffa62d80787b28 EFLAGS: 00010246
[Fri Sep  3 09:05:56 2021] RAX: 0000000000000000 RBX: 00000001720a0003
RCX: 0000000000000000
[Fri Sep  3 09:05:56 2021] RDX: 0000000000000000 RSI: ffff968f1f3d8a00
RDI: ffff968f1f3d8a00
[Fri Sep  3 09:05:56 2021] RBP: ffff96883feee068 R08: 0000000000000073
R09: ffff96883feee000
[Fri Sep  3 09:05:56 2021] R10: 00000000001720a0 R11: ffff968819832100
R12: 0000000000000001
[Fri Sep  3 09:05:56 2021] R13: ffff9689b51af180 R14: 00000000000df60d
R15: 0000000000000003
[Fri Sep  3 09:05:56 2021] FS:  0000000000000000(0000)
GS:ffff968f1f3c0000(0000) knlGS:0000000000000000
[Fri Sep  3 09:05:56 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Sep  3 09:05:56 2021] CR2: 0000562a47c025b8 CR3: 000000018a00a002
CR4: 00000000001706e0
[Fri Sep  3 09:05:56 2021] Call Trace:
[Fri Sep  3 09:05:56 2021]  domain_mapping+0x1b/0xa0
[Fri Sep  3 09:05:56 2021]  intel_map_sg+0x120/0x220
[Fri Sep  3 09:05:56 2021]  dma_map_sg_attrs+0x45/0x50
[Fri Sep  3 09:05:56 2021]  scsi_dma_map+0x35/0x40 [scsi_mod]
[Fri Sep  3 09:05:56 2021]  _base_build_sg_scmd+0x68/0x2d0 [mpt3sas]
[Fri Sep  3 09:05:56 2021]  scsih_qcmd+0x38c/0x4b0 [mpt3sas]
[Fri Sep  3 09:05:56 2021]  scsi_queue_rq+0x31f/0xa60 [scsi_mod]
[Fri Sep  3 09:05:56 2021]  blk_mq_dispatch_rq_list+0x11a/0x7f0
[Fri Sep  3 09:05:56 2021]  ? asm_exc_xen_hypervisor_callback+0x20/0x20
[Fri Sep  3 09:05:56 2021]  ? elv_rb_del+0x1f/0x30
[Fri Sep  3 09:05:56 2021]  ? deadline_remove_request+0x55/0xb0
[Fri Sep  3 09:05:56 2021]  __blk_mq_do_dispatch_sched+0xb8/0x2c0
[Fri Sep  3 09:05:56 2021]  __blk_mq_sched_dispatch_requests+0x13f/0x190
[Fri Sep  3 09:05:56 2021]  blk_mq_sched_dispatch_requests+0x30/0x60
[Fri Sep  3 09:05:56 2021]  __blk_mq_run_hw_queue+0x47/0xe0
[Fri Sep  3 09:05:56 2021]  process_one_work+0x1b6/0x350
[Fri Sep  3 09:05:56 2021]  worker_thread+0x53/0x3e0
[Fri Sep  3 09:05:56 2021]  ? process_one_work+0x350/0x350
[Fri Sep  3 09:05:56 2021]  kthread+0x11b/0x140
[Fri Sep  3 09:05:56 2021]  ? __kthread_bind_mask+0x60/0x60
[Fri Sep  3 09:05:56 2021]  ret_from_fork+0x22/0x30
[Fri Sep  3 09:05:56 2021] ---[ end trace 3b8b2da3617ea49e ]---
[Fri Sep  3 09:05:56 2021] DMAR: DRHD: handling fault status reg 3
[Fri Sep  3 09:05:56 2021] DMAR: [DMA Write] Request device [01:00.0]
PASID ffffffff fault addr df60d000 [fault reason 05] PTE Write access
is not set
[Fri Sep  3 09:05:56 2021] DMAR: ERROR: DMA PTE for vPFN 0xdf60d
already set (to 1000000000000 not 1a4388003)
[Fri Sep  3 09:05:56 2021] ------------[ cut here ]------------

[Fri Sep  3 06:40:40 2021] md: resync of RAID array md127
[Fri Sep  3 07:30:00 2021] rcu: INFO: rcu_sched self-detected stall on CPU
[Fri Sep  3 07:30:00 2021] rcu:         2-....: (5249 ticks this GP)
idle=0c2/1/0x4000000000000000 softirq=56412/56412 fqs=2624
[Fri Sep  3 07:30:00 2021]      (t=5250 jiffies g=91613 q=8034)
[Fri Sep  3 07:30:00 2021] NMI backtrace for cpu 2
[Fri Sep  3 07:30:00 2021] CPU: 2 PID: 364 Comm: md127_raid6 Not
tainted 5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 07:30:00 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 07:30:00 2021] Call Trace:
[Fri Sep  3 07:30:00 2021]  <IRQ>
[Fri Sep  3 07:30:00 2021]  dump_stack+0x6b/0x83
[Fri Sep  3 07:30:00 2021]  nmi_cpu_backtrace.cold+0x32/0x69
[Fri Sep  3 07:30:00 2021]  ? lapic_can_unplug_cpu+0x80/0x80
[Fri Sep  3 07:30:00 2021]  nmi_trigger_cpumask_backtrace+0xd7/0xe0
[Fri Sep  3 07:30:00 2021]  rcu_dump_cpu_stacks+0xa2/0xd0
[Fri Sep  3 07:30:00 2021]  rcu_sched_clock_irq.cold+0x1ff/0x3d6
[Fri Sep  3 07:30:00 2021]  update_process_times+0x8c/0xc0
[Fri Sep  3 07:30:00 2021]  tick_sched_handle+0x22/0x60
[Fri Sep  3 07:30:00 2021]  tick_sched_timer+0x7c/0xb0
[Fri Sep  3 07:30:00 2021]  ? tick_do_update_jiffies64.part.0+0xc0/0xc0
[Fri Sep  3 07:30:00 2021]  __hrtimer_run_queues+0x12a/0x270
[Fri Sep  3 07:30:00 2021]  hrtimer_interrupt+0x110/0x2c0
[Fri Sep  3 07:30:00 2021]  __sysvec_apic_timer_interrupt+0x5f/0xd0
[Fri Sep  3 07:30:00 2021]  asm_call_irq_on_stack+0x12/0x20
[Fri Sep  3 07:30:00 2021]  </IRQ>
[Fri Sep  3 07:30:00 2021]  sysvec_apic_timer_interrupt+0x72/0x80
[Fri Sep  3 07:30:00 2021]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[Fri Sep  3 07:30:00 2021] RIP:
0010:native_queued_spin_lock_slowpath+0x1a2/0x1d0
[Fri Sep  3 07:30:00 2021] Code: 83 e0 03 83 ee 01 48 c1 e0 05 48 63
f6 48 05 80 c8 02 00 48 03 04 f5 00 89 97 a9 48 89 10 8b 42 08 85 c0
75 09 f3 90 8b 42 08
[Fri Sep  3 07:30:00 2021] RSP: 0018:ffffa90180533c30 EFLAGS: 00000246
[Fri Sep  3 07:30:00 2021] RAX: 0000000000000000 RBX: ffff89241ed21500
RCX: 00000000000c0000
[Fri Sep  3 07:30:00 2021] RDX: ffff892adf2ac880 RSI: 0000000000000003
RDI: ffff89241ed21568
[Fri Sep  3 07:30:00 2021] RBP: ffff8923e22fc800 R08: 00000000000c0000
R09: 00000000fffffffd
[Fri Sep  3 07:30:00 2021] R10: 0000000000000000 R11: 0000000000000000
R12: ffff89241ed21568
[Fri Sep  3 07:30:00 2021] R13: ffffa90180533dc0 R14: ffff8923e22fc800
R15: 0000000000000000
[Fri Sep  3 07:30:00 2021]  _raw_spin_lock+0x1a/0x20
[Fri Sep  3 07:30:00 2021]  handle_stripe+0x57c/0x1fd0 [raid456]
[Fri Sep  3 07:30:00 2021]
handle_active_stripes.constprop.0+0x3af/0x590 [raid456]
[Fri Sep  3 07:30:00 2021]  raid5d+0x375/0x5a0 [raid456]
[Fri Sep  3 07:30:00 2021]  md_thread+0x94/0x160 [md_mod]
[Fri Sep  3 07:30:00 2021]  ? add_wait_queue_exclusive+0x70/0x70
[Fri Sep  3 07:30:00 2021]  ? md_write_inc+0x50/0x50 [md_mod]
[Fri Sep  3 07:30:00 2021]  kthread+0x11b/0x140
[Fri Sep  3 07:30:00 2021]  ? __kthread_bind_mask+0x60/0x60
[Fri Sep  3 07:30:00 2021]  ret_from_fork+0x22/0x30
[Fri Sep  3 07:30:25 2021] watchdog: BUG: soft lockup - CPU#2 stuck
for 22s! [md127_raid6:364]
[Fri Sep  3 07:30:25 2021] Modules linked in: ipt_REJECT
nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables nfnetlink
rfkill nls_ascii nls_cp437 vfa
_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel
intel_rapl_common soundwire_generic_allocation snd_soc_core
x86_pkg_temp_thermal snd_compress s
m_intel snd_pcm kvm snd_timer snd irqbypass iTCO_wdt intel_pmc_bxt
soundcore mei_me mei rapl sg iTCO_vendor_support watchdog intel_cstate
intel_uncore pcspkr
 jbd2 btrfs blake2b_generic dm_crypt dm_mod raid10 raid1 raid0
multipath linear raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_p
hid
[Fri Sep  3 07:30:25 2021]  i915 crct10dif_pclmul crct10dif_common
crc32_pclmul crc32c_intel ghash_clmulni_intel i2c_algo_bit
drm_kms_helper xhci_pci cec xhci
bata raid_class cryptd glue_helper usbcore scsi_transport_sas i2c_i801
i2c_smbus scsi_mod ptp lpc_ich pps_core usb_common fan wmi video
button
[Fri Sep  3 07:30:25 2021] CPU: 2 PID: 364 Comm: md127_raid6 Not
tainted 5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 07:30:25 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 07:30:25 2021] RIP:
0010:native_queued_spin_lock_slowpath+0x1a2/0x1d0
[Fri Sep  3 07:30:25 2021] Code: 83 e0 03 83 ee 01 48 c1 e0 05 48 63
f6 48 05 80 c8 02 00 48 03 04 f5 00 89 97 a9 48 89 10 8b 42 08 85 c0
75 09 f3 90 8b 42 08
[Fri Sep  3 07:30:25 2021] RSP: 0018:ffffa90180533c30 EFLAGS: 00000246
[Fri Sep  3 07:30:25 2021] RAX: 0000000000000000 RBX: ffff89241ed21500
RCX: 00000000000c0000
[Fri Sep  3 07:30:25 2021] RDX: ffff892adf2ac880 RSI: 0000000000000003
RDI: ffff89241ed21568
[Fri Sep  3 07:30:25 2021] RBP: ffff8923e22fc800 R08: 00000000000c0000
R09: 00000000fffffffd
[Fri Sep  3 07:30:25 2021] R10: 0000000000000000 R11: 0000000000000000
R12: ffff89241ed21568
[Fri Sep  3 07:30:25 2021] R13: ffffa90180533dc0 R14: ffff8923e22fc800
R15: 0000000000000000
[Fri Sep  3 07:30:25 2021] FS:  0000000000000000(0000)
GS:ffff892adf280000(0000) knlGS:0000000000000000
[Fri Sep  3 07:30:25 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Sep  3 07:30:25 2021] CR2: 000055b5eb31cd98 CR3: 00000002c060a003
CR4: 00000000001706e0
[Fri Sep  3 07:30:25 2021] Call Trace:
[Fri Sep  3 07:30:25 2021]  _raw_spin_lock+0x1a/0x20
[Fri Sep  3 07:30:25 2021]  handle_stripe+0x57c/0x1fd0 [raid456]
[Fri Sep  3 07:30:25 2021]
handle_active_stripes.constprop.0+0x3af/0x590 [raid456]
[Fri Sep  3 07:30:25 2021]  raid5d+0x375/0x5a0 [raid456]
[Fri Sep  3 07:30:25 2021]  md_thread+0x94/0x160 [md_mod]
[Fri Sep  3 07:30:25 2021]  ? add_wait_queue_exclusive+0x70/0x70
[Fri Sep  3 07:30:25 2021]  ? md_write_inc+0x50/0x50 [md_mod]
[Fri Sep  3 07:30:25 2021]  kthread+0x11b/0x140
[Fri Sep  3 07:30:25 2021]  ? __kthread_bind_mask+0x60/0x60
[Fri Sep  3 07:30:25 2021]  ret_from_fork+0x22/0x30
[Fri Sep  3 07:30:53 2021] watchdog: BUG: soft lockup - CPU#2 stuck
for 22s! [md127_raid6:364]
[Fri Sep  3 07:30:53 2021] Modules linked in: ipt_REJECT
nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables nfnetlink
rfkill nls_ascii nls_cp437 vfa
_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel
intel_rapl_common soundwire_generic_allocation snd_soc_core
x86_pkg_temp_thermal snd_compress s
m_intel snd_pcm kvm snd_timer snd irqbypass iTCO_wdt intel_pmc_bxt
soundcore mei_me mei rapl sg iTCO_vendor_support watchdog intel_cstate
intel_uncore pcspkr
 jbd2 btrfs blake2b_generic dm_crypt dm_mod raid10 raid1 raid0
multipath linear raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_p
hid
[Fri Sep  3 07:30:53 2021]  i915 crct10dif_pclmul crct10dif_common
crc32_pclmul crc32c_intel ghash_clmulni_intel i2c_algo_bit
drm_kms_helper xhci_pci cec xhci
bata raid_class cryptd glue_helper usbcore scsi_transport_sas i2c_i801
i2c_smbus scsi_mod ptp lpc_ich pps_core usb_common fan wmi video
button
[Fri Sep  3 07:30:53 2021] CPU: 2 PID: 364 Comm: md127_raid6 Tainted:
G             L    5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 07:30:53 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 07:30:53 2021] RIP:
0010:native_queued_spin_lock_slowpath+0x1a2/0x1d0
[Fri Sep  3 07:30:53 2021] Code: 83 e0 03 83 ee 01 48 c1 e0 05 48 63
f6 48 05 80 c8 02 00 48 03 04 f5 00 89 97 a9 48 89 10 8b 42 08 85 c0
75 09 f3 90 8b 42 08
[Fri Sep  3 07:30:53 2021] RSP: 0018:ffffa90180533c30 EFLAGS: 00000246
[Fri Sep  3 07:30:53 2021] RAX: 0000000000000000 RBX: ffff89241ed21500
RCX: 00000000000c0000
[Fri Sep  3 07:30:53 2021] RDX: ffff892adf2ac880 RSI: 0000000000000003
RDI: ffff89241ed21568
[Fri Sep  3 07:30:53 2021] RBP: ffff8923e22fc800 R08: 00000000000c0000
R09: 00000000fffffffd
[Fri Sep  3 07:30:53 2021] R10: 0000000000000000 R11: 0000000000000000
R12: ffff89241ed21568
[Fri Sep  3 07:30:53 2021] R13: ffffa90180533dc0 R14: ffff8923e22fc800
R15: 0000000000000000
[Fri Sep  3 07:30:53 2021] FS:  0000000000000000(0000)
GS:ffff892adf280000(0000) knlGS:0000000000000000
[Fri Sep  3 07:30:53 2021] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Sep  3 07:30:53 2021] CR2: 000055b5eb31cd98 CR3: 00000002c060a003
CR4: 00000000001706e0
[Fri Sep  3 07:30:53 2021] Call Trace:
[Fri Sep  3 07:30:53 2021]  _raw_spin_lock+0x1a/0x20
[Fri Sep  3 07:30:53 2021]  handle_stripe+0x57c/0x1fd0 [raid456]
[Fri Sep  3 07:30:53 2021]
handle_active_stripes.constprop.0+0x3af/0x590 [raid456]
[Fri Sep  3 07:30:53 2021]  raid5d+0x375/0x5a0 [raid456]
[Fri Sep  3 07:30:53 2021]  md_thread+0x94/0x160 [md_mod]
[Fri Sep  3 07:30:53 2021]  ? add_wait_queue_exclusive+0x70/0x70
[Fri Sep  3 07:30:53 2021]  ? md_write_inc+0x50/0x50 [md_mod]
[Fri Sep  3 07:30:53 2021]  kthread+0x11b/0x140
[Fri Sep  3 07:30:53 2021]  ? __kthread_bind_mask+0x60/0x60
[Fri Sep  3 07:30:53 2021]  ret_from_fork+0x22/0x30
[Fri Sep  3 07:31:03 2021] rcu: INFO: rcu_sched self-detected stall on CPU
[Fri Sep  3 07:31:03 2021] rcu:         2-....: (21002 ticks this GP)
idle=0c2/1/0x4000000000000000 softirq=56412/56412 fqs=10500
[Fri Sep  3 07:31:03 2021]      (t=21003 jiffies g=91613 q=27314)
[Fri Sep  3 07:31:03 2021] NMI backtrace for cpu 2
[Fri Sep  3 07:31:03 2021] CPU: 2 PID: 364 Comm: md127_raid6 Tainted:
G             L    5.10.0-8-amd64 #1 Debian 5.10.46-4
[Fri Sep  3 07:31:03 2021] Hardware name: Gigabyte Technology Co.,
Ltd. Z87X-UD4H/Z87X-UD4H-CF, BIOS F9 03/18/2014
[Fri Sep  3 07:31:03 2021] Call Trace:
[Fri Sep  3 07:31:03 2021]  <IRQ>
[Fri Sep  3 07:31:03 2021]  dump_stack+0x6b/0x83
[Fri Sep  3 07:31:03 2021]  nmi_cpu_backtrace.cold+0x32/0x69
[Fri Sep  3 07:31:03 2021]  ? lapic_can_unplug_cpu+0x80/0x80
[Fri Sep  3 07:31:03 2021]  nmi_trigger_cpumask_backtrace+0xd7/0xe0
[Fri Sep  3 07:31:03 2021]  rcu_dump_cpu_stacks+0xa2/0xd0
[Fri Sep  3 07:31:03 2021]  rcu_sched_clock_irq.cold+0x1ff/0x3d6
[Fri Sep  3 07:31:03 2021]  update_process_times+0x8c/0xc0
[Fri Sep  3 07:31:03 2021]  tick_sched_handle+0x22/0x60
[Fri Sep  3 07:31:03 2021]  tick_sched_timer+0x7c/0xb0
[Fri Sep  3 07:31:03 2021]  ? tick_do_update_jiffies64.part.0+0xc0/0xc0
[Fri Sep  3 07:31:03 2021]  __hrtimer_run_queues+0x12a/0x270
[Fri Sep  3 07:31:03 2021]  hrtimer_interrupt+0x110/0x2c0
[Fri Sep  3 07:31:03 2021]  __sysvec_apic_timer_interrupt+0x5f/0xd0
[Fri Sep  3 07:31:03 2021]  asm_call_irq_on_stack+0x12/0x20
[Fri Sep  3 07:31:03 2021]  </IRQ>
[Fri Sep  3 07:31:03 2021]  sysvec_apic_timer_interrupt+0x72/0x80
[Fri Sep  3 07:31:03 2021]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[Fri Sep  3 07:31:03 2021] RIP:
0010:native_queued_spin_lock_slowpath+0x1a2/0x1d0
[Fri Sep  3 07:31:03 2021] Code: 83 e0 03 83 ee 01 48 c1 e0 05 48 63
f6 48 05 80 c8 02 00 48 03 04 f5 00 89 97 a9 48 89 10 8b 42 08 85 c0
75 09 f3 90 8b 42 08
[Fri Sep  3 07:31:03 2021] RSP: 0018:ffffa90180533c30 EFLAGS: 00000246
[Fri Sep  3 07:31:03 2021] RAX: 0000000000000000 RBX: ffff89241ed21500
RCX: 00000000000c0000
[Fri Sep  3 07:31:03 2021] RDX: ffff892adf2ac880 RSI: 0000000000000003
RDI: ffff89241ed21568
[Fri Sep  3 07:31:03 2021] RBP: ffff8923e22fc800 R08: 00000000000c0000
R09: 00000000fffffffd
[Fri Sep  3 07:31:03 2021] R10: 0000000000000000 R11: 0000000000000000
R12: ffff89241ed21568
[Fri Sep  3 07:31:03 2021] R13: ffffa90180533dc0 R14: ffff8923e22fc800
R15: 0000000000000000
[Fri Sep  3 07:31:03 2021]  _raw_spin_lock+0x1a/0x20
[Fri Sep  3 07:31:03 2021]  handle_stripe+0x57c/0x1fd0 [raid456]
[Fri Sep  3 07:31:03 2021]
handle_active_stripes.constprop.0+0x3af/0x590 [raid456]
[Fri Sep  3 07:31:03 2021]  raid5d+0x375/0x5a0 [raid456]
[Fri Sep  3 07:31:03 2021]  md_thread+0x94/0x160 [md_mod]
[Fri Sep  3 07:31:03 2021]  ? add_wait_queue_exclusive+0x70/0x70
[Fri Sep  3 07:31:03 2021]  ? md_write_inc+0x50/0x50 [md_mod]
[Fri Sep  3 07:31:03 2021]  kthread+0x11b/0x140
[Fri Sep  3 07:31:03 2021]  ? __kthread_bind_mask+0x60/0x60
[Fri Sep  3 07:31:03 2021]  ret_from_fork+0x22/0x30
[Fri Sep  3 07:31:29 2021] watchdog: BUG: soft lockup - CPU#2 stuck
for 23s! [md127_raid6:364]
[Fri Sep  3 07:31:29 2021] Modules linked in: ipt_REJECT
nf_reject_ipv4 xt_multiport nft_compat nft_counter nf_tables nfnetlink
rfkill nls_ascii nls_cp437 vfa
_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg soundwire_intel
intel_rapl_common soundwire_generic_allocation snd_soc_core
x86_pkg_temp_thermal snd_compress s
m_intel snd_pcm kvm snd_timer snd irqbypass iTCO_wdt intel_pmc_bxt
soundcore mei_me mei rapl sg iTCO_vendor_support watchdog intel_cstate
intel_uncore pcspkr
 jbd2 btrfs blake2b_generic dm_crypt dm_mod raid10 raid1 raid0
multipath linear raid456 async_raid6_recov async_memcpy async_pq
async_xor async_tx xor raid6_p
hid
[Fri Sep  3 07:31:29 2021]  i915 crct10dif_pclmul crct10dif_common
crc32_pclmul crc32c_intel ghash_clmulni_intel i2c_algo_bit
drm_kms_helper xhci_pci cec xhci
bata raid_class cryptd glue_helper usbcore scsi_transport_sas i2c_i801
i2c_smbus scsi_mod ptp lpc_ich pps_core usb_common fan wmi video
button
[Fri Sep  3 07:31:29 2021] CPU: 2 PID: 364 Comm: md127_raid6 Tainted:
G             L    5.10.0-8-amd64 #1 Debian 5.10.46-4

_____________
Ryan Patterson
May the wings of liberty never lose a feather.



[Index of Archives]     [Linux RAID Wiki]     [ATA RAID]     [Linux SCSI Target Infrastructure]     [Linux Block]     [Linux IDE]     [Linux SCSI]     [Linux Hams]     [Device Mapper]     [Device Mapper Cryptographics]     [Kernel]     [Linux Admin]     [Linux Net]     [GFS]     [RPM]     [git]     [Yosemite Forum]


  Powered by Linux