[Bug 217572] Initial blocked tasks causing deterioration over hours until (nearly) complete system lockup and data loss with PostgreSQL 13

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



https://bugzilla.kernel.org/show_bug.cgi?id=217572

--- Comment #17 from Ivan Mironov (mironov.ivan@xxxxxxxxx) ---
More of it:

[    0.000000] Linux version 6.5.5-200.fc38.x86_64
(mockbuild@d4d01d62c9c942e59de1ef4aa94df5a2) (gcc (GCC) 13.2.1 20230728 (Red
Hat 13.2.1-1), GNU ld version 2.39-9.fc38) #1 SMP PREEMPT_DYNAMIC Sun Sep 24
15:52:44 UTC 2023
[    0.000000] Command line: BOOT_IMAGE=(md/boot)/vmlinuz-6.5.5-200.fc38.x86_64
root=/dev/mapper/vg--bmsolv-root ro
rd.md.uuid=216337a3:789c28b0:81fbad29:6f190e56 rd.lvm.lv=vg-bmsolv/root
rd.md.uuid=252001b9:2095e731:f1dd5baa:8b672d56 clocksource=tsc tsc=reliable
amd_pstate=active rhgb quiet
...
[13024.332817] watchdog: BUG: soft lockup - CPU#8 stuck for 26s!
[rocksdb:low:6331]
[13024.332841] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4
nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject
wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel
nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill
tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif
vfat fat intel_rapl_msr intel_rapl_common snd_hda_intel edac_mce_amd
snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core kvm
snd_hwdep snd_pcm snd_timer irqbypass acpi_ipmi rapl snd wmi_bmof cdc_ether
ipmi_si usbnet soundcore ipmi_devintf mii k10temp i2c_piix4 ipmi_msghandler
joydev fuse loop xfs raid1 igb ast dca i2c_algo_bit crct10dif_pclmul
crc32_pclmul crc32c_intel nvme polyval_clmulni polyval_generic nvme_core
ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[13024.332904] CPU: 8 PID: 6331 Comm: rocksdb:low Not tainted
6.5.5-200.fc38.x86_64 #1
[13024.332906] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./X570D4U, BIOS P1.20 05/19/2021
[13024.332908] RIP: 0010:xas_load+0x45/0x50
[13024.332914] Code: 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 48
8d 68 fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d 00 00 <75> c7 eb
d9 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90
[13024.332917] RSP: 0018:ffff9c6541a1fbc0 EFLAGS: 00000206
[13024.332919] RAX: ffff8ee4773b8b6a RBX: ffff9c6541a1fbd8 RCX:
0000000000000002
[13024.332921] RDX: 0000000000000000 RSI: ffff8ed9bb019ff0 RDI:
ffff9c6541a1fbd8
[13024.332923] RBP: ffff8ed9bb019ff0 R08: 0000000000000000 R09:
000000000000131c
[13024.332924] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12:
0000000000009010
[13024.332925] R13: ffff8eee64944900 R14: 000000000000900f R15:
ffff9c6541a1fe70
[13024.332927] FS:  00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000)
knlGS:0000000000000000
[13024.332929] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13024.332931] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4:
0000000000750ee0
[13024.332933] PKRU: 55555554
[13024.332934] Call Trace:
[13024.332937]  <IRQ>
[13024.332941]  ? watchdog_timer_fn+0x1b8/0x220
[13024.332946]  ? __pfx_watchdog_timer_fn+0x10/0x10
[13024.332949]  ? __hrtimer_run_queues+0x112/0x2b0
[13024.332954]  ? hrtimer_interrupt+0xf8/0x230
[13024.332957]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[13024.332961]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[13024.332965]  </IRQ>
[13024.332966]  <TASK>
[13024.332968]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[13024.332974]  ? xas_load+0x45/0x50
[13024.332976]  filemap_get_read_batch+0x16e/0x250
[13024.332981]  filemap_get_pages+0xa6/0x630
[13024.332984]  ? srso_alias_return_thunk+0x5/0x7f
[13024.332988]  ? srso_alias_return_thunk+0x5/0x7f
[13024.332990]  ? touch_atime+0x48/0x1b0
[13024.332994]  ? srso_alias_return_thunk+0x5/0x7f
[13024.332996]  ? filemap_read+0x329/0x350
[13024.332999]  filemap_read+0xd9/0x350
[13024.333005]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[13024.333107]  xfs_file_read_iter+0x77/0xe0 [xfs]
[13024.333216]  vfs_read+0x201/0x350
[13024.333225]  __x64_sys_pread64+0x98/0xd0
[13024.333229]  do_syscall_64+0x60/0x90
[13024.333232]  ? srso_alias_return_thunk+0x5/0x7f
[13024.333236]  ? __irq_exit_rcu+0x4b/0xc0
[13024.333241]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[13024.333245] RIP: 0033:0x7f32aa721105
[13024.333264] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c 8b
55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b
[13024.333266] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX:
0000000000000011
[13024.333269] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX:
00007f32aa721105
[13024.333270] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI:
00000000000005c9
[13024.333272] RBP: 00007f32871f90f0 R08: 0000000000000000 R09:
00007f32871f9268
[13024.333273] R10: 000000000900f060 R11: 0000000000000293 R12:
000000000900f060
[13024.333275] R13: 000000000000131c R14: 00007f3283e6fc00 R15:
00007f3299010580
[13024.333280]  </TASK>
[13052.332283] watchdog: BUG: soft lockup - CPU#8 stuck for 52s!
[rocksdb:low:6331]
[13052.332303] Modules linked in: nft_masq nft_fib_inet nft_fib_ipv4
nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject
wireguard curve25519_x86_64 libcurve25519_generic ip6_udp_tunnel udp_tunnel
nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill
tcp_bbr ip_set nf_tables nfnetlink tun nct6775 nct6775_core hwmon_vid ipmi_ssif
vfat fat intel_rapl_msr intel_rapl_common snd_hda_intel edac_mce_amd
snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_hda_core kvm
snd_hwdep snd_pcm snd_timer irqbypass acpi_ipmi rapl snd wmi_bmof cdc_ether
ipmi_si usbnet soundcore ipmi_devintf mii k10temp i2c_piix4 ipmi_msghandler
joydev fuse loop xfs raid1 igb ast dca i2c_algo_bit crct10dif_pclmul
crc32_pclmul crc32c_intel nvme polyval_clmulni polyval_generic nvme_core
ghash_clmulni_intel ccp sha512_ssse3 wmi sp5100_tco nvme_common
[13052.332364] CPU: 8 PID: 6331 Comm: rocksdb:low Tainted: G             L    
6.5.5-200.fc38.x86_64 #1
[13052.332367] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./X570D4U, BIOS P1.20 05/19/2021
[13052.332368] RIP: 0010:xas_descend+0x3/0x90
[13052.332374] Code: 00 48 8b 57 10 48 89 07 48 c1 e8 20 48 89 57 08 e9 c2 79
02 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f b6 0e <48> 8b 57
08 48 d3 ea 83 e2 3f 89 d0 48 83 c0 04 48 8b 44 c6 08 48
[13052.332375] RSP: 0018:ffff9c6541a1fbb8 EFLAGS: 00000206
[13052.332377] RAX: ffff8ed9e8c56da2 RBX: ffff9c6541a1fbd8 RCX:
000000000000000c
[13052.332378] RDX: 0000000000000002 RSI: ffff8ed9e8c56da0 RDI:
ffff9c6541a1fbd8
[13052.332379] RBP: ffff8ed9e8c56da0 R08: 0000000000000000 R09:
000000000000131c
[13052.332380] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12:
0000000000009010
[13052.332381] R13: ffff8eee64944900 R14: 000000000000900f R15:
ffff9c6541a1fe70
[13052.332383] FS:  00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000)
knlGS:0000000000000000
[13052.332384] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13052.332385] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4:
0000000000750ee0
[13052.332387] PKRU: 55555554
[13052.332387] Call Trace:
[13052.332389]  <IRQ>
[13052.332391]  ? watchdog_timer_fn+0x1b8/0x220
[13052.332395]  ? __pfx_watchdog_timer_fn+0x10/0x10
[13052.332398]  ? __hrtimer_run_queues+0x112/0x2b0
[13052.332402]  ? hrtimer_interrupt+0xf8/0x230
[13052.332404]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[13052.332407]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[13052.332410]  </IRQ>
[13052.332411]  <TASK>
[13052.332412]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[13052.332418]  ? xas_descend+0x3/0x90
[13052.332420]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332423]  xas_load+0x41/0x50
[13052.332426]  filemap_get_read_batch+0x16e/0x250
[13052.332431]  filemap_get_pages+0xa6/0x630
[13052.332433]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332436]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332438]  ? touch_atime+0x48/0x1b0
[13052.332441]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332443]  ? filemap_read+0x329/0x350
[13052.332445]  filemap_read+0xd9/0x350
[13052.332451]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[13052.332548]  xfs_file_read_iter+0x77/0xe0 [xfs]
[13052.332633]  vfs_read+0x201/0x350
[13052.332641]  __x64_sys_pread64+0x98/0xd0
[13052.332643]  do_syscall_64+0x60/0x90
[13052.332646]  ? srso_alias_return_thunk+0x5/0x7f
[13052.332649]  ? __irq_exit_rcu+0x4b/0xc0
[13052.332652]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[13052.332655] RIP: 0033:0x7f32aa721105
[13052.332671] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c 8b
55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b
[13052.332673] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX:
0000000000000011
[13052.332675] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX:
00007f32aa721105
[13052.332676] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI:
00000000000005c9
[13052.332677] RBP: 00007f32871f90f0 R08: 0000000000000000 R09:
00007f32871f9268
[13052.332678] R10: 000000000900f060 R11: 0000000000000293 R12:
000000000900f060
[13052.332679] R13: 000000000000131c R14: 00007f3283e6fc00 R15:
00007f3299010580
[13052.332683]  </TASK>
[13059.632827] rcu: INFO: rcu_preempt self-detected stall on CPU
[13059.632832] rcu:     8-....: (60000 ticks this GP)
idle=5d8c/1/0x4000000000000000 softirq=3001365/3001365 fqs=27154
[13059.632836] rcu:     (t=60001 jiffies g=4535761 q=1170639 ncpus=32)
[13059.632838] CPU: 8 PID: 6331 Comm: rocksdb:low Tainted: G             L    
6.5.5-200.fc38.x86_64 #1
[13059.632840] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./X570D4U, BIOS P1.20 05/19/2021
[13059.632841] RIP: 0010:xas_load+0x2d/0x50
[13059.632847] Code: fa 55 53 48 89 fb e8 22 ff ff ff 48 89 c2 83 e2 03 48 83
fa 02 75 08 48 3d 00 10 00 00 77 07 5b 5d e9 77 77 02 00 0f b6 4b 10 <48> 8d 68
fe 38 48 fe 72 ec 48 89 ee 48 89 df e8 af fd ff ff 80 7d
[13059.632849] RSP: 0018:ffff9c6541a1fbc0 EFLAGS: 00000282
[13059.632851] RAX: ffff8ed9bb019ff2 RBX: ffff9c6541a1fbd8 RCX:
0000000000000000
[13059.632852] RDX: 0000000000000002 RSI: ffff8ed9e8c56da0 RDI:
ffff9c6541a1fbd8
[13059.632853] RBP: ffff8ed9e8c56da0 R08: 0000000000000000 R09:
000000000000131c
[13059.632854] R10: 0000000000000000 R11: ffff8ed9e3a9f538 R12:
0000000000009010
[13059.632855] R13: ffff8eee64944900 R14: 000000000000900f R15:
ffff9c6541a1fe70
[13059.632856] FS:  00007f32871ff6c0(0000) GS:ffff8ef8fec00000(0000)
knlGS:0000000000000000
[13059.632858] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13059.632859] CR2: 00007ef3cac01000 CR3: 0000000101d7e000 CR4:
0000000000750ee0
[13059.632860] PKRU: 55555554
[13059.632861] Call Trace:
[13059.632863]  <IRQ>
[13059.632865]  ? rcu_dump_cpu_stacks+0xc4/0x100
[13059.632869]  ? rcu_sched_clock_irq+0x4f2/0x1170
[13059.632872]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632875]  ? task_tick_fair+0x2fc/0x3f0
[13059.632879]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632881]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632883]  ? trigger_load_balance+0x73/0x390
[13059.632885]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632888]  ? update_process_times+0x74/0xb0
[13059.632891]  ? tick_sched_handle+0x21/0x60
[13059.632894]  ? tick_sched_timer+0x6f/0x90
[13059.632896]  ? __pfx_tick_sched_timer+0x10/0x10
[13059.632897]  ? __hrtimer_run_queues+0x112/0x2b0
[13059.632901]  ? hrtimer_interrupt+0xf8/0x230
[13059.632903]  ? __sysvec_apic_timer_interrupt+0x61/0x130
[13059.632906]  ? sysvec_apic_timer_interrupt+0x6d/0x90
[13059.632909]  </IRQ>
[13059.632910]  <TASK>
[13059.632911]  ? asm_sysvec_apic_timer_interrupt+0x1a/0x20
[13059.632916]  ? xas_load+0x2d/0x50
[13059.632918]  filemap_get_read_batch+0x16e/0x250
[13059.632923]  filemap_get_pages+0xa6/0x630
[13059.632925]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632928]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632929]  ? touch_atime+0x48/0x1b0
[13059.632933]  ? srso_alias_return_thunk+0x5/0x7f
[13059.632935]  ? filemap_read+0x329/0x350
[13059.632937]  filemap_read+0xd9/0x350
[13059.632944]  xfs_file_buffered_read+0x52/0xd0 [xfs]
[13059.633041]  xfs_file_read_iter+0x77/0xe0 [xfs]
[13059.633121]  vfs_read+0x201/0x350
[13059.633127]  __x64_sys_pread64+0x98/0xd0
[13059.633130]  do_syscall_64+0x60/0x90
[13059.633133]  ? srso_alias_return_thunk+0x5/0x7f
[13059.633135]  ? __irq_exit_rcu+0x4b/0xc0
[13059.633139]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[13059.633142] RIP: 0033:0x7f32aa721105
[13059.633158] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 d4 99 f8 ff 4c 8b
55 e0 48 8b 55 e8 41 89 c0 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00
f0 ff ff 77 2b 44 89 c7 48 89 45 f8 e8 27 9a f8 ff 48 8b
[13059.633159] RSP: 002b:00007f32871f90d0 EFLAGS: 00000293 ORIG_RAX:
0000000000000011
[13059.633161] RAX: ffffffffffffffda RBX: 00007f32871f9220 RCX:
00007f32aa721105
[13059.633162] RDX: 000000000000131c RSI: 00007f3283e6fc00 RDI:
00000000000005c9
[13059.633163] RBP: 00007f32871f90f0 R08: 0000000000000000 R09:
00007f32871f9268
[13059.633164] R10: 000000000900f060 R11: 0000000000000293 R12:
000000000900f060
[13059.633165] R13: 000000000000131c R14: 00007f3283e6fc00 R15:
00007f3299010580
[13059.633169]  </TASK>

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.



[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux