Hi Marc,
Thanks for reporting this. I have prepared a patch to fix it and will
send it out once testing is done.
- Xiubo
On 4/8/24 16:01, Marc Ruhmann wrote:
Hi everyone,
I would like to ask for help regarding client kernel crashes that happen
on cephfs access. We have been struggling with this for over a month now
with over 100 crashes on 7 hosts during that time.
Our cluster runs version 18.2.1. Our clients run CentOS Stream.
On CentOS Stream 9 the problem started with kernel version
5.14.0-425.el9. Version 5.14.0-419.el9 is the last one without problems.
It also occurred on CentOS Stream 8, starting with version
4.18.0-546.el8 (4.18.0-544.el8 being the last good one).
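To flag hosts that are running one of the affected builds, the version boundaries above can be turned into a small check. This is only a sketch: the helper name `is_suspect_kernel` and the table `FIRST_BAD` are made up for illustration, the build numbers are the ones from this report, and since no fixed build is known yet there is no upper bound.

```python
import platform
import re

# First affected build per (base version, dist), taken from the report above.
# Assumption: every later build is affected too, since no fix has shipped yet.
FIRST_BAD = {
    ("5.14.0", "el9"): 425,
    ("4.18.0", "el8"): 546,
}

# Matches release strings such as "5.14.0-432.el9.x86_64".
_RELEASE_RE = re.compile(r"^(\d+\.\d+\.\d+)-(\d+)(?:\.\d+)*\.(el\d+)")


def is_suspect_kernel(release: str) -> bool:
    """Return True if `release` (as printed by `uname -r`) is at or past the first bad build."""
    m = _RELEASE_RE.match(release)
    if m is None:
        return False  # not a CentOS Stream style release string
    base, build, dist = m.group(1), int(m.group(2)), m.group(3)
    first_bad = FIRST_BAD.get((base, dist))
    return first_bad is not None and build >= first_bad


if __name__ == "__main__":
    release = platform.release()
    print(release, "suspect" if is_suspect_kernel(release) else "not in the reported range")
```

Builds between the last good and the first bad one (for example 420 to 424 on el9) were not tested here, so the check treats them as not affected.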
The problem manifests as a client kernel crash that forces a reboot of
the machine. It appears to be triggered by a certain level of IO on the
cephfs mount. Everything works fine when we roll back to the last good
kernel version.
The exact call trace in vmcore-dmesg.txt differs between occurrences.
Here are two typical examples:
```
[ 8641.382499] list_del corruption. next->prev should be
ffff88bd0a4d4c80, but was ffff88bcefdfd280
[ 8641.382521] ------------[ cut here ]------------
[ 8641.382521] kernel BUG at lib/list_debug.c:54!
[ 8641.382528] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 8641.382591] CPU: 2 PID: 83929 Comm: kworker/2:0 Kdump: loaded Not
tainted 5.14.0-432.el9.x86_64 #1
[ 8641.382610] Hardware name: oVirt RHEL/RHEL-AV, BIOS
edk2-20230524-4.el9_3 05/24/2023
[ 8641.382624] Workqueue: ceph-cap ceph_cap_unlink_work [ceph]
[ 8641.382662] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x47
[ 8641.382681] Code: c7 c7 78 42 d8 b1 e8 f9 87 fe ff 0f 0b 48 89 fe
48 c7 c7 08 43 d8 b1 e8 e8 87 fe ff 0f 0b 48 c7 c7 b8 43 d8 b1 e8 da
87 fe ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 78 43 d8 b1 e8 c6 87 fe ff
0f 0b
[ 8641.382711] RSP: 0018:ffff95a000d6be60 EFLAGS: 00010246
[ 8641.382722] RAX: 0000000000000054 RBX: ffff88bced76dc00 RCX:
0000000000000000
[ 8641.382734] RDX: 0000000000000000 RSI: ffff88c02eea0840 RDI:
ffff88c02eea0840
[ 8641.382746] RBP: ffff88bd0a4d4c80 R08: 80000000ffff8434 R09:
0000000000ffff10
[ 8641.382758] R10: 000000000000000f R11: 000000000000000f R12:
ffff88c02eeb2800
[ 8641.382779] R13: ffff88bcc4610258 R14: ffff88bcc46101b8 R15:
ffff88bcc46101c8
[ 8641.382793] FS: 0000000000000000(0000) GS:ffff88c02ee80000(0000)
knlGS:0000000000000000
[ 8641.382809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8641.382819] CR2: 00007f35cee8a000 CR3: 0000000105708004 CR4:
00000000007706e0
[ 8641.382832] PKRU: 55555554
[ 8641.382838] Call Trace:
[ 8641.382844] <TASK>
[ 8641.382850] ? show_trace_log_lvl+0x1c4/0x2df
[ 8641.382860] ? show_trace_log_lvl+0x1c4/0x2df
[ 8641.382870] ? ceph_cap_unlink_work+0x3f/0x140 [ceph]
[ 8641.382893] ? __die_body.cold+0x8/0xd
[ 8641.382902] ? die+0x2b/0x50
[ 8641.382911] ? do_trap+0xce/0x120
[ 8641.382919] ? __list_del_entry_valid.cold+0x1d/0x47
[ 8641.382930] ? do_error_trap+0x65/0x80
[ 8641.382938] ? __list_del_entry_valid.cold+0x1d/0x47
[ 8641.382948] ? exc_invalid_op+0x4e/0x70
[ 8641.382958] ? __list_del_entry_valid.cold+0x1d/0x47
[ 8641.382975] ? asm_exc_invalid_op+0x16/0x20
[ 8641.382988] ? __list_del_entry_valid.cold+0x1d/0x47
[ 8641.382998] ceph_cap_unlink_work+0x3f/0x140 [ceph]
[ 8641.383021] process_one_work+0x1e2/0x3b0
[ 8641.383032] ? __pfx_worker_thread+0x10/0x10
[ 8641.383043] worker_thread+0x50/0x3a0
[ 8641.383051] ? __pfx_worker_thread+0x10/0x10
[ 8641.383061] kthread+0xdd/0x100
[ 8641.383069] ? __pfx_kthread+0x10/0x10
[ 8641.383078] ret_from_fork+0x29/0x50
[ 8641.383090] </TASK>
[ 8641.383095] Modules linked in: tls ceph libceph dns_resolver
fscache netfs nft_counter ipt_REJECT xt_owner xt_conntrack nft_compat
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables
libcrc32c nfnetlink vfat fat intel_rapl_msr intel_rapl_common
intel_uncore_frequency_common isst_if_common nfit virtio_gpu iTCO_wdt
iTCO_vendor_support libnvdimm lpc_ich virtio_dma_buf drm_shmem_helper
drm_kms_helper i2c_i801 rapl syscopyarea sysfillrect sysimgblt
virtio_balloon fb_sys_fops i2c_smbus pcspkr joydev fuse drm ext4
mbcache jbd2 sr_mod cdrom sd_mod ahci t10_pi sg libahci
crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel
virtio_net virtio_console virtio_scsi net_failover failover serio_raw
```
```
[ 3538.365469] list_del corruption. next->prev should be
ffff8d2b75997c80, but was ffff8d2afcfaae80
[ 3538.365488] ------------[ cut here ]------------
[ 3538.365488] kernel BUG at lib/list_debug.c:54!
[ 3538.365493] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 3538.365553] CPU: 0 PID: 910 Comm: php-fpm Kdump: loaded Not tainted
5.14.0-432.el9.x86_64 #1
[ 3538.365569] Hardware name: oVirt RHEL/RHEL-AV, BIOS
edk2-20230524-4.el9_3 05/24/2023
[ 3538.365582] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x47
[ 3538.365612] Code: c7 c7 78 42 38 8e e8 f9 87 fe ff 0f 0b 48 89 fe
48 c7 c7 08 43 38 8e e8 e8 87 fe ff 0f 0b 48 c7 c7 b8 43 38 8e e8 da
87 fe ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 78 43 38 8e e8 c6 87 fe ff
0f 0b
[ 3538.365641] RSP: 0018:ffffae870073fda0 EFLAGS: 00010246
[ 3538.365652] RAX: 0000000000000054 RBX: ffff8d2b75997800 RCX:
0000000000000000
[ 3538.365668] RDX: 0000000000000000 RSI: ffff8d2e2ee20840 RDI:
ffff8d2e2ee20840
[ 3538.365681] RBP: ffff8d2b75997ab8 R08: 80000000ffff842f R09:
0000000000ffff10
[ 3538.365693] R10: 000000000000000f R11: 000000000000000f R12:
00000000ffffc032
[ 3538.365705] R13: ffff8d2b75997c80 R14: ffff8d2ac480b800 R15:
ffff8d2ac480b9c8
[ 3538.365717] FS: 00007f9be42097c0(0000) GS:ffff8d2e2ee00000(0000)
knlGS:0000000000000000
[ 3538.365733] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3538.365744] CR2: 00007fa97dc5398c CR3: 0000000104248004 CR4:
00000000007706f0
[ 3538.365756] PKRU: 55555554
[ 3538.365761] Call Trace:
[ 3538.365768] <TASK>
[ 3538.365774] ? show_trace_log_lvl+0x1c4/0x2df
[ 3538.365785] ? show_trace_log_lvl+0x1c4/0x2df
[ 3538.365796] ? ceph_drop_caps_for_unlink+0xb8/0x170 [ceph]
[ 3538.365828] ? __die_body.cold+0x8/0xd
[ 3538.365836] ? die+0x2b/0x50
[ 3538.365845] ? do_trap+0xce/0x120
[ 3538.365853] ? __list_del_entry_valid.cold+0x1d/0x47
[ 3538.365863] ? do_error_trap+0x65/0x80
[ 3538.365871] ? __list_del_entry_valid.cold+0x1d/0x47
[ 3538.365881] ? exc_invalid_op+0x4e/0x70
[ 3538.365891] ? __list_del_entry_valid.cold+0x1d/0x47
[ 3538.365901] ? asm_exc_invalid_op+0x16/0x20
[ 3538.365912] ? __list_del_entry_valid.cold+0x1d/0x47
[ 3538.365923] ceph_drop_caps_for_unlink+0xb8/0x170 [ceph]
[ 3538.365947] ceph_unlink+0xed/0x450 [ceph]
[ 3538.365970] vfs_unlink+0x114/0x290
[ 3538.365980] do_unlinkat+0x1af/0x2e0
[ 3538.365990] __x64_sys_unlink+0x3e/0x60
[ 3538.365999] do_syscall_64+0x59/0x90
[ 3538.366008] ? syscall_exit_to_user_mode+0x22/0x40
[ 3538.366018] ? do_syscall_64+0x69/0x90
[ 3538.366027] ? do_syscall_64+0x69/0x90
[ 3538.366035] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 3538.366046] RIP: 0033:0x7f9be40ff27b
[ 3538.366069] Code: f0 ff ff 73 01 c3 48 8b 0d a2 ab 0f 00 f7 d8 64
89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 ab 0f 00 f7 d8 64 89
01 48
[ 3538.367031] RSP: 002b:00007ffd8640de58 EFLAGS: 00000246 ORIG_RAX:
0000000000000057
[ 3538.367576] RAX: ffffffffffffffda RBX: 0000000000000008 RCX:
00007f9be40ff27b
[ 3538.368116] RDX: 0000000000000007 RSI: 0000000000000001 RDI:
00007f9bdd4af698
[ 3538.368646] RBP: 00007f9bdd4af698 R08: 00000000ffffffc9 R09:
0000000000000038
[ 3538.369156] R10: 0000000000000000 R11: 0000000000000246 R12:
0000000000000000
[ 3538.369671] R13: 00007f9bdd4af698 R14: 0000000000000001 R15:
00007f9be3c15290
[ 3538.370182] </TASK>
[ 3538.370682] Modules linked in: ceph libceph dns_resolver fscache
netfs nft_counter ipt_REJECT xt_owner xt_conntrack nft_compat
nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables
libcrc32c nfnetlink vfat fat intel_rapl_msr intel_rapl_common
intel_uncore_frequency_common virtio_gpu virtio_dma_buf
drm_shmem_helper isst_if_common drm_kms_helper nfit syscopyarea
sysfillrect sysimgblt fb_sys_fops libnvdimm i2c_i801 iTCO_wdt
iTCO_vendor_support lpc_ich i2c_smbus virtio_balloon rapl joydev
pcspkr drm fuse ext4 mbcache jbd2 sr_mod cdrom sg ahci libahci
crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel
virtio_net virtio_blk virtio_console net_failover virtio_scsi failover
serio_raw
```
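Since the call trace differs between occurrences, it may help to reduce each vmcore-dmesg.txt to a short signature and group the crashes by it. The following is a rough sketch, not an existing tool: `crash_signature` is a made-up helper, and it assumes the file has one log entry per line (the traces above are wrapped by the mail client). It returns the `kernel BUG at` location and the first frame in the ceph module that is not marked with `?`.

```python
import re
from typing import Optional, Tuple

# A reliable frame looks like "[ 8641.382998] ceph_cap_unlink_work+0x3f/0x140 [ceph]".
# Frames printed with a leading "? " are stale stack entries and do not match.
_FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?\s*$"
)
_BUG_RE = re.compile(r"kernel BUG at (?P<where>\S+?)!")


def crash_signature(dmesg_text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (BUG location, first reliable ceph frame) for one dmesg dump."""
    bug = None
    frame = None
    in_trace = False
    for line in dmesg_text.splitlines():
        if bug is None:
            m = _BUG_RE.search(line)
            if m:
                bug = m.group("where")
        if "Call Trace:" in line:
            in_trace = True
            continue
        if in_trace and frame is None:
            m = _FRAME_RE.match(line.strip())
            if m and m.group("mod") == "ceph":
                frame = m.group("sym")
    return bug, frame
```

Feeding the signatures of all dumps into a `collections.Counter` would show how the crashes split between the workqueue path and the unlink syscall path seen in the two examples.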
I checked the changelogs of the kernel versions and spotted these three
commits that were backported:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dbc347ef7f0c53aa4a5383238a804d7ebbb0b5ca
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=902d6d013f75b68f31d208c6f3ff9cdca82648a7
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=07045648c07c5632e0dfd5ce084d3cd0cec0258a
The first one adds changes that look related.
Has anybody experienced this as well, or does anyone know something about it?
Thanks and best regards,
Marc
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx