Re: Client kernel crashes on cephfs access

> 
> I would like to ask for help regarding client kernel crashes that happen
> on cephfs access. We have been struggling with this for over a month now
> with over 100 crashes on 7 hosts during that time.
> 
> Our cluster runs version 18.2.1. Our clients run CentOS Stream.
> 
> On CentOS Stream 9 the problem started with kernel version
> 5.14.0-425.el9. Version 5.14.0-419.el9 is the last one without problems.
> It also occurred on CentOS Stream 8, starting with version
> 4.18.0-546.el8 (4.18.0-544.el8 being the last good one).
> 
> The problem manifests as the client kernel crashing, forcing a
> reboot of the machine. It appears to be triggered by a certain level of
> IO on the cephfs mount. Everything works perfectly fine when we roll
> back to the last good kernel version.
> 
> The exact call trace in vmcore-dmesg.txt differs between occurrences.
> Here are two typical examples:
> 
> ```
> [ 8641.382499] list_del corruption. next->prev should be
> ffff88bd0a4d4c80, but was ffff88bcefdfd280
> [ 8641.382521] ------------[ cut here ]------------
> [ 8641.382521] kernel BUG at lib/list_debug.c:54!
> [ 8641.382528] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 8641.382591] CPU: 2 PID: 83929 Comm: kworker/2:0 Kdump: loaded Not
> tainted 5.14.0-432.el9.x86_64 #1
> [ 8641.382610] Hardware name: oVirt RHEL/RHEL-AV, BIOS edk2-20230524-
> 4.el9_3 05/24/2023
> [ 8641.382624] Workqueue: ceph-cap ceph_cap_unlink_work [ceph]
> [ 8641.382662] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x47
> [ 8641.382681] Code: c7 c7 78 42 d8 b1 e8 f9 87 fe ff 0f 0b 48 89 fe 48
> c7 c7 08 43 d8 b1 e8 e8 87 fe ff 0f 0b 48 c7 c7 b8 43 d8 b1 e8 da 87 fe
> ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 78 43 d8 b1 e8 c6 87 fe ff 0f 0b
> [ 8641.382711] RSP: 0018:ffff95a000d6be60 EFLAGS: 00010246
> [ 8641.382722] RAX: 0000000000000054 RBX: ffff88bced76dc00 RCX:
> 0000000000000000
> [ 8641.382734] RDX: 0000000000000000 RSI: ffff88c02eea0840 RDI:
> ffff88c02eea0840
> [ 8641.382746] RBP: ffff88bd0a4d4c80 R08: 80000000ffff8434 R09:
> 0000000000ffff10
> [ 8641.382758] R10: 000000000000000f R11: 000000000000000f R12:
> ffff88c02eeb2800
> [ 8641.382779] R13: ffff88bcc4610258 R14: ffff88bcc46101b8 R15:
> ffff88bcc46101c8
> [ 8641.382793] FS:  0000000000000000(0000) GS:ffff88c02ee80000(0000)
> knlGS:0000000000000000
> [ 8641.382809] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8641.382819] CR2: 00007f35cee8a000 CR3: 0000000105708004 CR4:
> 00000000007706e0
> [ 8641.382832] PKRU: 55555554
> [ 8641.382838] Call Trace:
> [ 8641.382844]  <TASK>
> [ 8641.382850]  ? show_trace_log_lvl+0x1c4/0x2df
> [ 8641.382860]  ? show_trace_log_lvl+0x1c4/0x2df
> [ 8641.382870]  ? ceph_cap_unlink_work+0x3f/0x140 [ceph]
> [ 8641.382893]  ? __die_body.cold+0x8/0xd
> [ 8641.382902]  ? die+0x2b/0x50
> [ 8641.382911]  ? do_trap+0xce/0x120
> [ 8641.382919]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 8641.382930]  ? do_error_trap+0x65/0x80
> [ 8641.382938]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 8641.382948]  ? exc_invalid_op+0x4e/0x70
> [ 8641.382958]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 8641.382975]  ? asm_exc_invalid_op+0x16/0x20
> [ 8641.382988]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 8641.382998]  ceph_cap_unlink_work+0x3f/0x140 [ceph]
> [ 8641.383021]  process_one_work+0x1e2/0x3b0
> [ 8641.383032]  ? __pfx_worker_thread+0x10/0x10
> [ 8641.383043]  worker_thread+0x50/0x3a0
> [ 8641.383051]  ? __pfx_worker_thread+0x10/0x10
> [ 8641.383061]  kthread+0xdd/0x100
> [ 8641.383069]  ? __pfx_kthread+0x10/0x10
> [ 8641.383078]  ret_from_fork+0x29/0x50
> [ 8641.383090]  </TASK>
> [ 8641.383095] Modules linked in: tls ceph libceph dns_resolver fscache
> netfs nft_counter ipt_REJECT xt_owner xt_conntrack nft_compat
> nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet
> nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables
> libcrc32c nfnetlink vfat fat intel_rapl_msr intel_rapl_common
> intel_uncore_frequency_common isst_if_common nfit virtio_gpu iTCO_wdt
> iTCO_vendor_support libnvdimm lpc_ich virtio_dma_buf drm_shmem_helper
> drm_kms_helper i2c_i801 rapl syscopyarea sysfillrect sysimgblt
> virtio_balloon fb_sys_fops i2c_smbus pcspkr joydev fuse drm ext4 mbcache
> jbd2 sr_mod cdrom sd_mod ahci t10_pi sg libahci crct10dif_pclmul
> crc32_pclmul crc32c_intel libata ghash_clmulni_intel virtio_net
> virtio_console virtio_scsi net_failover failover serio_raw
> ```
> 
> ```
> [ 3538.365469] list_del corruption. next->prev should be
> ffff8d2b75997c80, but was ffff8d2afcfaae80
> [ 3538.365488] ------------[ cut here ]------------
> [ 3538.365488] kernel BUG at lib/list_debug.c:54!
> [ 3538.365493] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 3538.365553] CPU: 0 PID: 910 Comm: php-fpm Kdump: loaded Not tainted
> 5.14.0-432.el9.x86_64 #1
> [ 3538.365569] Hardware name: oVirt RHEL/RHEL-AV, BIOS edk2-20230524-
> 4.el9_3 05/24/2023
> [ 3538.365582] RIP: 0010:__list_del_entry_valid.cold+0x1d/0x47
> [ 3538.365612] Code: c7 c7 78 42 38 8e e8 f9 87 fe ff 0f 0b 48 89 fe 48
> c7 c7 08 43 38 8e e8 e8 87 fe ff 0f 0b 48 c7 c7 b8 43 38 8e e8 da 87 fe
> ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 78 43 38 8e e8 c6 87 fe ff 0f 0b
> [ 3538.365641] RSP: 0018:ffffae870073fda0 EFLAGS: 00010246
> [ 3538.365652] RAX: 0000000000000054 RBX: ffff8d2b75997800 RCX:
> 0000000000000000
> [ 3538.365668] RDX: 0000000000000000 RSI: ffff8d2e2ee20840 RDI:
> ffff8d2e2ee20840
> [ 3538.365681] RBP: ffff8d2b75997ab8 R08: 80000000ffff842f R09:
> 0000000000ffff10
> [ 3538.365693] R10: 000000000000000f R11: 000000000000000f R12:
> 00000000ffffc032
> [ 3538.365705] R13: ffff8d2b75997c80 R14: ffff8d2ac480b800 R15:
> ffff8d2ac480b9c8
> [ 3538.365717] FS:  00007f9be42097c0(0000) GS:ffff8d2e2ee00000(0000)
> knlGS:0000000000000000
> [ 3538.365733] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3538.365744] CR2: 00007fa97dc5398c CR3: 0000000104248004 CR4:
> 00000000007706f0
> [ 3538.365756] PKRU: 55555554
> [ 3538.365761] Call Trace:
> [ 3538.365768]  <TASK>
> [ 3538.365774]  ? show_trace_log_lvl+0x1c4/0x2df
> [ 3538.365785]  ? show_trace_log_lvl+0x1c4/0x2df
> [ 3538.365796]  ? ceph_drop_caps_for_unlink+0xb8/0x170 [ceph]
> [ 3538.365828]  ? __die_body.cold+0x8/0xd
> [ 3538.365836]  ? die+0x2b/0x50
> [ 3538.365845]  ? do_trap+0xce/0x120
> [ 3538.365853]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 3538.365863]  ? do_error_trap+0x65/0x80
> [ 3538.365871]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 3538.365881]  ? exc_invalid_op+0x4e/0x70
> [ 3538.365891]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 3538.365901]  ? asm_exc_invalid_op+0x16/0x20
> [ 3538.365912]  ? __list_del_entry_valid.cold+0x1d/0x47
> [ 3538.365923]  ceph_drop_caps_for_unlink+0xb8/0x170 [ceph]
> [ 3538.365947]  ceph_unlink+0xed/0x450 [ceph]
> [ 3538.365970]  vfs_unlink+0x114/0x290
> [ 3538.365980]  do_unlinkat+0x1af/0x2e0
> [ 3538.365990]  __x64_sys_unlink+0x3e/0x60
> [ 3538.365999]  do_syscall_64+0x59/0x90
> [ 3538.366008]  ? syscall_exit_to_user_mode+0x22/0x40
> [ 3538.366018]  ? do_syscall_64+0x69/0x90
> [ 3538.366027]  ? do_syscall_64+0x69/0x90
> [ 3538.366035]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 3538.366046] RIP: 0033:0x7f9be40ff27b
> [ 3538.366069] Code: f0 ff ff 73 01 c3 48 8b 0d a2 ab 0f 00 f7 d8 64 89
> 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 57 00 00 00 0f
> 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 ab 0f 00 f7 d8 64 89 01 48
> [ 3538.367031] RSP: 002b:00007ffd8640de58 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000057
> [ 3538.367576] RAX: ffffffffffffffda RBX: 0000000000000008 RCX:
> 00007f9be40ff27b
> [ 3538.368116] RDX: 0000000000000007 RSI: 0000000000000001 RDI:
> 00007f9bdd4af698
> [ 3538.368646] RBP: 00007f9bdd4af698 R08: 00000000ffffffc9 R09:
> 0000000000000038
> [ 3538.369156] R10: 0000000000000000 R11: 0000000000000246 R12:
> 0000000000000000
> [ 3538.369671] R13: 00007f9bdd4af698 R14: 0000000000000001 R15:
> 00007f9be3c15290
> [ 3538.370182]  </TASK>
> [ 3538.370682] Modules linked in: ceph libceph dns_resolver fscache netfs
> nft_counter ipt_REJECT xt_owner xt_conntrack nft_compat nft_fib_inet
> nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4
> nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables libcrc32c nfnetlink
> vfat fat intel_rapl_msr intel_rapl_common intel_uncore_frequency_common
> virtio_gpu virtio_dma_buf drm_shmem_helper isst_if_common drm_kms_helper
> nfit syscopyarea sysfillrect sysimgblt fb_sys_fops libnvdimm i2c_i801
> iTCO_wdt iTCO_vendor_support lpc_ich i2c_smbus virtio_balloon rapl joydev
> pcspkr drm fuse ext4 mbcache jbd2 sr_mod cdrom sg ahci libahci
> crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel
> virtio_net virtio_blk virtio_console net_failover virtio_scsi failover
> serio_raw
> ```
> 
> I checked the changelogs of the kernel versions and spotted these three
> commits that were backported:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dbc347ef7f0c53aa4a5383238a804d7ebbb0b5ca
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=902d6d013f75b68f31d208c6f3ff9cdca82648a7
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=07045648c07c5632e0dfd5ce084d3cd0cec0258a
> 
> The first one adds changes that look related.
> 
> Has anybody experienced this as well, or does anyone know something about this?
> 

I get a guaranteed crash and reboot with el7 and Nautilus when accessing a snapshot.

rbd snap ls vps-xxx -p rbd
rbd map vps-xxx@vps-xxx.bak1 -p rbd

some lvm stuff like this (pvscan --cache; pvs; lvchange -a y VGxxx/LVyyy)

mount -o ro /dev/mapper/VGxxx-LVyyy /mnt/disk <---- C04 CRASH!!!!

I now work around this by creating a clone of the snapshot.
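
For reference, the clone workaround might look roughly like the sketch below. It only prints the rbd commands so they can be reviewed before running them by hand. The clone name vps-xxx-bak1-clone is a made-up placeholder, and the "snap protect" step is an assumption: it is typically needed before cloning unless the cluster uses the newer clone format. The exact steps used on the affected system were not given.

```shell
#!/bin/sh
# Sketch only: print the rbd commands for mapping a clone of a snapshot
# instead of the snapshot itself. Nothing is executed here.
# Arguments: pool, image, snapshot, clone name (the clone name is hypothetical).
snapshot_clone_cmds() {
    pool=$1
    image=$2
    snap=$3
    clone=$4
    # Assumption: the snapshot must be protected before it can be cloned.
    echo "rbd snap protect ${pool}/${image}@${snap}"
    # Create a writable clone backed by the snapshot.
    echo "rbd clone ${pool}/${image}@${snap} ${pool}/${clone}"
    # Map the clone rather than the snapshot.
    echo "rbd map ${pool}/${clone}"
}

snapshot_clone_cmds rbd vps-xxx vps-xxx.bak1 vps-xxx-bak1-clone
```

After mapping the clone, the same lvm and mount steps as above would apply. When done, the clone can be unmapped and removed again (rbd unmap, rbd rm), and the snapshot unprotected if it was protected only for this purpose.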


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


