MDS crashing repeatedly

Hi,

I have an 18.2.0 Ceph cluster and my MDS daemons are now crashing repeatedly. After a few automatic restarts, each MDS gets removed and only one stays active. But that one is flagged "laggy" and I can't even start a scrub on it.
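Just to be clear about what I mean by "scrub", this is roughly what I run (a sketch; the filesystem is named cephfs and rank 0 is the only rank):

# check filesystem / MDS state
ceph fs status cephfs
ceph health detail

# try to start a scrub on the active rank; this is the part that does not work while the MDS is flagged laggy
ceph tell mds.cephfs:0 scrub start / recursive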

In the log I have this during crashes:

Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 2023-12-13T14:54:02.721+0000 7f15ea108700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.0/rpm/el8/BUILD/ceph-18.2.0/src/mds/MDCache.cc: In function 'void MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)' thread 7f15ea108700 time 2023-12-13T14:54:02.720383+0000
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.0/rpm/el8/BUILD/ceph-18.2.0/src/mds/MDCache.cc: 1638: FAILED ceph_assert(follows >= realm->get_newest_seq())
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]:
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f15f5ef9dbb]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a8f81) [0x7f15f5ef9f81]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 3: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0xae2) [0x55727bc0c672]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 4: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc5) [0x55727bc0d0d5]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 5: (Locker::scatter_writebehind(ScatterLock*)+0x5f6) [0x55727bce40f6]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 6: (Locker::simple_sync(SimpleLock*, bool*)+0x388) [0x55727bceb908]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 7: (Locker::scatter_nudge(ScatterLock*, MDSContext*, bool)+0x30d) [0x55727bcef25d]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 8: (Locker::scatter_tick()+0x1e7) [0x55727bd0bc37]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 9: (Locker::tick()+0xd) [0x55727bd0c0ed]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 10: (MDSRankDispatcher::tick()+0x1ef) [0x55727bb08e9f]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 11: (Context::complete(int)+0xd) [0x55727bade2cd]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 12: (CommonSafeTimer<ceph::fair_mutex>::timer_thread()+0x16d) [0x7f15f5fea1cd]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 13: (CommonSafeTimerThread<ceph::fair_mutex>::entry()+0x11) [0x7f15f5feb2a1]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 14: /lib64/libpthread.so.0(+0x81ca) [0x7f15f4ca11ca]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 15: clone()
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]:
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: *** Caught signal (Aborted) **
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: in thread 7f15ea108700 thread_name:safe_timer
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 1: /lib64/libpthread.so.0(+0x12cf0) [0x7f15f4cabcf0]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 2: gsignal()
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 3: abort()
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f15f5ef9e15]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 5: /usr/lib64/ceph/libceph-common.so.2(+0x2a8f81) [0x7f15f5ef9f81]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 6: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0xae2) [0x55727bc0c672]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 7: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc5) [0x55727bc0d0d5]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 8: (Locker::scatter_writebehind(ScatterLock*)+0x5f6) [0x55727bce40f6]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 9: (Locker::simple_sync(SimpleLock*, bool*)+0x388) [0x55727bceb908]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 10: (Locker::scatter_nudge(ScatterLock*, MDSContext*, bool)+0x30d) [0x55727bcef25d]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 11: (Locker::scatter_tick()+0x1e7) [0x55727bd0bc37]
Dec 13 15:54:02 ceph04 ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]: 12: (Locker::tick()+0xd) [0x55727bd0c0ed]


I tried the following to get it back:

ceph fs fail cephfs
cephfs-data-scan cleanup --filesystem cephfs cephfs_data
cephfs-journal-tool --rank cephfs:0 event recover_dentries list
cephfs-table-tool cephfs:all reset session
cephfs-journal-tool --rank cephfs:0 journal reset
cephfs-data-scan scan_extents --worker_n 0 --worker_m 4 --filesystem cephfs cephfs_data
cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 --filesystem cephfs cephfs_data
cephfs-data-scan scan_links --filesystem cephfs
ceph mds repaired 0
ceph fs set cephfs joinable true

(with the data scan commands, of course, running on several systems simultaneously; see the sketch below for how the workers were split)
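For the record, the worker split looked roughly like this (a sketch; four worker shards spread across the hosts, matching the --worker_m 4 above):

# one worker shard per host, with --worker_n 0..3; all scan_extents workers
# finished before any host started scan_inodes
cephfs-data-scan scan_extents --worker_n 0 --worker_m 4 --filesystem cephfs cephfs_data
cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 --filesystem cephfs cephfs_data

# scan_links was run once, on a single host
cephfs-data-scan scan_links --filesystem cephfs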

Unfortunately it didn't help at all.

I'm quite sure that my undersized hardware is the root cause, because metadata problems have already occurred in the past (with 17.x), always during times of higher load (e.g. taking Proxmox backups while deleting CephFS snapshots). I now have a strategy to lower the load and to upgrade the hardware. But still - I need my data back.

Any ideas?

Cheers,
Thomas
--
http://www.widhalm.or.at
GnuPG : 6265BAE6 , A84CB603
Threema: H7AV7D33
Telegram, Signal: widhalmt@xxxxxxxxxxxxx


