Re: MDS crashing repeatedly

Hi Thomas,

On Wed, Dec 13, 2023 at 8:46 PM Thomas Widhalm <widhalmt@xxxxxxxxxxxxx> wrote:
>
> Hi,
>
> I have an 18.2.0 Ceph cluster and my MDSs are now crashing repeatedly.
> After a few automatic restarts, every MDS is removed and only one stays
> active. But it's flagged "laggy" and I can't even start a scrub on it.
>
> In the log I have this during crashes (every line carries the journald
> prefix "Dec 13 15:54:02 ceph04
> ceph-ff6e50de-ed72-11ec-881c-dca6325c2cc4-mds-mds01-ceph04-krxszj[33486]:",
> shown once here and trimmed from the rest for readability):
>
> 2023-12-13T14:54:02.721+0000 7f15ea108700 -1
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.0/rpm/el8/BUILD/ceph-18.2.0/src/mds/MDCache.cc:
> In function 'void MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*,
> CDentry*, snapid_t, CInode**, CDentry::linkage_t*)' thread 7f15ea108700
> time 2023-12-13T14:54:02.720383+0000
> /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.2.0/rpm/el8/BUILD/ceph-18.2.0/src/mds/MDCache.cc:
> 1638: FAILED ceph_assert(follows >= realm->get_newest_seq())
>
>  ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f15f5ef9dbb]
>  2: /usr/lib64/ceph/libceph-common.so.2(+0x2a8f81) [0x7f15f5ef9f81]
>  3: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0xae2) [0x55727bc0c672]
>  4: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc5) [0x55727bc0d0d5]
>  5: (Locker::scatter_writebehind(ScatterLock*)+0x5f6) [0x55727bce40f6]
>  6: (Locker::simple_sync(SimpleLock*, bool*)+0x388) [0x55727bceb908]
>  7: (Locker::scatter_nudge(ScatterLock*, MDSContext*, bool)+0x30d) [0x55727bcef25d]
>  8: (Locker::scatter_tick()+0x1e7) [0x55727bd0bc37]
>  9: (Locker::tick()+0xd) [0x55727bd0c0ed]
>  10: (MDSRankDispatcher::tick()+0x1ef) [0x55727bb08e9f]
>  11: (Context::complete(int)+0xd) [0x55727bade2cd]
>  12: (CommonSafeTimer<ceph::fair_mutex>::timer_thread()+0x16d) [0x7f15f5fea1cd]
>  13: (CommonSafeTimerThread<ceph::fair_mutex>::entry()+0x11) [0x7f15f5feb2a1]
>  14: /lib64/libpthread.so.0(+0x81ca) [0x7f15f4ca11ca]
>  15: clone()
>
> *** Caught signal (Aborted) **
>  in thread 7f15ea108700 thread_name:safe_timer
>  ceph version 18.2.0 (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef (stable)
>  1: /lib64/libpthread.so.0(+0x12cf0) [0x7f15f4cabcf0]
>  2: gsignal()
>  3: abort()
>  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x7f15f5ef9e15]
>  5: /usr/lib64/ceph/libceph-common.so.2(+0x2a8f81) [0x7f15f5ef9f81]
>  6: (MDCache::journal_cow_dentry(MutationImpl*, EMetaBlob*, CDentry*, snapid_t, CInode**, CDentry::linkage_t*)+0xae2) [0x55727bc0c672]
>  7: (MDCache::journal_dirty_inode(MutationImpl*, EMetaBlob*, CInode*, snapid_t)+0xc5) [0x55727bc0d0d5]
>  8: (Locker::scatter_writebehind(ScatterLock*)+0x5f6) [0x55727bce40f6]
>  9: (Locker::simple_sync(SimpleLock*, bool*)+0x388) [0x55727bceb908]
>  10: (Locker::scatter_nudge(ScatterLock*, MDSContext*, bool)+0x30d) [0x55727bcef25d]
>  11: (Locker::scatter_tick()+0x1e7) [0x55727bd0bc37]
>  12: (Locker::tick()+0xd) [0x55727bd0c0ed]
>
>
> I tried the following to get it back:
>
> ceph fs fail cephfs
> cephfs-data-scan cleanup --filesystem cephfs cephfs_data
> cephfs-journal-tool --rank cephfs:0 event recover_dentries list
> cephfs-table-tool cephfs:all reset session
> cephfs-journal-tool --rank cephfs:0 journal reset
> cephfs-data-scan scan_extents --worker_n 0 --worker_m 4 --filesystem cephfs cephfs_data
> cephfs-data-scan scan_inodes --worker_n 0 --worker_m 4 --filesystem cephfs cephfs_data
> cephfs-data-scan scan_links --filesystem cephfs
> ceph mds repaired 0
> ceph fs set cephfs joinable true
>
> (with the data scan commands of course running on several systems
> simultaneously)
>
> Unfortunately it didn't help at all.

The above steps won't help, since this MDS assert failure is a bug. Do
you have a coredump you can share?
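In case it helps with capturing that: on hosts where systemd-coredump handles crashes (an assumption on my part; cephadm container hosts may be configured differently), the core can usually be located and exported with coredumpctl. A minimal sketch, using the MDS PID 33486 from your log as a placeholder:

```shell
# Assumes systemd-coredump is the core handler on the MDS host (not verified
# for this cluster). Replace 33486 with the PID of the crashed MDS.
coredumpctl list ceph-mds           # list recorded crashes for the MDS binary
coredumpctl info 33486              # show metadata, incl. the core file path
coredumpctl dump 33486 -o mds.core  # export the core file for sharing
```

If coredumpctl shows nothing, the container's crash handling or a kernel.core_pattern override may be routing cores elsewhere.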
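One side note on the quoted scan commands: as I understand the sharding of cephfs-data-scan (worth verifying against the docs for your release), each of the --worker_m instances needs a distinct --worker_n, so running --worker_n 0 on every host would repeat the same shard rather than split the work. A sketch that only prints the per-host command lines, nothing is executed against the cluster:

```shell
# Illustrative only: with --worker_m 4, the four workers should use distinct
# --worker_n values (0..3), typically one per host. This loop just prints
# the command each host would run.
for n in 0 1 2 3; do
  echo "cephfs-data-scan scan_extents --worker_n $n --worker_m 4 --filesystem cephfs cephfs_data"
done
```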

>
> I'm quite sure that my undersized hardware is the root cause, because
> problems with metadata already occurred in the past (with 17.x), and
> always during times of higher load (e.g. taking Proxmox backups
> while deleting CephFS snapshots). I now have a strategy to lower the
> load and update the hardware. But still - I need my data back.
>
> Any ideas?
>
> Cheers,
> Thomas
> --
> http://www.widhalm.or.at
> GnuPG : 6265BAE6 , A84CB603
> Threema: H7AV7D33
> Telegram, Signal: widhalmt@xxxxxxxxxxxxx



-- 
Cheers,
Venky
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



