Re: Mds daemon damaged - assert failed

> On Sep 25, 2024, at 1:05 AM, Eugen Block <eblock@xxxxxx> wrote:
> 
> Great that you got your filesystem back.
> 
>> cephfs-journal-tool journal export
>> cephfs-journal-tool event recover_dentries summary
>> 
>> Both failed
> 
> Your export command seems to be missing the output file, or was it not the exact command?

Yes, I didn’t include the output file in my snippet, sorry for the confusion. But the command did in fact complain that the journal was corrupted.
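For the archives, the full export invocation was along these lines (the rank and output filename below are placeholders, not the exact values I used):

```shell
# Back up the MDS journal for rank 0 before attempting any destructive
# recovery steps. <fs_name> and backup.bin are placeholders.
cephfs-journal-tool --rank=<fs_name>:0 journal export backup.bin
```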

> 
>> Also, I understand that the metadata itself is sitting on the disk, but it looks like a single point of failure.  What’s the logic behind having a simple metadata location, but multiple mds servers?
> 
> I think there's a misunderstanding, the metadata is in the cephfs metadata pool, not on the local disk of your machine.
> 

By “disk” I meant permanent storage in general, i.e. Ceph, so yes, our understanding matches. But the question remains: why did that assert trigger? Was it a software issue (a bug?) that corrupted the journal, or did something else corrupt the journal and cause the MDS to hit the assertion? Basically, I’m trying to pin down a possible root cause.
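In case it helps anyone hitting the same assert: one way to narrow down whether the journal header itself is inconsistent (which would explain trim_to not being ahead of trimming_pos) is to inspect the journal and dump its header. The fs name below is a placeholder:

```shell
# Check journal integrity, then dump the header fields (trimmed/expire/write
# positions). An expire position that disagrees with the readable extent of
# the journal is the kind of inconsistency that can trip
# ceph_assert(trim_to > trimming_pos). <fs_name> is a placeholder.
cephfs-journal-tool --rank=<fs_name>:0 journal inspect
cephfs-journal-tool --rank=<fs_name>:0 header get
```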

Thank you!

George
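P.S. For the archives, the truncation and repair steps I mentioned were roughly the following (fs name is a placeholder; the journal reset is destructive and discards any unflushed metadata events, so export a backup of the journal first):

```shell
# Destructive: wipe the corrupted journal, then clear the damaged flag so a
# standby MDS can take over the rank. <fs_name> is a placeholder.
cephfs-journal-tool --rank=<fs_name>:0 journal reset
ceph mds repaired <fs_name>:0

# Once the MDS is active again, run an online scrub with repair:
ceph tell mds.<fs_name>:0 scrub start / recursive,repair
```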


> 
> Zitat von "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:
> 
>> I managed to recover my filesystem.
>> 
>> cephfs-journal-tool journal export
>> cephfs-journal-tool event recover_dentries summary
>> 
>> Both failed
>> 
>> But truncating the journal and following some of the instructions in https://people.redhat.com/bhubbard/nature/default/cephfs/disaster-recovery-experts/ helped me to get the mds up.
>> 
>> Then I scrubbed and repaired the filesystem, and I “believe” I’m back in business.
>> 
>> What is weird though is that an assert failed as shown in the stack dump below.  Was that a legitimate assertion that indicates a bigger issue, or was it a false assertion?
>> 
>> Also, I understand that the metadata itself is sitting on the disk, but it looks like a single point of failure.  What’s the logic behind having a simple metadata location, but multiple mds servers?
>> 
>> Thanks!
>> 
>> George
>> 
>> 
>> On Sep 24, 2024, at 5:55 AM, Eugen Block <eblock@xxxxxx> wrote:
>> 
>> Hi,
>> 
>> I would probably start by inspecting the journal with the cephfs-journal-tool [0]:
>> 
>> cephfs-journal-tool [--rank=<fs_name>:{mds-rank|all}] journal inspect
>> 
>> And it could be helpful to have the logs from just before the assert.
>> 
>> [0] https://docs.ceph.com/en/latest/cephfs/cephfs-journal-tool/#example-journal-inspect
>> 
>> Zitat von "Kyriazis, George" <george.kyriazis@xxxxxxxxx>:
>> 
>> Hello ceph users,
>> 
>> I am in the unfortunate situation of having a status of “1 mds daemon damaged”.  Looking at the logs, I see that the daemon died with an assert as follows:
>> 
>> ./src/osdc/Journaler.cc: 1368: FAILED ceph_assert(trim_to > trimming_pos)
>> 
>> ceph version 18.2.2 (e9fe820e7fffd1b7cde143a9f77653b73fcec748) reef (stable)
>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12a) [0x73a83189d7d9]
>> 2: /usr/lib/ceph/libceph-common.so.2(+0x29d974) [0x73a83189d974]
>> 3: (Journaler::_trim()+0x671) [0x57235caa70b1]
>> 4: (Journaler::_finish_write_head(int, Journaler::Header&, C_OnFinisher*)+0x171) [0x57235caaa8f1]
>> 5: (Context::complete(int)+0x9) [0x57235c716849]
>> 6: (Finisher::finisher_thread_entry()+0x16d) [0x73a83194659d]
>> 7: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x73a8310a8134]
>> 8: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x73a8311287dc]
>> 
>>    0> 2024-09-23T14:10:26.490-0500 73a822c006c0 -1 *** Caught signal (Aborted) **
>> in thread 73a822c006c0 thread_name:MR_Finisher
>> 
>> ceph version 18.2.2 (e9fe820e7fffd1b7cde143a9f77653b73fcec748) reef (stable)
>> 1: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x73a83105b050]
>> 2: /lib/x86_64-linux-gnu/libc.so.6(+0x8ae2c) [0x73a8310a9e2c]
>> 3: gsignal()
>> 4: abort()
>> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x185) [0x73a83189d834]
>> 6: /usr/lib/ceph/libceph-common.so.2(+0x29d974) [0x73a83189d974]
>> 7: (Journaler::_trim()+0x671) [0x57235caa70b1]
>> 8: (Journaler::_finish_write_head(int, Journaler::Header&, C_OnFinisher*)+0x171) [0x57235caaa8f1]
>> 9: (Context::complete(int)+0x9) [0x57235c716849]
>> 10: (Finisher::finisher_thread_entry()+0x16d) [0x73a83194659d]
>> 11: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x73a8310a8134]
>> 12: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x73a8311287dc]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>> 
>> 
>> As listed above, I am running 18.2.2 on a Proxmox cluster with a hybrid hdd/ssd setup and two cephfs filesystems. The mds responsible for the hdd filesystem is the one that died.
>> 
>> Output of ceph -s follows:
>> 
>> root@vis-mgmt:~/bin# ceph -s
>> cluster:
>>   id:     ec2c9542-dc1b-4af6-9f21-0adbcabb9452
>>   health: HEALTH_ERR
>>           1 filesystem is degraded
>>           1 filesystem is offline
>>           1 mds daemon damaged
>>           5 pgs not scrubbed in time
>>           1 daemons have recently crashed
>>   services:
>>   mon: 5 daemons, quorum vis-hsw-01,vis-skx-01,vis-clx-15,vis-clx-04,vis-icx-00 (age 6m)
>>   mgr: vis-hsw-02(active, since 13d), standbys: vis-skx-02, vis-hsw-04, vis-clx-08, vis-clx-02
>>   mds: 1/2 daemons up, 5 standby
>>   osd: 97 osds: 97 up (since 3h), 97 in (since 4d)
>>   data:
>>   volumes: 1/2 healthy, 1 recovering; 1 damaged
>>   pools:   14 pools, 1961 pgs
>>   objects: 223.70M objects, 304 TiB
>>   usage:   805 TiB used, 383 TiB / 1.2 PiB avail
>>   pgs:     1948 active+clean
>>            9    active+clean+scrubbing+deep
>>            4    active+clean+scrubbing
>>   io:
>>   client:   86 KiB/s rd, 5.5 MiB/s wr, 64 op/s rd, 26 op/s wr
>> 
>> 
>> 
>> I tried restarting all the mds daemons, but they all came up as “standby”. I also tried restarting all the mons and then the mds daemons again, but that didn’t help.
>> 
>> Much help is appreciated!
>> 
>> Thank you!
>> 
>> George
>> 
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx