Hi,

[2] is the fix for [1] and it should be backported, shouldn't it? Currently the backport fields are not filled in, so no one knows that backports are needed.

k

> On 27 Sep 2024, at 11:01, Frédéric Nass <frederic.nass@xxxxxxxxxxxxxxxx> wrote:
>
> Hi George,
>
> Looks like you hit this one [1]. I can't find the fix [2] in the Reef release notes [3]. You'll have to cherry-pick it and build from source, or wait for it to land in the next build.
>
> Regards,
> Frédéric.
>
> [1] https://tracker.ceph.com/issues/58878
> [2] https://github.com/ceph/ceph/pull/55265
> [3] https://docs.ceph.com/en/latest/releases/reef/#v18-2-4-reef
>
> ----- On 24 Sep 24, at 0:32, Kyriazis, George george.kyriazis@xxxxxxxxx wrote:
>
>> Hello ceph users,
>>
>> I am in the unfortunate situation of having a status of “1 mds daemon damaged”.
>> Looking at the logs, I see that the daemon died with an assert as follows:
>>
>> ./src/osdc/Journaler.cc: 1368: FAILED ceph_assert(trim_to > trimming_pos)
>>
>> ceph version 18.2.2 (e9fe820e7fffd1b7cde143a9f77653b73fcec748) reef (stable)
>> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12a) [0x73a83189d7d9]
>> 2: /usr/lib/ceph/libceph-common.so.2(+0x29d974) [0x73a83189d974]
>> 3: (Journaler::_trim()+0x671) [0x57235caa70b1]
>> 4: (Journaler::_finish_write_head(int, Journaler::Header&, C_OnFinisher*)+0x171) [0x57235caaa8f1]
>> 5: (Context::complete(int)+0x9) [0x57235c716849]
>> 6: (Finisher::finisher_thread_entry()+0x16d) [0x73a83194659d]
>> 7: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x73a8310a8134]
>> 8: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x73a8311287dc]
>>
>> 0> 2024-09-23T14:10:26.490-0500 73a822c006c0 -1 *** Caught signal (Aborted) **
>> in thread 73a822c006c0 thread_name:MR_Finisher
>>
>> ceph version 18.2.2 (e9fe820e7fffd1b7cde143a9f77653b73fcec748) reef (stable)
>> 1: /lib/x86_64-linux-gnu/libc.so.6(+0x3c050) [0x73a83105b050]
>> 2: /lib/x86_64-linux-gnu/libc.so.6(+0x8ae2c) [0x73a8310a9e2c]
>> 3: gsignal()
>> 4: abort()
>> 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x185) [0x73a83189d834]
>> 6: /usr/lib/ceph/libceph-common.so.2(+0x29d974) [0x73a83189d974]
>> 7: (Journaler::_trim()+0x671) [0x57235caa70b1]
>> 8: (Journaler::_finish_write_head(int, Journaler::Header&, C_OnFinisher*)+0x171) [0x57235caaa8f1]
>> 9: (Context::complete(int)+0x9) [0x57235c716849]
>> 10: (Finisher::finisher_thread_entry()+0x16d) [0x73a83194659d]
>> 11: /lib/x86_64-linux-gnu/libc.so.6(+0x89134) [0x73a8310a8134]
>> 12: /lib/x86_64-linux-gnu/libc.so.6(+0x1097dc) [0x73a8311287dc]
>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> As listed above, I am running 18.2.2 on a Proxmox cluster with a hybrid HDD/SSD setup and 2 CephFS filesystems. The MDS responsible for the HDD filesystem is the one that died.
>>
>> Output of ceph -s follows:
>>
>> root@vis-mgmt:~/bin# ceph -s
>>   cluster:
>>     id:     ec2c9542-dc1b-4af6-9f21-0adbcabb9452
>>     health: HEALTH_ERR
>>             1 filesystem is degraded
>>             1 filesystem is offline
>>             1 mds daemon damaged
>>             5 pgs not scrubbed in time
>>             1 daemons have recently crashed
>>
>>   services:
>>     mon: 5 daemons, quorum vis-hsw-01,vis-skx-01,vis-clx-15,vis-clx-04,vis-icx-00 (age 6m)
>>     mgr: vis-hsw-02(active, since 13d), standbys: vis-skx-02, vis-hsw-04, vis-clx-08, vis-clx-02
>>     mds: 1/2 daemons up, 5 standby
>>     osd: 97 osds: 97 up (since 3h), 97 in (since 4d)
>>
>>   data:
>>     volumes: 1/2 healthy, 1 recovering; 1 damaged
>>     pools:   14 pools, 1961 pgs
>>     objects: 223.70M objects, 304 TiB
>>     usage:   805 TiB used, 383 TiB / 1.2 PiB avail
>>     pgs:     1948 active+clean
>>              9    active+clean+scrubbing+deep
>>              4    active+clean+scrubbing
>>
>>   io:
>>     client:   86 KiB/s rd, 5.5 MiB/s wr, 64 op/s rd, 26 op/s wr
>>
>> I tried restarting all the mds daemons, but they are all marked as “standby”. I also tried restarting all the mons and then the mds daemons again, but that didn’t help.
>>
>> Much help is appreciated!
>>
>> Thank you!
>>
>> George
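For anyone who wants to try the fix from [2] before it appears in a Reef point release, the rough shape of "cherry-pick it and build from source" would be something like the sketch below. This is only an untested outline based on the versions mentioned in this thread; the tag, the GitHub pull ref and the build target are assumptions, and on a Proxmox/Debian setup you would more likely rebuild the packages than install straight from the build tree.

    git clone --recurse-submodules https://github.com/ceph/ceph.git && cd ceph
    git checkout v18.2.2                        # match the running release
    git submodule update --init --recursive     # submodules pinned by the tag
    git fetch origin pull/55265/head            # the fix from [2]
    git cherry-pick FETCH_HEAD                  # cherry-pick each commit separately if the PR has several
    ./install-deps.sh
    ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo
    cd build && ninja ceph-mds                  # the crash above is in the MDS, so ceph-mds is the relevant binary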
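On the “1 mds daemon damaged” state itself: restarting daemons won’t clear it, because the rank stays marked as damaged in the FSMap until it is explicitly marked repaired, and standbys won’t take over a damaged rank. As a hedged sketch only (the filesystem name “cephfs_hdd” and rank 0 are placeholders, not taken from this thread), the usual first steps are to look at the health detail and take a journal backup before attempting anything:

    ceph health detail
    ceph fs status
    # back up and inspect the metadata journal of the damaged rank first
    cephfs-journal-tool --rank=cephfs_hdd:0 journal export /root/cephfs_hdd-journal.bin
    cephfs-journal-tool --rank=cephfs_hdd:0 journal inspect
    # only after the underlying problem is dealt with, clear the damaged flag so a standby can take the rank
    ceph mds repaired cephfs_hdd:0

If the journal inspect reports corruption, the full procedure in the CephFS disaster-recovery documentation applies rather than this sketch.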