Hello again,

we were able to start our MDS server again. We performed the following steps:
ceph fs fail fdi-cephfs
cephfs-journal-tool --rank=cephfs:0 journal export backup.bin
cephfs-journal-tool --rank=cephfs:0 journal inspect
cephfs-journal-tool --rank=cephfs:all event get list
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
cephfs-journal-tool --rank=cephfs:0 journal reset

After that, we were able to start the MDS server again by setting "ceph fs set fdi-cephfs max_mds 1".
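For anyone following the same procedure: the usual way to verify that rank 0 really came back after raising max_mds would be the standard status commands (nothing here is specific to our recovery):

ceph fs status fdi-cephfs   # the rank 0 MDS should be shown as active again
ceph mds stat               # compact summary of active and standby daemons
ceph health detail          # shows any remaining MDS_* health warnings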
Now we are running an online scrub:

ceph tell mds.0 scrub start /projects recursive,repair,force

In the documentation [1] there are further steps mentioned after the journal reset (like cephfs-table-tool all reset session, ceph fs reset <fs name> --yes-i-really-mean-it, cephfs-journal-tool, cephfs-data-scan ... and so forth).
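(To keep an eye on the scrub while it runs, the standard MDS admin commands should work; the target is addressed the same way as in the scrub start command above:)

ceph tell mds.0 scrub status   # shows whether the scrub is still in progress
ceph tell mds.0 damage ls      # lists any damage entries the MDS has recorded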
Is it necessary to perform all these steps? Or is it sufficient to do online scrubbing instead?
Best

[1] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/

On 14.06.23 13:51, Ben Stöver wrote:
Hi everyone,

as discussed on this list before, we had an issue upgrading the metadata servers while performing an upgrade from 17.2.5 to 17.2.6. (See also https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/U3VEPCXYDYO2YSGF76CJLU25YOPEB3XU/#EVEP2MEMEI5HAXLYAXMHMWM6ZLJ2KUR6 .) We had to pause the upgrade, leaving us in a state where the MDS were running on the old version while most other cluster components were already running the newer version. In the meantime we were able to solve this issue and also reduced the number of active MDS from 2 to 1.

Today we resumed the upgrade and all components are on version 17.2.6 now. Right after the upgrade a file system inconsistency (backtrace) was found, which we solved by scrubbing. Shortly after users came back onto the cluster and started using the CephFS again, all MDS failed and cannot be started anymore. The first MDS gets stuck in state "replay (laggy)", while both standby MDS crash right away ("Caught signal (Segmentation fault)").

All MDS report multiple of the following problems in the log:
- "replayed ESubtreeMap at xxx subtree root 0x1 not in cache"
- "replayed ESubtreeMap at xxx subtree root 0x1 is not mine in cache (it's -2,-2)"

As a result, we are currently not able to access our CephFS anymore. Does anyone have an idea how to solve this or what could be the cause? (I have found this ticket, which seems very similar, but no solution is mentioned there: https://bugzilla.redhat.com/show_bug.cgi?id=2056935 )

Below are some parts of the MDS logs that seem relevant to us for this issue. We are thankful for any ideas. :-)

Best
Ben

Log excerpt of Active MDS (replay):

-140> 2023-06-14T07:51:59.585+0000 7feb588c0700 5 asok(0x55e56e23a000) register_command objecter_requests hook 0x55e56e1ca380
-139> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 monclient: _renew_subs
-138> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 monclient: _send_mon_message to mon.ceph-service-01 at v2:131.220.126.65:3300/0
-137> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 log_channel(cluster) update_config to_monitors: true to_syslog: false syslog_facility: prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
-136> 2023-06-14T07:51:59.585+0000 7feb588c0700 4 mds.0.purge_queue operator(): data pool 3 not found in OSDMap
-135> 2023-06-14T07:51:59.585+0000 7feb588c0700 4 mds.0.0 apply_blocklist: killed 0, blocklisted sessions (0 blocklist entries, 0)
-134> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 handle_mds_map i am now mds.0.172004
-133> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 handle_mds_map state change up:standby --> up:replay
-132> 2023-06-14T07:51:59.585+0000 7feb588c0700 5 mds.beacon.fdi-cephfs.ceph-service-01.gfwudy set_want_state: up:standby -> up:replay
-131> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 replay_start
-130> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 waiting for osdmap 341143 (which blocklists prior instance)
-129> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 monclient: _send_mon_message to mon.ceph-service-01 at v2:131.220.126.65:3300/0
-128> 2023-06-14T07:51:59.585+0000 7feb548b8700 2 mds.0.cache Memory usage: total 300004, rss 34196, heap 182556, baseline 182556, 0 / 0 inodes have caps, 0 caps, 0 caps per inode
-127> 2023-06-14T07:51:59.603+0000 7feb588c0700 10 monclient: _renew_subs
-126> 2023-06-14T07:51:59.603+0000 7feb588c0700 10 monclient: _send_mon_message to mon.ceph-service-01 at v2:131.220.126.65:3300/0
-125> 2023-06-14T07:51:59.603+0000 7feb588c0700 10 monclient: handle_get_version_reply finishing 1 version 341143
-124> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening inotable
-123> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening sessionmap
-122> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening mds log
-121> 2023-06-14T07:51:59.603+0000 7feb528b4700 5 mds.0.log open discovering log bounds
-120> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening purge queue (async)
-119> 2023-06-14T07:51:59.603+0000 7feb528b4700 4 mds.0.purge_queue open: opening
-118> 2023-06-14T07:51:59.603+0000 7feb528b4700 1 mds.0.journaler.pq(ro) recover start
-117> 2023-06-14T07:51:59.603+0000 7feb528b4700 1 mds.0.journaler.pq(ro) read_head
-116> 2023-06-14T07:51:59.603+0000 7feb520b3700 4 mds.0.journalpointer Reading journal pointer '400.00000000'
-115> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: loading open file table (async)
-114> 2023-06-14T07:51:59.604+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening snap table
-113> 2023-06-14T07:51:59.604+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56ef7e000 auth_method 0
-112> 2023-06-14T07:51:59.604+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56ef7e800 auth_method 0
-111> 2023-06-14T07:51:59.604+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56ef7f000 auth_method 0
-110> 2023-06-14T07:51:59.604+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56ef92000 auth_method 0
-109> 2023-06-14T07:51:59.605+0000 7feb538b6700 1 mds.0.journaler.pq(ro) _finish_read_head loghead(trim 18366857216, expire 18367220777, write 18367220777, stream_format 1). probing for end of log (from 18367220777)...
-108> 2023-06-14T07:51:59.605+0000 7feb538b6700 1 mds.0.journaler.pq(ro) probing for end of the log
-107> 2023-06-14T07:51:59.605+0000 7feb520b3700 1 mds.0.journaler.mdlog(ro) recover start
-106> 2023-06-14T07:51:59.605+0000 7feb520b3700 1 mds.0.journaler.mdlog(ro) read_head
-105> 2023-06-14T07:51:59.605+0000 7feb520b3700 4 mds.0.log Waiting for journal 0x200 to recover...
-104> 2023-06-14T07:51:59.605+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56ef7f400 auth_method 0
-103> 2023-06-14T07:51:59.605+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56ef92800 auth_method 0
-102> 2023-06-14T07:51:59.605+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56ef7f800 auth_method 0
-101> 2023-06-14T07:51:59.606+0000 7feb528b4700 1 mds.0.journaler.mdlog(ro) _finish_read_head loghead(trim 320109193199616, expire 320109196838948, write 320109612836052, stream_format 1). probing for end of log (from 320109612836052)...
-100> 2023-06-14T07:51:59.606+0000 7feb528b4700 1 mds.0.journaler.mdlog(ro) probing for end of the log
-99> 2023-06-14T07:51:59.606+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56f2ea800 auth_method 0
-98> 2023-06-14T07:51:59.606+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56f2ea000 auth_method 0
-97> 2023-06-14T07:51:59.606+0000 7feb538b6700 1 mds.0.journaler.pq(ro) _finish_probe_end write_pos = 18367220777 (header had 18367220777). recovered.
-96> 2023-06-14T07:51:59.606+0000 7feb538b6700 4 mds.0.purge_queue operator(): open complete
-95> 2023-06-14T07:51:59.606+0000 7feb538b6700 1 mds.0.journaler.pq(ro) set_writeable
-94> 2023-06-14T07:51:59.607+0000 7feb528b4700 1 mds.0.journaler.mdlog(ro) _finish_probe_end write_pos = 320109613139120 (header had 320109612836052). recovered.
-93> 2023-06-14T07:51:59.607+0000 7feb520b3700 4 mds.0.log Journal 0x200 recovered.
-92> 2023-06-14T07:51:59.607+0000 7feb520b3700 4 mds.0.log Recovered journal 0x200 in format 1
-91> 2023-06-14T07:51:59.607+0000 7feb520b3700 2 mds.0.172004 Booting: 1: loading/discovering base inodes
-90> 2023-06-14T07:51:59.607+0000 7feb520b3700 0 mds.0.cache creating system inode with ino:0x100
-89> 2023-06-14T07:51:59.607+0000 7feb520b3700 0 mds.0.cache creating system inode with ino:0x1
-88> 2023-06-14T07:51:59.608+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56f2eb400 auth_method 0
-87> 2023-06-14T07:51:59.615+0000 7feb528b4700 2 mds.0.172004 Booting: 2: replaying mds log
-86> 2023-06-14T07:51:59.615+0000 7feb528b4700 2 mds.0.172004 Booting: 2: waiting for purge queue recovered
-85> 2023-06-14T07:51:59.615+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56f1cec00 auth_method 0
-84> 2023-06-14T07:51:59.615+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56f2eb800 auth_method 0
-83> 2023-06-14T07:51:59.615+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56f1ce400 auth_method 0
-82> 2023-06-14T07:51:59.615+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56ef7e400 auth_method 0
-81> 2023-06-14T07:51:59.615+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56f1cf400 auth_method 0
-80> 2023-06-14T07:51:59.634+0000 7feb510b1700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109197452549 subtree root 0x1 not in cache
-79> 2023-06-14T07:51:59.634+0000 7feb510b1700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-78> 2023-06-14T07:51:59.634+0000 7feb510b1700 0 mds.0.journal journal ambig_subtrees:
-77> 2023-06-14T07:51:59.637+0000 7feb510b1700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109198067908 subtree root 0x1 not in cache
-76> 2023-06-14T07:51:59.637+0000 7feb510b1700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-75> 2023-06-14T07:51:59.637+0000 7feb510b1700 0 mds.0.journal journal ambig_subtrees:
-74> 2023-06-14T07:51:59.640+0000 7feb510b1700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109198682422 subtree root 0x1 not in cache
-73> 2023-06-14T07:51:59.640+0000 7feb510b1700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}

Log excerpt of standby MDS:

-23> 2023-06-14T07:48:56.590+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109205645870 subtree root 0x1 is not mine in cache (it's -2,-2)
-22> 2023-06-14T07:48:56.590+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-21> 2023-06-14T07:48:56.590+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
-20> 2023-06-14T07:48:56.592+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109206260384 subtree root 0x1 is not mine in cache (it's -2,-2)
-19> 2023-06-14T07:48:56.592+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-18> 2023-06-14T07:48:56.592+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
-17> 2023-06-14T07:48:56.595+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109206875743 subtree root 0x1 is not mine in cache (it's -2,-2)
-16> 2023-06-14T07:48:56.595+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-15> 2023-06-14T07:48:56.595+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
-14> 2023-06-14T07:48:56.598+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109207491102 subtree root 0x1 is not mine in cache (it's -2,-2)
-13> 2023-06-14T07:48:56.598+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-12> 2023-06-14T07:48:56.598+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
-11> 2023-06-14T07:48:56.601+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109208105616 subtree root 0x1 is not mine in cache (it's -2,-2)
-10> 2023-06-14T07:48:56.601+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
-9> 2023-06-14T07:48:56.601+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
-8> 2023-06-14T07:48:56.638+0000 7f1507bad700 10 monclient: get_auth_request con 0x556935cb3400 auth_method 0
-7> 2023-06-14T07:48:56.677+0000 7f15073ac700 10 monclient: get_auth_request con 0x556935c92c00 auth_method 0
-6> 2023-06-14T07:48:56.687+0000 7f1506bab700 10 monclient: get_auth_request con 0x5569359f6400 auth_method 0
-5> 2023-06-14T07:48:56.967+0000 7f1507bad700 10 monclient: get_auth_request con 0x556935cb2800 auth_method 0
-4> 2023-06-14T07:48:57.008+0000 7f15073ac700 10 monclient: get_auth_request con 0x556935cb8400 auth_method 0
-3> 2023-06-14T07:48:57.159+0000 7f1506bab700 10 monclient: get_auth_request con 0x556935cb3800 auth_method 0
-2> 2023-06-14T07:48:57.169+0000 7f1507bad700 10 monclient: get_auth_request con 0x556935cb9000 auth_method 0
-1> 2023-06-14T07:48:57.197+0000 7f15073ac700 10 monclient: get_auth_request con 0x5569359f7400 auth_method 0
0> 2023-06-14T07:48:57.317+0000 7f14fcb97700 -1 *** Caught signal (Segmentation fault) **
in thread 7f14fcb97700 thread_name:md_log_replay