I hit a very similar issue with my Ceph cluster that I had assumed was caused by faulty RAM, but the timing also matches my upgrade from 17.2.5 to 17.2.6. I took the same recovery steps, stopping at the online deep scrub, and everything appears to be working fine now.
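In case it helps, the recovery steps I am referring to roughly follow the generic CephFS journal recovery procedure from the upstream documentation, finished off with the online scrub. A minimal sketch is below; the file system name "myfs" and the backup path are placeholders, so adapt them to your cluster and double-check each command against the docs for your exact release before running anything destructive:

    # Placeholders: "myfs" = file system name; the backup path is only an example.
    # Take the ranks offline and back up the journal before touching anything.
    ceph fs fail myfs
    cephfs-journal-tool --rank=myfs:0 journal export /root/myfs-journal-backup.bin

    # Salvage what the journal still holds, then reset the journal and session table.
    cephfs-journal-tool --rank=myfs:0 event recover_dentries summary
    cephfs-journal-tool --rank=myfs:0 journal reset
    cephfs-table-tool myfs:all reset session

    # Allow the MDS to join again, then run the online recursive scrub/repair.
    ceph fs set myfs joinable true
    ceph tell mds.myfs:0 scrub start / recursive,repair
    ceph tell mds.myfs:0 scrub status

Afterwards it is probably worth checking "ceph fs status" and "ceph tell mds.myfs:0 damage ls" before letting clients back onto the file system.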
On Wed, Jun 14, 2023, 6:55 AM Ben Stöver <bstoever@xxxxxxxxxxx> wrote:

> Hi everyone,
>
> As discussed on this list before, we had an issue upgrading the metadata
> servers while performing an upgrade from 17.2.5 to 17.2.6. (See also
> https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/U3VEPCXYDYO2YSGF76CJLU25YOPEB3XU/#EVEP2MEMEI5HAXLYAXMHMWM6ZLJ2KUR6 .)
> We had to pause the upgrade, leaving us in a state where the MDS were
> running on the old version, while most other cluster components were
> already running the newer version. In the meantime we were able to solve
> this issue and also reduced the number of active MDS from 2 to 1.
>
> Today we resumed the upgrade and all components are on version 17.2.6
> now. Right after the upgrade a file system inconsistency (backtrace) was
> found, which we solved by scrubbing. Shortly after users came back onto
> the cluster and started using the CephFS again, all MDS failed and cannot
> be started again. The first MDS gets stuck in state "replay (laggy)",
> while both standby MDS crash right away ("Caught signal (Segmentation
> fault)").
>
> All MDS report multiple instances of the following problems in the log:
> - "replayed ESubtreeMap at xxx subtree root 0x1 not in cache"
> - "replayed ESubtreeMap at xxx subtree root 0x1 is not mine in cache (it's -2,-2)"
>
> As a result, we are currently not able to access our CephFS anymore.
> Does anyone have an idea how to solve this or what could be the cause?
> (I have found this ticket, which seems to be very similar, but there does
> not seem to be a solution mentioned there:
> https://bugzilla.redhat.com/show_bug.cgi?id=2056935 )
>
> Below are some parts of the MDS logs that seem relevant to us for this
> issue.
>
> We are thankful for any ideas. :-)
>
> Best
> Ben
>
>
> Log excerpt of Active MDS (replay):
>
> -140> 2023-06-14T07:51:59.585+0000 7feb588c0700 5 asok(0x55e56e23a000) register_command objecter_requests hook 0x55e56e1ca380
> -139> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 monclient: _renew_subs
> -138> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 monclient: _send_mon_message to mon.ceph-service-01 at v2:131.220.126.65:3300/0
> -137> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 log_channel(cluster) update_config to_monitors: true to_syslog: false syslog_facility: prio: info to_graylog: false graylog_host: 127.0.0.1 graylog_port: 12201)
> -136> 2023-06-14T07:51:59.585+0000 7feb588c0700 4 mds.0.purge_queue operator(): data pool 3 not found in OSDMap
> -135> 2023-06-14T07:51:59.585+0000 7feb588c0700 4 mds.0.0 apply_blocklist: killed 0, blocklisted sessions (0 blocklist entries, 0)
> -134> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 handle_mds_map i am now mds.0.172004
> -133> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 handle_mds_map state change up:standby --> up:replay
> -132> 2023-06-14T07:51:59.585+0000 7feb588c0700 5 mds.beacon.fdi-cephfs.ceph-service-01.gfwudy set_want_state: up:standby -> up:replay
> -131> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 replay_start
> -130> 2023-06-14T07:51:59.585+0000 7feb588c0700 1 mds.0.172004 waiting for osdmap 341143 (which blocklists prior instance)
> -129> 2023-06-14T07:51:59.585+0000 7feb588c0700 10 monclient: _send_mon_message to mon.ceph-service-01 at v2:131.220.126.65:3300/0
> -128> 2023-06-14T07:51:59.585+0000 7feb548b8700 2 mds.0.cache Memory usage: total 300004, rss 34196, heap 182556, baseline 182556, 0 / 0 inodes have caps, 0 caps, 0 caps per inode
> -127> 2023-06-14T07:51:59.603+0000 7feb588c0700 10 monclient: _renew_subs
> -126> 2023-06-14T07:51:59.603+0000 7feb588c0700 10 monclient: _send_mon_message to mon.ceph-service-01 at v2:131.220.126.65:3300/0
> -125> 2023-06-14T07:51:59.603+0000 7feb588c0700 10 monclient: handle_get_version_reply finishing 1 version 341143
> -124> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening inotable
> -123> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening sessionmap
> -122> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening mds log
> -121> 2023-06-14T07:51:59.603+0000 7feb528b4700 5 mds.0.log open discovering log bounds
> -120> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening purge queue (async)
> -119> 2023-06-14T07:51:59.603+0000 7feb528b4700 4 mds.0.purge_queue open: opening
> -118> 2023-06-14T07:51:59.603+0000 7feb528b4700 1 mds.0.journaler.pq(ro) recover start
> -117> 2023-06-14T07:51:59.603+0000 7feb528b4700 1 mds.0.journaler.pq(ro) read_head
> -116> 2023-06-14T07:51:59.603+0000 7feb520b3700 4 mds.0.journalpointer Reading journal pointer '400.00000000'
> -115> 2023-06-14T07:51:59.603+0000 7feb528b4700 2 mds.0.172004 Booting: 0: loading open file table (async)
> -114> 2023-06-14T07:51:59.604+0000 7feb528b4700 2 mds.0.172004 Booting: 0: opening snap table
> -113> 2023-06-14T07:51:59.604+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56ef7e000 auth_method 0
> -112> 2023-06-14T07:51:59.604+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56ef7e800 auth_method 0
> -111> 2023-06-14T07:51:59.604+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56ef7f000 auth_method 0
> -110> 2023-06-14T07:51:59.604+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56ef92000 auth_method 0
> -109> 2023-06-14T07:51:59.605+0000 7feb538b6700 1 mds.0.journaler.pq(ro) _finish_read_head loghead(trim 18366857216, expire 18367220777, write 18367220777, stream_format 1). probing for end of log (from 18367220777)...
> -108> 2023-06-14T07:51:59.605+0000 7feb538b6700 1 mds.0.journaler.pq(ro) probing for end of the log
> -107> 2023-06-14T07:51:59.605+0000 7feb520b3700 1 mds.0.journaler.mdlog(ro) recover start
> -106> 2023-06-14T07:51:59.605+0000 7feb520b3700 1 mds.0.journaler.mdlog(ro) read_head
> -105> 2023-06-14T07:51:59.605+0000 7feb520b3700 4 mds.0.log Waiting for journal 0x200 to recover...
> -104> 2023-06-14T07:51:59.605+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56ef7f400 auth_method 0
> -103> 2023-06-14T07:51:59.605+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56ef92800 auth_method 0
> -102> 2023-06-14T07:51:59.605+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56ef7f800 auth_method 0
> -101> 2023-06-14T07:51:59.606+0000 7feb528b4700 1 mds.0.journaler.mdlog(ro) _finish_read_head loghead(trim 320109193199616, expire 320109196838948, write 320109612836052, stream_format 1). probing for end of log (from 320109612836052)...
> -100> 2023-06-14T07:51:59.606+0000 7feb528b4700 1 mds.0.journaler.mdlog(ro) probing for end of the log
> -99> 2023-06-14T07:51:59.606+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56f2ea800 auth_method 0
> -98> 2023-06-14T07:51:59.606+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56f2ea000 auth_method 0
> -97> 2023-06-14T07:51:59.606+0000 7feb538b6700 1 mds.0.journaler.pq(ro) _finish_probe_end write_pos = 18367220777 (header had 18367220777). recovered.
> -96> 2023-06-14T07:51:59.606+0000 7feb538b6700 4 mds.0.purge_queue operator(): open complete
> -95> 2023-06-14T07:51:59.606+0000 7feb538b6700 1 mds.0.journaler.pq(ro) set_writeable
> -94> 2023-06-14T07:51:59.607+0000 7feb528b4700 1 mds.0.journaler.mdlog(ro) _finish_probe_end write_pos = 320109613139120 (header had 320109612836052). recovered.
> -93> 2023-06-14T07:51:59.607+0000 7feb520b3700 4 mds.0.log Journal 0x200 recovered.
> -92> 2023-06-14T07:51:59.607+0000 7feb520b3700 4 mds.0.log Recovered journal 0x200 in format 1
> -91> 2023-06-14T07:51:59.607+0000 7feb520b3700 2 mds.0.172004 Booting: 1: loading/discovering base inodes
> -90> 2023-06-14T07:51:59.607+0000 7feb520b3700 0 mds.0.cache creating system inode with ino:0x100
> -89> 2023-06-14T07:51:59.607+0000 7feb520b3700 0 mds.0.cache creating system inode with ino:0x1
> -88> 2023-06-14T07:51:59.608+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56f2eb400 auth_method 0
> -87> 2023-06-14T07:51:59.615+0000 7feb528b4700 2 mds.0.172004 Booting: 2: replaying mds log
> -86> 2023-06-14T07:51:59.615+0000 7feb528b4700 2 mds.0.172004 Booting: 2: waiting for purge queue recovered
> -85> 2023-06-14T07:51:59.615+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56f1cec00 auth_method 0
> -84> 2023-06-14T07:51:59.615+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56f2eb800 auth_method 0
> -83> 2023-06-14T07:51:59.615+0000 7feb5c0c7700 10 monclient: get_auth_request con 0x55e56f1ce400 auth_method 0
> -82> 2023-06-14T07:51:59.615+0000 7feb5b8c6700 10 monclient: get_auth_request con 0x55e56ef7e400 auth_method 0
> -81> 2023-06-14T07:51:59.615+0000 7feb5b0c5700 10 monclient: get_auth_request con 0x55e56f1cf400 auth_method 0
> -80> 2023-06-14T07:51:59.634+0000 7feb510b1700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109197452549 subtree root 0x1 not in cache
> -79> 2023-06-14T07:51:59.634+0000 7feb510b1700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -78> 2023-06-14T07:51:59.634+0000 7feb510b1700 0 mds.0.journal journal ambig_subtrees:
> -77> 2023-06-14T07:51:59.637+0000 7feb510b1700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109198067908 subtree root 0x1 not in cache
> -76> 2023-06-14T07:51:59.637+0000 7feb510b1700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -75> 2023-06-14T07:51:59.637+0000 7feb510b1700 0 mds.0.journal journal ambig_subtrees:
> -74> 2023-06-14T07:51:59.640+0000 7feb510b1700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109198682422 subtree root 0x1 not in cache
> -73> 2023-06-14T07:51:59.640+0000 7feb510b1700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
>
>
> Log excerpt of standby MDS:
>
> -23> 2023-06-14T07:48:56.590+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109205645870 subtree root 0x1 is not mine in cache (it's -2,-2)
> -22> 2023-06-14T07:48:56.590+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -21> 2023-06-14T07:48:56.590+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
> -20> 2023-06-14T07:48:56.592+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109206260384 subtree root 0x1 is not mine in cache (it's -2,-2)
> -19> 2023-06-14T07:48:56.592+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -18> 2023-06-14T07:48:56.592+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
> -17> 2023-06-14T07:48:56.595+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109206875743 subtree root 0x1 is not mine in cache (it's -2,-2)
> -16> 2023-06-14T07:48:56.595+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -15> 2023-06-14T07:48:56.595+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
> -14> 2023-06-14T07:48:56.598+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109207491102 subtree root 0x1 is not mine in cache (it's -2,-2)
> -13> 2023-06-14T07:48:56.598+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -12> 2023-06-14T07:48:56.598+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
> -11> 2023-06-14T07:48:56.601+0000 7f14fcb97700 -1 log_channel(cluster) log [ERR] : replayed ESubtreeMap at 320109208105616 subtree root 0x1 is not mine in cache (it's -2,-2)
> -10> 2023-06-14T07:48:56.601+0000 7f14fcb97700 0 mds.0.journal journal subtrees: {0x1=[],0x100=[]}
> -9> 2023-06-14T07:48:56.601+0000 7f14fcb97700 0 mds.0.journal journal ambig_subtrees:
> -8> 2023-06-14T07:48:56.638+0000 7f1507bad700 10 monclient: get_auth_request con 0x556935cb3400 auth_method 0
> -7> 2023-06-14T07:48:56.677+0000 7f15073ac700 10 monclient: get_auth_request con 0x556935c92c00 auth_method 0
> -6> 2023-06-14T07:48:56.687+0000 7f1506bab700 10 monclient: get_auth_request con 0x5569359f6400 auth_method 0
> -5> 2023-06-14T07:48:56.967+0000 7f1507bad700 10 monclient: get_auth_request con 0x556935cb2800 auth_method 0
> -4> 2023-06-14T07:48:57.008+0000 7f15073ac700 10 monclient: get_auth_request con 0x556935cb8400 auth_method 0
> -3> 2023-06-14T07:48:57.159+0000 7f1506bab700 10 monclient: get_auth_request con 0x556935cb3800 auth_method 0
> -2> 2023-06-14T07:48:57.169+0000 7f1507bad700 10 monclient: get_auth_request con 0x556935cb9000 auth_method 0
> -1> 2023-06-14T07:48:57.197+0000 7f15073ac700 10 monclient: get_auth_request con 0x5569359f7400 auth_method 0
> 0> 2023-06-14T07:48:57.317+0000 7f14fcb97700 -1 *** Caught signal (Segmentation fault) **
> in thread 7f14fcb97700 thread_name:md_log_replay

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx