Hi,

I recently did a fresh install of Ceph Octopus 15.2.3. After a few days, the two standby MDS daemons suddenly crashed with a segmentation fault. I tried to restart them, but they do not start. Here is the error:

   -20> 2020-07-17T13:50:27.888+0000 7fc8c6c51700 10 monclient: _renew_subs
   -19> 2020-07-17T13:50:27.888+0000 7fc8c6c51700 10 monclient: _send_mon_message to mon.2 at v1:172.31.36.98:6789/0
   -18> 2020-07-17T13:50:27.888+0000 7fc8c6c51700 10 monclient: handle_get_version_reply finishing 0x559dcf9530c0 version 269
   -17> 2020-07-17T13:50:27.888+0000 7fc8c6c51700 10 monclient: handle_get_version_reply finishing 0x559dcfa87520 version 269
   -16> 2020-07-17T13:50:27.888+0000 7fc8c6c51700 10 monclient: handle_get_version_reply finishing 0x559dcfa875c0 version 269
   -15> 2020-07-17T13:50:27.888+0000 7fc8c6c51700 10 monclient: handle_get_version_reply finishing 0x559dcfa871c0 version 269
   -14> 2020-07-17T13:50:27.888+0000 7fc8c8c55700 10 monclient: get_auth_request con 0x559dcfada000 auth_method 0
   -13> 2020-07-17T13:50:27.888+0000 7fc8c9456700 10 monclient: get_auth_request con 0x559dcfada800 auth_method 0
   -12> 2020-07-17T13:50:27.892+0000 7fc8bfc43700  1 mds.282966.journaler.mdlog(ro) recover start
   -11> 2020-07-17T13:50:27.892+0000 7fc8bfc43700  1 mds.282966.journaler.mdlog(ro) read_head
   -10> 2020-07-17T13:50:27.892+0000 7fc8bfc43700  4 mds.0.log Waiting for journal 0x200 to recover...
    -9> 2020-07-17T13:50:27.893+0000 7fc8c0444700  1 mds.282966.journaler.mdlog(ro) _finish_read_head loghead(trim 4194304, expire 4231216, write 4329405, stream_format 1). probing for end of log (from 4329405)...
    -8> 2020-07-17T13:50:27.893+0000 7fc8c0444700  1 mds.282966.journaler.mdlog(ro) probing for end of the log
    -7> 2020-07-17T13:50:27.893+0000 7fc8c0444700  1 mds.282966.journaler.mdlog(ro) _finish_probe_end write_pos = 4329949 (header had 4329405). recovered.
    -6> 2020-07-17T13:50:27.893+0000 7fc8bfc43700  4 mds.0.log Journal 0x200 recovered.
    -5> 2020-07-17T13:50:27.893+0000 7fc8bfc43700  4 mds.0.log Recovered journal 0x200 in format 1
    -4> 2020-07-17T13:50:27.893+0000 7fc8bfc43700  2 mds.0.0 Booting: 1: loading/discovering base inodes
    -3> 2020-07-17T13:50:27.893+0000 7fc8bfc43700  0 mds.0.cache creating system inode with ino:0x100
    -2> 2020-07-17T13:50:27.894+0000 7fc8bfc43700  0 mds.0.cache creating system inode with ino:0x1
    -1> 2020-07-17T13:50:27.894+0000 7fc8c0444700  2 mds.0.0 Booting: 2: replaying mds log
     0> 2020-07-17T13:50:27.896+0000 7fc8bec41700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fc8bec41700 thread_name:md_log_replay

Here is the cluster information:

# ceph status
  cluster:
    id:     dd024fe1-4996-4fed-ba57-03090e53724d
    health: HEALTH_WARN
            20 daemons have recently crashed

  services:
    mon: 3 daemons, quorum 2,0,1 (age 2d)
    mgr: mgr.0(active, since 9d), standbys: mgr.2, mgr.1
    mds: cephfs:1 {0=node0=up:active} 1 up:standby-replay 1 up:standby
    osd: 3 osds: 3 up (since 28h), 3 in (since 9d)

  task status:
    scrub status:
        mds.node0: idle
        mds.node2: idle

  data:
    pools:   3 pools, 49 pgs
    objects: 29 objects, 170 KiB
    usage:   3.0 GiB used, 41 TiB / 41 TiB avail
    pgs:     49 active+clean

  io:
    client:   853 B/s rd, 1 op/s rd, 0 op/s wr

There is only one client connected to the cluster.

Does anyone have any idea what is going on?

Thanks
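P.S. For completeness, this is roughly how I have been restarting the standbys and how I can pull the full crash backtrace if that would help. The hostname "node1" and the crash ID below are placeholders, and the unit name assumes the standard ceph-mds systemd packaging:

# systemctl restart ceph-mds@node1      <- restart one of the crashed standby MDS daemons
# journalctl -u ceph-mds@node1          <- recent log output for that daemon
# ceph crash ls                         <- list the crashes recorded by the crash module
# ceph crash info <crash-id>            <- full backtrace for a single crash

I can post the "ceph crash info" output for one of the segfaults if that is useful.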