On Wed, Feb 28, 2018 at 11:05 AM, John Spray <jspray@xxxxxxxxxx> wrote:
> On Wed, Feb 28, 2018 at 9:37 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>> Hi all,
>>
>> I'm just updating our test cluster from 12.2.2 to 12.2.4. Mon's and
>> OSD's updated fine.
>>
>> When updating the MDS's (we have 2 active and 1 standby), I started
>> with the standby.
>>
>> At the moment the standby MDS restarted into 12.2.4 [1], both active
>> MDSs (still running 12.2.2) suicided like this:
>>
>> 2018-02-28 10:25:22.761413 7f03da1b9700 0 mds.cephdwightmds0
>> handle_mds_map mdsmap compatset compat={},rocompat={},incompat={1=base
>> v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir
>> inode in separate object,5=mds uses versioned encoding,6=dirfrag is
>> stored in omap,8=no anchor table,9=file layout v2} not writeable with
>> daemon features compat={},rocompat={},incompat={1=base v0.20,2=client
>> writeable ranges,3=default file layouts on dirs,4=dir inode in
>> separate object,5=mds uses versioned encoding,6=dirfrag is stored in
>> omap,7=mds uses inline data,8=file layout v2}, killing myself
>> 2018-02-28 10:25:22.761429 7f03da1b9700 1 mds.cephdwightmds0 suicide.
>> wanted state up:active
>> 2018-02-28 10:25:23.763226 7f03da1b9700 1 mds.0.18147 shutdown:
>> shutting down rank 0
>>
>>
>> 2018-02-28 10:25:22.761590 7f11df538700 0 mds.cephdwightmds1
>> handle_mds_map mdsmap compatset compat={},rocompat={},incompat={1=base
>> v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir
>> inode in separate object,5=mds uses versioned encoding,6=dirfrag is
>> stored in omap,8=no anchor table,9=file layout v2} not writeable with
>> daemon features compat={},rocompat={},incompat={1=base v0.20,2=client
>> writeable ranges,3=default file layouts on dirs,4=dir inode in
>> separate object,5=mds uses versioned encoding,6=dirfrag is stored in
>> omap,7=mds uses inline data,8=file layout v2}, killing myself
>> 2018-02-28 10:25:22.761613 7f11df538700 1 mds.cephdwightmds1 suicide.
>> wanted state up:active
>> 2018-02-28 10:25:23.765653 7f11df538700 1 mds.1.18366 shutdown:
>> shutting down rank 1
>
> That's not good!
>
> From looking at the commits between 12.2.2 and 12.2.4, this one looks
> suspicious:
>
> commit ddba907279719631903e3a20543056d81d176a1b
> Author: Yan, Zheng <zyan@xxxxxxxxxx>
> Date:   Tue Oct 31 16:56:51 2017 +0800
>
>     mds: fix MDS_FEATURE_INCOMPAT_FILE_LAYOUT_V2 definition
>
>     Fixes: http://tracker.ceph.com/issues/21985
>     Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
>     (cherry picked from commit 6c1543dfc55d6db8493535b9b62a30236cf8c638)

Apologies for the noise, my mail client hadn't loaded the earlier
responses in which this was already pointed out.

John

> John
>
>
>
>>
>>
>> The cephfs cluster was down until I updated all MDS's to 12.2.4 --
>> then they restarted cleanly.
>>
>> Looks like a pretty serious bug??!!
>>
>> Cheers, Dan
>>
>>
>> [1] here is the standby restarting, 4 seconds before the active MDS's suicided:
>>
>> 2018-02-28 10:25:18.222865 7f9f1ea3b1c0 0 set uid:gid to 167:167 (ceph:ceph)
>> 2018-02-28 10:25:18.222892 7f9f1ea3b1c0 0 ceph version 12.2.4
>> (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable), process
>> (unknown), pid 10648
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
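
For anyone reading the "not writeable with daemon features" message in the
log above: it lists the incompat features recorded in the mdsmap next to the
incompat features the 12.2.2 daemon knows about. The sketch below is a rough
illustration only, not Ceph's code. It assumes the check boils down to "every
incompat feature id in the map must also be known to the daemon", which is an
assumption on my part; the helper names parse_incompat and missing_features
are made up for this example. The two strings are copied from the log.

```python
import re


def parse_incompat(text):
    """Return {feature_id: name} parsed from an 'incompat={1=a,2=b}' fragment."""
    match = re.search(r"incompat=\{([^}]*)\}", text)
    if match is None:
        return {}
    features = {}
    for item in match.group(1).split(","):
        if not item.strip():
            continue  # empty set, e.g. 'incompat={}'
        fid, _, name = item.partition("=")
        features[int(fid)] = name.strip()
    return features


def missing_features(map_set, daemon_set):
    """Feature ids required by the map that the daemon does not list.

    Assumption: only the numeric id matters for this comparison.
    """
    return {fid: name for fid, name in map_set.items() if fid not in daemon_set}


# Compat sets copied from the log lines in this thread.
MDSMAP = (
    "compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,"
    "3=default file layouts on dirs,4=dir inode in separate object,"
    "5=mds uses versioned encoding,6=dirfrag is stored in omap,"
    "8=no anchor table,9=file layout v2}"
)
DAEMON = (
    "compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,"
    "3=default file layouts on dirs,4=dir inode in separate object,"
    "5=mds uses versioned encoding,6=dirfrag is stored in omap,"
    "7=mds uses inline data,8=file layout v2}"
)

missing = missing_features(parse_incompat(MDSMAP), parse_incompat(DAEMON))
print(missing)  # {9: 'file layout v2'}
```

Under that assumption, the only id the old daemon lacks is 9. Note also that
id 8 carries different names on the two sides ("no anchor table" in the map,
"file layout v2" in the daemon), which looks consistent with the
MDS_FEATURE_INCOMPAT_FILE_LAYOUT_V2 definition change in the commit quoted
above.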