On Wed, Feb 28, 2018 at 9:37 AM, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> Hi all,
>
> I'm just updating our test cluster from 12.2.2 to 12.2.4. Mon's and
> OSD's updated fine.
>
> When updating the MDS's (we have 2 active and 1 standby), I started
> with the standby.
>
> At the moment the standby MDS restarted into 12.2.4 [1], both active
> MDSs (still running 12.2.2) suicided like this:
>
> 2018-02-28 10:25:22.761413 7f03da1b9700 0 mds.cephdwightmds0
> handle_mds_map mdsmap compatset compat={},rocompat={},incompat={1=base
> v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir
> inode in separate object,5=mds uses versioned encoding,6=dirfrag is
> stored in omap,8=no anchor table,9=file layout v2} not writeable with
> daemon features compat={},rocompat={},incompat={1=base v0.20,2=client
> writeable ranges,3=default file layouts on dirs,4=dir inode in
> separate object,5=mds uses versioned encoding,6=dirfrag is stored in
> omap,7=mds uses inline data,8=file layout v2}, killing myself
> 2018-02-28 10:25:22.761429 7f03da1b9700 1 mds.cephdwightmds0 suicide.
> wanted state up:active
> 2018-02-28 10:25:23.763226 7f03da1b9700 1 mds.0.18147 shutdown:
> shutting down rank 0
>
>
> 2018-02-28 10:25:22.761590 7f11df538700 0 mds.cephdwightmds1
> handle_mds_map mdsmap compatset compat={},rocompat={},incompat={1=base
> v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir
> inode in separate object,5=mds uses versioned encoding,6=dirfrag is
> stored in omap,8=no anchor table,9=file layout v2} not writeable with
> daemon features compat={},rocompat={},incompat={1=base v0.20,2=client
> writeable ranges,3=default file layouts on dirs,4=dir inode in
> separate object,5=mds uses versioned encoding,6=dirfrag is stored in
> omap,7=mds uses inline data,8=file layout v2}, killing myself
> 2018-02-28 10:25:22.761613 7f11df538700 1 mds.cephdwightmds1 suicide.
> wanted state up:active
> 2018-02-28 10:25:23.765653 7f11df538700 1 mds.1.18366 shutdown:
> shutting down rank 1

That's not good!

From looking at the commits between 12.2.2 and 12.2.4, this one looks
suspicious:

commit ddba907279719631903e3a20543056d81d176a1b
Author: Yan, Zheng <zyan@xxxxxxxxxx>
Date:   Tue Oct 31 16:56:51 2017 +0800

    mds: fix MDS_FEATURE_INCOMPAT_FILE_LAYOUT_V2 definition

    Fixes: http://tracker.ceph.com/issues/21985
    Signed-off-by: "Yan, Zheng" <zyan@xxxxxxxxxx>
    (cherry picked from commit 6c1543dfc55d6db8493535b9b62a30236cf8c638)

John

>
>
> The cephfs cluster was down until I updated all MDS's to 12.2.4 --
> then they restarted cleanly.
>
> Looks like a pretty serious bug??!!
>
> Cheers, Dan
>
>
> [1] here is the standby restarting, 4 seconds before the active MDS's suicided:
>
> 2018-02-28 10:25:18.222865 7f9f1ea3b1c0 0 set uid:gid to 167:167 (ceph:ceph)
> 2018-02-28 10:25:18.222892 7f9f1ea3b1c0 0 ceph version 12.2.4
> (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable), process
> (unknown), pid 10648
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
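
For anyone reading the log above, here is a rough sketch of why the 12.2.2 daemons reject the new map. It parses the two incompat sets copied from the log message and compares them by feature ID. It assumes the "not writeable" check boils down to "every incompat ID in the mdsmap must also be known to the daemon", which is a simplification and not Ceph's actual CompatSet code. The helper names (parse_incompat, missing_ids, renamed_ids) are made up for illustration.

```python
import re

# Incompat sets copied from the log message quoted in this thread.
MDSMAP_INCOMPAT = (
    "incompat={1=base v0.20,2=client writeable ranges,"
    "3=default file layouts on dirs,4=dir inode in separate object,"
    "5=mds uses versioned encoding,6=dirfrag is stored in omap,"
    "8=no anchor table,9=file layout v2}"
)
DAEMON_INCOMPAT = (
    "incompat={1=base v0.20,2=client writeable ranges,"
    "3=default file layouts on dirs,4=dir inode in separate object,"
    "5=mds uses versioned encoding,6=dirfrag is stored in omap,"
    "7=mds uses inline data,8=file layout v2}"
)


def parse_incompat(text):
    """Return {feature_id: name} parsed from an 'incompat={...}' fragment."""
    body = re.search(r"incompat=\{([^}]*)\}", text).group(1)
    features = {}
    for item in body.split(","):
        if not item:
            continue  # empty set, e.g. 'incompat={}'
        fid, name = item.split("=", 1)
        features[int(fid)] = name.strip()
    return features


def missing_ids(mdsmap, daemon):
    """Feature IDs the map requires that the daemon does not know (assumed check)."""
    return {fid: name for fid, name in mdsmap.items() if fid not in daemon}


def renamed_ids(mdsmap, daemon):
    """IDs present on both sides but carrying different names."""
    return {
        fid: (mdsmap[fid], daemon[fid])
        for fid in mdsmap.keys() & daemon.keys()
        if mdsmap[fid] != daemon[fid]
    }


mdsmap = parse_incompat(MDSMAP_INCOMPAT)
daemon = parse_incompat(DAEMON_INCOMPAT)
missing = missing_ids(mdsmap, daemon)
renamed = renamed_ids(mdsmap, daemon)

print("required by mdsmap, unknown to daemon:", missing)
print("same ID, different name (mdsmap, daemon):", renamed)
```

With the sets from the log, the map requires ID 9 ("file layout v2"), which the 12.2.2 daemon does not have, and ID 8 means "no anchor table" in the map but "file layout v2" in the daemon. That is consistent with the commit above changing the MDS_FEATURE_INCOMPAT_FILE_LAYOUT_V2 definition.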