Do you have the startup banners for mds.cccephadm14 and 15? It sure
looks like they were running 12.2.2 with the "not writeable with
daemon features" error.
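If you still have the logs, something like this should pull the
banners out (assuming the default log path and cluster name; adjust
to your deployment):

    # Each startup banner includes the daemon's version string:
    grep 'ceph version' /var/log/ceph/ceph-mds.cccephadm14.log
    grep 'ceph version' /var/log/ceph/ceph-mds.cccephadm15.log

    # Cross-check what every running daemon currently reports to the
    # mons (available since Luminous):
    ceph versions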
-- dan

On Wed, Mar 28, 2018 at 3:12 PM, adrien.georget@xxxxxxxxxxx
<adrien.georget@xxxxxxxxxxx> wrote:
> Hi,
>
> All Ceph services were at version 12.2.4.
>
> Adrien
>
>
> On 28/03/2018 at 14:47, Dan van der Ster wrote:
>>
>> Hi,
>>
>> Which versions were those MDSs before and after the standby MDS
>> restarted?
>>
>> Cheers, Dan
>>
>>
>> On Wed, Mar 28, 2018 at 11:11 AM, adrien.georget@xxxxxxxxxxx
>> <adrien.georget@xxxxxxxxxxx> wrote:
>>>
>>> Hi,
>>>
>>> I just had the same issue with our 12.2.4 cluster, but not during an
>>> upgrade.
>>> One of our 3 monitors restarted (the one with a standby MDS) and the
>>> 2 other active MDSs killed themselves:
>>>
>>> 2018-03-28 09:36:24.376888 7f910bc0f700  0 mds.cccephadm14 handle_mds_map
>>> mdsmap compatset compat={},rocompat={},incompat={1=base v0.20,2=client
>>> writeable ranges,3=default file layouts on dirs,4=dir inode in separate
>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
>>> anchor table,9=file layout v2} not writeable with daemon features
>>> compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>> ranges,3=default file layouts on dirs,4=dir inode in separate
>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
>>> uses inline data,8=file layout v2}, killing myself
>>> 2018-03-28 09:36:24.376903 7f910bc0f700  1 mds.cccephadm14 suicide.
>>> wanted state up:active
>>> 2018-03-28 09:36:25.379607 7f910bc0f700  1 mds.1.62 shutdown: shutting
>>> down rank 1
>>>
>>> 2018-03-28 09:36:24.375867 7fad455bf700  0 mds.cccephadm15 handle_mds_map
>>> mdsmap compatset compat={},rocompat={},incompat={1=base v0.20,2=client
>>> writeable ranges,3=default file layouts on dirs,4=dir inode in separate
>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
>>> anchor table,9=file layout v2} not writeable with daemon features
>>> compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>>> ranges,3=default file layouts on dirs,4=dir inode in separate
>>> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
>>> uses inline data,8=file layout v2}, killing myself
>>> 2018-03-28 09:36:24.375883 7fad455bf700  1 mds.cccephadm15 suicide.
>>> wanted state up:active
>>> 2018-03-28 09:36:25.377633 7fad455bf700  1 mds.0.50 shutdown: shutting
>>> down rank 0
>>>
>>> I had to restart the MDS services manually to get them working again.
>>>
>>> Adrien
>>>
>>>
>>> On 21/03/2018 at 11:37, Martin Palma wrote:
>>>>
>>>> Just ran into this problem on our production cluster....
>>>>
>>>> It would have been nice if the release notes of 12.2.4 had been
>>>> updated to inform users about this.
>>>>
>>>> Best,
>>>> Martin
>>>>
>>>> On Wed, Mar 14, 2018 at 9:53 PM, Gregory Farnum <gfarnum@xxxxxxxxxx>
>>>> wrote:
>>>>>
>>>>> On Wed, Mar 14, 2018 at 12:41 PM, Lars Marowsky-Bree <lmb@xxxxxxxx>
>>>>> wrote:
>>>>>>
>>>>>> On 2018-03-14T06:57:08, Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>> Yes. But the real outcome is not "no MDS [is] active" but "some or
>>>>>>> all metadata I/O will pause" -- and there is no avoiding that.
>>>>>>> During an MDS upgrade, a standby must take over the MDS being shut
>>>>>>> down (and upgraded). During takeover, metadata I/O will briefly
>>>>>>> pause while the rank is unavailable. (Specifically, no other rank
>>>>>>> can obtain locks or communicate with the "failed" rank, so metadata
>>>>>>> I/O will necessarily pause until a standby takes over.) A single
>>>>>>> active vs. multiple active upgrade makes little difference in this
>>>>>>> outcome.
>>>>>>
>>>>>> Fair, except that there's no standby MDS at this point in case the
>>>>>> update goes wrong.
>>>>>>
>>>>>>>> Is another approach theoretically feasible? Have the updated MDS
>>>>>>>> only go into the incompatible mode once there's a quorum of new
>>>>>>>> ones available, or something?
>>>>>>>
>>>>>>> I believe so, yes. That option wasn't explored for this patch
>>>>>>> because it was just disambiguating the compatibility flags, and the
>>>>>>> full side effects weren't realized.
>>>>>>
>>>>>> Would such a patch be accepted if we ended up pursuing this? Any
>>>>>> suggestions on how best to go about it?
>>>>>
>>>>> It'd be ugly, but you'd have to set it up so that:
>>>>> * new MDSes advertise the old set of required values
>>>>> * but can identify when all the MDSes are new
>>>>> * then mark somewhere that they can use the correct values
>>>>> * then switch to the proper requirements
>>>>>
>>>>> I don't remember the details of this CompatSet code any more, and
>>>>> it's definitely made trickier by the MDS having no permanent local
>>>>> state. Since we do luckily have both the IDs and the strings, you
>>>>> might be able to do something in the MDSMonitor to identify whether
>>>>> booting MDSes have "too-old", "old-featureset-but-support-new-feature",
>>>>> or "new, correct feature advertising" and then either massage that
>>>>> incoming message down to "old-featureset-but-support-new-feature" (if
>>>>> not all the MDSes are new) or do an auto-upgrade of the required
>>>>> features in the map. And you might also need compatibility code in
>>>>> the MDS to make sure it sends out the appropriate bits on connection,
>>>>> but I *think* the CompatSet checks are only done on the monitor and
>>>>> when an MDS receives an MDSMap.
>>>>> -Greg
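For anyone else hitting the "not writeable with daemon features" error
above: the feature set the current MDSMap requires is visible in the fs
dump, e.g. (commands from a 12.2.x cluster; exact output varies by
release):

    # The map's required features appear on the "compat:" line:
    ceph fs dump | grep compat

    # An MDS that killed itself can be brought back via its systemd unit
    # (instance names are deployment-specific):
    systemctl restart ceph-mds@cccephadm14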