On Tue, Apr 10, 2018 at 1:44 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
> Hello
>
> To simplify snapshot handling in multiple active mds setup, we changed
> the format of snaprealm in mimic dev.
> https://github.com/ceph/ceph/pull/16779.
>
> The new version mds can handle old format snaprealm in single active
> setup. It can also convert old format snaprealm to the new format when
> the snaprealm is modified. The problem is that the new version mds can
> not properly handle old format snaprealm in multiple active setup. It
> may crash when it encounters an old format snaprealm. For an existing
> filesystem with snapshots, upgrading mds to mimic seems to be no
> problem at first glance. But if the user later enables multiple active
> mds, the mds may crash continuously. There is no easy way to switch
> back to single active mds.
>
> I don't have a clear idea how to handle this situation. I can think of
> a few options.
>
> 1. Forbid multiple active before all old snapshots are deleted or
> before all snaprealms are converted to new format. Format conversion
> requires traversing the whole filesystem tree. Not easy to
> implement.

This has been a general problem with metadata format changes: we can
never know if all the metadata in a filesystem has been brought up to
a particular version. Scrubbing (where scrub does the updates) should
be the answer, but we don't have the mechanism for recording/ensuring
the scrub has really happened.

Maybe we need the MDS to be able to report a complete whole-filesystem
scrub to the monitor, and record a field like
"latest_scrubbed_version" in FSMap, so that we can be sure that all
the filesystem metadata has been brought up to a certain version
before enabling certain features? So we'd then have a
"latest_scrubbed_version >= mimic" test before enabling multiple
active daemons.

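A rough sketch of what such a gate could look like, purely to
illustrate the idea. Everything in it (FsState, Release, Result,
set_max_mds, record_full_scrub, demo) is a hypothetical stand-in and
not existing Ceph code; only the latest_scrubbed_version name comes
from the proposal above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Illustrative release ordinals; only their relative order matters here.
enum class Release : std::uint8_t { luminous = 12, mimic = 13 };

// Hypothetical stand-in for the per-filesystem state kept in FSMap.
struct FsState {
    // Proposed field: newest release for which the MDS has reported a
    // complete whole-filesystem scrub to the monitor.
    Release latest_scrubbed_version = Release::luminous;
    int max_mds = 1;
};

// Outcome of a request to change the number of active MDS daemons.
struct Result {
    bool ok;
    std::string reason;
};

// Monitor-side gate (assumed): refuse multiple active daemons until all
// metadata is known to have been scrubbed at mimic level or later.
Result set_max_mds(FsState& fs, int requested) {
    if (requested < 1) {
        return {false, "max_mds must be at least 1"};
    }
    if (requested > 1 && fs.latest_scrubbed_version < Release::mimic) {
        return {false, "full scrub at mimic or later required first"};
    }
    fs.max_mds = requested;
    return {true, ""};
}

// Called when the MDS reports a completed whole-filesystem scrub.
// The recorded version only ever moves forward.
void record_full_scrub(FsState& fs, Release scrubbed_at) {
    if (scrubbed_at > fs.latest_scrubbed_version) {
        fs.latest_scrubbed_version = scrubbed_at;
    }
}

// Usage sketch: the request is refused until a mimic-level scrub is recorded.
inline void demo() {
    FsState fs;
    assert(!set_max_mds(fs, 2).ok);
    record_full_scrub(fs, Release::mimic);
    assert(set_max_mds(fs, 2).ok);
}
```

The point of keeping the field monotonic is that a later scrub by an
older daemon can't silently undo the guarantee.
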
For this particular situation, we'd also need to protect against
people who had enabled multimds and snapshots experimentally, with an
MDS startup check like:

  ((ever_allowed_features & CEPH_MDSMAP_ALLOW_SNAPS) &&
   (allows_multimds() || in.size() > 1)) &&
  latest_scrubbed_version < mimic

John

> 2. Ask user to delete all old snapshots before upgrading to mimic,
> make mds just ignore old format snaprealms.
>
>
> Regards
> Yan, Zheng
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com