Re: cephfs snapshot format upgrade

"Yan, Zheng" <ukernel@xxxxxxxxx> · Wed, 11 Apr 2018 11:50:01 +0800



On Wed, Apr 11, 2018 at 3:34 AM, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
> On Tue, Apr 10, 2018 at 5:54 AM, John Spray <jspray@xxxxxxxxxx> wrote:
>> On Tue, Apr 10, 2018 at 1:44 PM, Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>>> Hello
>>>
>>> To simplify snapshot handling in multiple active mds setups, we changed
>>> the format of snaprealm in mimic dev.
>>> https://github.com/ceph/ceph/pull/16779.
>>>
>>> The new version mds can handle old format snaprealms in a single active
>>> setup. It can also convert an old format snaprealm to the new format when
>>> the snaprealm is modified. The problem is that the new version mds cannot
>>> properly handle old format snaprealms in a multiple active setup. It may
>>> crash when it encounters an old format snaprealm. For an existing
>>> filesystem with snapshots, upgrading mds to mimic seems to be no problem
>>> at first glance. But if the user later enables multiple active mds, the
>>> mds may crash continuously, and there is no easy way to switch back to a
>>> single active mds.
>>>
>>> I don't have a clear idea how to handle this situation. I can think of a
>>> few options.
>>>
>>> 1. Forbid multiple active mds until all old snapshots are deleted or
>>> all snaprealms are converted to the new format. Format conversion
>>> requires traversing the whole filesystem tree, which is not easy to
>>> implement.
>>
>> This has been a general problem with metadata format changes: we can
>> never know if all the metadata in a filesystem has been brought up to
>> a particular version.  Scrubbing (where scrub does the updates) should
>> be the answer, but we don't have the mechanism for recording/ensuring
>> the scrub has really happened.
>>
>> Maybe we need the MDS to be able to report a complete whole-filesystem
>> scrub to the monitor, and record a field like
>> "latest_scrubbed_version" in FSMap, so that we can be sure that all
>> the filesystem metadata has been brought up to a certain version
>> before enabling certain features?  So we'd then have a
>> "latest_scrubbed_version >= mimic" test before enabling multiple
>> active daemons.
>
> Don't we have a (recursive!) last_scrub_[stamp|version] on all
> directories? There's not (yet) a mechanism for associating that with
> specific data versions like you describe here, but for a one-time
> upgrade with unsupported features I don't think we need anything too
> sophisticated.
> -Greg
>
No, we don't. Besides, normal recursive stats (which record the latest
update) do not work for this case. We need a recursive stat that tracks
the oldest update across all directories.
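
To illustrate the difference, here is a minimal sketch. It is not Ceph
code: the Dir class, the format constants and the function names are all
hypothetical. A "latest update" style stat reports the newest value in the
subtree, so a single stale directory stays hidden. Only a recursive minimum
tells us that everything underneath has been converted.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical format numbers, for illustration only.
OLD_FORMAT = 1
NEW_FORMAT = 2


@dataclass
class Dir:
    """Toy directory node; own_format is the format of its own snaprealm."""
    own_format: int
    children: Dict[str, "Dir"] = field(default_factory=dict)


def newest_format(d: Dir) -> int:
    """What a 'latest update' recursive stat would report for the subtree."""
    return max([d.own_format] + [newest_format(c) for c in d.children.values()])


def oldest_format(d: Dir) -> int:
    """The stat we actually need: the oldest format anywhere in the subtree."""
    return min([d.own_format] + [oldest_format(c) for c in d.children.values()])


def safe_for_multimds(root: Dir) -> bool:
    """Safe only if nothing in the tree is still in the old format."""
    return oldest_format(root) >= NEW_FORMAT


root = Dir(NEW_FORMAT, {
    "a": Dir(NEW_FORMAT),
    "b": Dir(NEW_FORMAT, {"c": Dir(OLD_FORMAT)}),
})
print(newest_format(root), oldest_format(root), safe_for_multimds(root))
```

In this toy tree the newest-format stat looks fine at the root, while the
oldest-format stat exposes the unconverted directory b/c. A real
implementation would have to maintain the minimum incrementally instead of
walking the tree, which is the hard part.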

Regards
Yan, Zheng

>>
>> For this particular situation, we'd also need to protect against
>> people who had enabled multimds and snapshots experimentally, with an
>> MDS startup check like:
>>   (ever_allowed_features & CEPH_MDSMAP_ALLOW_SNAPS) &&
>>   (allows_multimds() || in.size() > 1) &&
>>   latest_scrubbed_version < mimic
>>
>> John
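
For reference, the check quoted above can be written out as a small
predicate. This is only a sketch under assumptions: latest_scrubbed_version
is the proposed field and does not exist today, the MdsMapView class is
hypothetical, and the bit value used for the snaps flag and the number used
for mimic are placeholders.

```python
from dataclasses import dataclass

# Placeholder values for illustration; not taken from the Ceph headers.
ALLOW_SNAPS = 1 << 1
MIMIC = 13


@dataclass
class MdsMapView:
    """Hypothetical view of the fields the proposed check would read."""
    ever_allowed_features: int    # bitmask of features ever enabled
    allows_multimds: bool         # multiple active mds currently allowed
    in_ranks: int                 # number of ranks in the "in" set
    latest_scrubbed_version: int  # proposed field, does not exist yet


def must_refuse_start(m: MdsMapView) -> bool:
    """True if snapshots were ever allowed, multimds is in use, and the
    filesystem has not been fully scrubbed at mimic or later."""
    used_snaps = bool(m.ever_allowed_features & ALLOW_SNAPS)
    multi_active = m.allows_multimds or m.in_ranks > 1
    return used_snaps and multi_active and m.latest_scrubbed_version < MIMIC


fs = MdsMapView(ALLOW_SNAPS, allows_multimds=True, in_ranks=2,
                latest_scrubbed_version=12)
print(must_refuse_start(fs))
```

Any of the three conditions being false lets the mds start: no snapshots
ever allowed, a single active rank, or a completed scrub at the new version.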
>>
>>> 2. Ask users to delete all old snapshots before upgrading to mimic,
>>> and make the mds simply ignore old format snaprealms.
>>>
>>>
>>> Regards
>>> Yan, Zheng
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com