Re: CEPHFS - MDS graceful handover of rank 0

On 1/27/21 9:08 AM, Martin Hronek wrote:


So before the next MDS restart the FS config was changed to one active and one standby-replay node; the idea was that, since the standby-replay node follows the active one, the handover would be smoother. The active state was reached faster, but we still noticed some hiccups on the clients while the new active MDS was waiting for clients to reconnect (state up:reconnect) after the failover.
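For reference, enabling standby-replay for a file system looks roughly like this, with <fs_name> being a placeholder for the actual file system name:

    ceph fs set <fs_name> allow_standby_replay true

With this set, one of the standby daemons starts following the active rank's journal so it can take over more quickly.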

The next idea was to do a manual node promotion, a graceful shutdown or something similar, where the open caps and sessions would be handed over ... but I did not find any hint in the docs regarding this functionality. It should somehow be possible (imho), since when adding a second active MDS node (max_mds 2) and then removing it again (max_mds 1), the rank 1 node goes into the stopping state and hands over all clients/caps to rank 0 without interruptions for the clients.
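That add-and-remove behaviour corresponds to something like the following, with <fs_name> again a placeholder:

    ceph fs set <fs_name> max_mds 2    # a standby picks up rank 1
    ceph fs set <fs_name> max_mds 1    # rank 1 goes to up:stopping and migrates its clients/caps back to rank 0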

Therefore my question: how can one gracefully shut down an active rank 0 MDS node, or promote a standby node to the active state, without losing open files/caps or client sessions?

The way to upgrade a cluster, and its current limitations, are described here [1]. The most relevant part for you:

Currently the MDS cluster does not have built-in versioning or file system flags to support seamless upgrades of the MDSs without potentially causing assertions or other faults due to incompatible messages or other functional differences. For this reason, it’s necessary during any cluster upgrade to reduce the number of active MDS for a file system to one first so that two active MDS do not communicate with different versions. Further, it’s also necessary to take standbys offline as any new CompatSet flags will propagate via the MDSMap to all MDS and cause older MDS to suicide.

So best practice is to have only _1_ active MDS, upgrade the software on that last running MDS, and then restart it.
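In practice the procedure from [1] boils down to a sketch like the one below; <fs_name> and the daemon ids are placeholders for your setup:

    # reduce to a single active MDS and disable standby-replay
    ceph fs set <fs_name> max_mds 1
    ceph fs set <fs_name> allow_standby_replay false

    # wait until only rank 0 is left (check with: ceph fs status),
    # then stop the standby daemons
    systemctl stop ceph-mds@<standby-id>

    # upgrade the packages on the host of the last running MDS and restart it
    systemctl restart ceph-mds@<active-id>

    # start the (upgraded) standbys again and restore your settings
    systemctl start ceph-mds@<standby-id>
    ceph fs set <fs_name> max_mds <previous value>

As you observed, clients will still go through the reconnect phase when the restarted MDS becomes active again.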

It would be *really* nice if this could be fixed in a newer version of Ceph. Probably not trivial, but AFAIK this is the only part of Ceph that has a noticeable impact during maintenance (like upgrades). If having this fixed is important to you, make sure you leave a note about it in the upcoming Ceph user survey.

Gr. Stefan

[1]: https://docs.ceph.com/en/latest/cephfs/upgrading/
