Hi,

I am following up on the discussion during the orchestration weekly meeting (22/3/2022) around HA of the nfs-ganesha service deployed by cephadm.

When testing the in-review PRs for redeploying nfs-ganesha daemons from offline hosts, Michael observed that the host going offline was detected quickly and a replacement ganesha daemon was deployed within 2 minutes(?). However, NFS client I/O to the CephFS backend via ganesha took close to 5 minutes(?) to resume. This was possibly because CephFS's single active MDS was co-located with the ganesha daemon on the host that went offline, and the MDS failover was slow.

In the CephFS standup, I asked Jeff whether it is necessary to bring up the ganesha daemon within the NFS grace period of 90 seconds. He said it's not always necessary provided certain conditions are met. I'll let Jeff explain this.

Regarding the MDS failover (i.e., a standby MDS becoming active and taking over) possibly being slow, we could change the default configuration of the MDSes deployed by cephadm when creating a CephFS volume. I think 1 active MDS and 1 standby MDS is the default cephadm deployment for a CephFS volume. We can configure the standby as a standby-replay MDS for quicker failover (see the example command at the end of this message):
https://docs.ceph.com/en/latest/cephfs/standby/#configuring-standby-replay

Replacing the active MDS server on the offline host was also briefly discussed during the orchestrator weekly. I think the current idea is to extend the optimizations added to cephadm for quickly redeploying ganesha daemons from offline hosts to MDS daemons as well. Can the MDS autoscaler mgr plugin, which checks the MDSMap for MDS status, be used for redeploying MDS daemons? (A rough sketch of such a check is appended below.)
https://github.com/ceph/ceph/blob/v17.1.0/src/pybind/mgr/mds_autoscaler/module.py#L57

Thanks,
Ramana
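
P.S. For the standby-replay suggestion, a minimal example of the setting from the docs linked above, assuming the CephFS volume is named "cephfs":

    ceph fs set cephfs allow_standby_replay true

With this enabled, the standby MDS continuously replays the active MDS's journal, so it should be able to take over faster when the active MDS goes away.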
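And here is a rough, untested sketch of the kind of MDS status check I had in mind for the redeploy question. It is just a plain function over the fs_map dictionary that an mgr module can fetch with MgrModule.get('fs_map'); the field names I use (filesystems, mdsmap, info, state, max_mds, fs_name) are my reading of the fs_map JSON and should be double-checked, and the function name is made up for illustration:

    from typing import Any, Dict, List

    def filesystems_missing_active_mds(fs_map: Dict[str, Any]) -> List[str]:
        """Return names of filesystems with fewer active MDS daemons than max_mds."""
        degraded = []
        for fs in fs_map.get('filesystems', []):
            mdsmap = fs.get('mdsmap', {})
            # Count MDS ranks currently in the 'up:active' state.
            active = sum(1 for info in mdsmap.get('info', {}).values()
                         if info.get('state') == 'up:active')
            if active < mdsmap.get('max_mds', 1):
                degraded.append(mdsmap.get('fs_name', 'unknown'))
        return degraded

A module (or the mds_autoscaler itself) could run something like this on every fs_map notification and ask cephadm to redeploy MDS daemons for whichever filesystems it returns.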