nfs-ganesha HA discussion in the orchestration weekly meeting

Hi,

I am following up on the discussion during the orchestration weekly
meeting (22/3/2022) around HA of the nfs-ganesha service deployed by
cephadm. When testing the (in-review) PRs for redeploying nfs-ganesha
daemons from offline hosts, Michael observed that the host going
offline was detected quickly and a replacement ganesha daemon was
deployed within 2 minutes (?). However, NFS client I/O to the CephFS
backend via ganesha took close to 5 minutes (?) to resume. This was
possibly because CephFS's single active MDS was co-located with the
ganesha daemon on the host that went offline, and the MDS failover
was slow.

In the CephFS standup, I asked Jeff whether it was necessary to bring
up the ganesha daemon within the NFS grace period of 90 seconds. He
said it's not always necessary, provided certain conditions are met.
I'll let Jeff explain this.

Regarding the MDS failover (i.e., a standby MDS becoming active and
taking over) being possibly slow, we could change the default
configuration of the MDS daemons deployed by cephadm when creating a
CephFS volume. I think 1 active MDS and 1 standby MDS is the default
cephadm deployment for a CephFS volume. We could configure the standby
to be a standby-replay MDS for quicker failover; see
https://docs.ceph.com/en/latest/cephfs/standby/#configuring-standby-replay
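
For reference, standby-replay is a per-filesystem setting; per the
documentation page above it can be enabled with:

    ceph fs set <fs_name> allow_standby_replay true
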

Replacing the active MDS daemon on the offline host was also briefly
discussed during the orchestrator weekly. I think the current idea is
to extend the cephadm optimizations for quickly redeploying ganesha
daemons from offline hosts to cover MDS daemons as well. Could the
mds_autoscaler mgr module, which checks the MDSMap for MDS status, be
used for redeploying MDS daemons?
https://github.com/ceph/ceph/blob/v17.1.0/src/pybind/mgr/mds_autoscaler/module.py#L57
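
In case it helps the discussion, here is a rough sketch of how a mgr
module can react to FSMap changes, loosely based on my reading of the
linked mds_autoscaler code. The module name MDSWatcher is illustrative
(not an existing module), and the fs_map dictionary keys below are from
memory and may need checking:

    # Rough sketch: log the active MDS count per filesystem whenever
    # the FSMap changes. Not the actual mds_autoscaler code.
    from mgr_module import MgrModule


    class MDSWatcher(MgrModule):
        def notify(self, notify_type, notify_id):
            # ceph-mgr calls notify() when a subscribed map changes
            if notify_type != 'fs_map':
                return
            fs_map = self.get('fs_map')
            for fs in fs_map.get('filesystems', []):
                mdsmap = fs['mdsmap']
                active = [info for info in mdsmap.get('info', {}).values()
                          if info.get('state') == 'up:active']
                self.log.info('fs %s: %d active MDS daemon(s)',
                              mdsmap.get('fs_name'), len(active))
                # A redeploy/scale-up request could be made to the
                # orchestrator here, similar to what mds_autoscaler
                # does through the orchestrator client API.

The real module already wires this notification path up, so reusing it
(or the same pattern) for redeploying MDS daemons from offline hosts
seems plausible to me, but I may be missing constraints.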

Thanks,
Ramana

_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx


