On 05/10/2017 04:18 AM, ML Wong wrote:
While troubleshooting NFS-Ganesha failover, I found that failover always succeeds when I stop the NFS-Ganesha service while the OS is still running. However, it always fails when I do either a "shutdown -r" or a power reset. During the failure the NFS client simply hangs: you cannot run "df" or "ls" on the mount point. The share does eventually fail over to the expected remaining node, usually after 15 - 20 minutes.
The pacemaker/corosync services usually take longer to determine that a node is down than to detect that a service is down. But yes, it shouldn't take more than a couple of minutes.
Could you please check (maybe by querying constantly) how long it takes for the virtual IP to fail over, using either the 'pcs status' or the 'ip a' command? If the IP fails over quickly and it is only the NFS clients that take time to respond, then note that we have added use of the portblock feature to speed up client reconnects after failover. The fixes are available from release-3.9. But before upgrading, I suggest checking whether the delay is in the IP failover or in the client reconnects after failover.
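One way to do the constant querying is a small polling script run on the node that is expected to take over the virtual IP. The following is only a rough sketch, not something shipped with Gluster or NFS-Ganesha: the function names `vip_present` and `watch` are made up for illustration, and it assumes the iproute2 `ip` command is on the PATH. It runs `ip -o addr show` once per second and prints a timestamp whenever the virtual IP appears or disappears, so the gap between the reboot and the "present" line gives the IP failover time.

```python
import subprocess
import sys
import time


def vip_present(ip_output: str, vip: str) -> bool:
    """Return True if `vip` is listed as an address in `ip -o addr show` output."""
    for line in ip_output.splitlines():
        fields = line.split()
        # The address follows the "inet" / "inet6" keyword as ADDRESS/PREFIX.
        for i, token in enumerate(fields[:-1]):
            if token in ("inet", "inet6") and fields[i + 1].split("/")[0] == vip:
                return True
    return False


def watch(vip: str, interval: float = 1.0) -> None:
    """Poll the local addresses and log every change in the VIP's presence."""
    last = None
    while True:
        result = subprocess.run(
            ["ip", "-o", "addr", "show"],
            capture_output=True, text=True, check=True,
        )
        now = vip_present(result.stdout, vip)
        if now != last:
            state = "present" if now else "absent"
            print(f"{time.strftime('%H:%M:%S')} {vip} {state}", flush=True)
            last = now
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) == 2:
    # The virtual IP to watch is passed as the only argument.
    watch(sys.argv[1])
```

Saved under a name of your choice, it is started with the virtual IP as its only argument and stopped with Ctrl-C. If the "present" line shows up within a couple of minutes but the client stays hung much longer, the delay is in the client reconnects rather than in the IP failover.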
Thanks, Soumya
Running on CentOS 7, gluster 3.7.1x, NFS-Ganesha 2.3.0.x. I currently don't have the resources to upgrade, but if all of the experts here think that's the only route, I guess I will have to make a case ... Thanks in advance!
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users