Re: Strange - Missing hostname-trigger_ip-1 resources

ML Wong <wongmlb@xxxxxxxxx> · Mon, 6 Feb 2017 09:13:45 -0800

Thanks, Soumya, for giving me a confirmation on the time for the take-over process. That's very helpful.

The 'showmount' hung on the destination host, even though I could see ganesha-nfsd was running.
On that note, the one out-of-the-ordinary thing I tried was to set up a second set of *cluster_ip resources for the storage nodes using 'pcs'. That is, in addition to the VIPs I have in 'ganesha-ha.conf', I used 'pcs' to create another set of VIPs, intending to use 'constraint colocation' to keep both sets of VIPs and NFSd together. Both sets of VIPs did fail over to the destination node, but ganesha-nfsd hung on the destination node when the source node was brought down. I am guessing this setup confused nfs-ganesha? Is it a no-no to have more than one set of VIPs for NFS-Ganesha in different subnets?

Any suggestions and advice will be appreciated.
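
For anyone trying to reproduce the hang, here is a minimal sketch of a probe that runs a command with a timeout instead of blocking the terminal indefinitely. The function name `probe` and the 10-second default are illustrative choices of mine, not part of nfs-ganesha or the HA scripts; `showmount -e localhost` is the command from this thread. It assumes the stuck process can actually be killed once the timeout expires.

```python
import subprocess


def probe(cmd, timeout_s=10.0):
    """Run cmd and classify the result.

    Returns (status, output), where status is one of
    'ok', 'failed', 'hung' (timed out) or 'missing' (command not found).
    `probe` is a hypothetical helper, not part of any gluster/ganesha tooling.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hung", ""
    except FileNotFoundError:
        return "missing", ""
    status = "ok" if result.returncode == 0 else "failed"
    return status, result.stdout


if __name__ == "__main__":
    # The command from this thread; run it on the destination node after failover.
    status, output = probe(["showmount", "-e", "localhost"])
    print(status)
    print(output, end="")
```

Running this in a loop during a failover test would give a rough picture of how long the export list stays unavailable, which can then be compared against the expected failover window.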

On Sun, Feb 5, 2017 at 11:50 PM, Soumya Koduri <skoduri@xxxxxxxxxx> wrote:

On 02/04/2017 06:20 AM, ML Wong wrote:

Thanks so much for your prompt response, Soumya.

That helps clear up one of my questions. I am trying to figure out why the NFS service did not fail over and pick up the NFS clients last time, when one of our cluster nodes failed.

In corosync.log I could see that a notification about the failed node was sent to the cluster, and the election and the IP failover all seem to have finished within around a minute. However, after the IP failed over to the destination node, I ran "showmount -e localhost" and the command hung, even though I could still see ganesha-nfsd running on the host.

Is it on the destination node? I mean, does 'showmount -e localhost' hang on the destination node after IP failover?

From your experience, and if I understand the process correctly: given that I keep all the default timeout/interval settings for nfs-mon and nfs-grace, the entire IP failover and NFS service failover process should complete within 2 minutes. Am I correct?

That's right.

Thanks,

Soumya

Your help is again appreciated.

On Thu, Feb 2, 2017 at 11:42 PM, Soumya Koduri <skoduri@xxxxxxxxxx> wrote:

    Hi,

    On 02/03/2017 07:52 AM, ML Wong wrote:

        Hello All,

        Any pointers will be very much appreciated. Thanks in advance!

        Environment:
        Running CentOS 7.2.1511
        Gluster: 3.7.16, with nfs-ganesha 2.3.0.1 from the centos-gluster37 repo
        sha1: cab5df4064e3a31d1d92786d91bd41d91517fba8  ganesha-ha.sh

        We have used this setup in 3 different gluster/nfs-ganesha environments. The cluster gets set up when we run 'gluster nfs-ganesha enable', and we can serve NFS without issues. I see all the resources get created, except the *hostname*-trigger_ip-1 resources. Is that normal?

    Yes, it is normal. With change [1], new resource agent attributes have been introduced in place of *-trigger_ip-1 to monitor and move the VIP and to put the cluster in grace. More details are in the change's commit message.

    Thanks,

    Soumya

    [1] https://github.com/gluster/glusterfs/commit/e8121c4afb3680f532b450872b5a3ffcb3766a97

        Without *hostname*-trigger_ip-1, according to ganesha-ha.sh, wouldn't that affect NFS going into grace and the transition of the NFS service to other member nodes when a node fails? Please correct me if I misunderstood.

        I tried both 'gluster nfs-ganesha enable' and 'bash -x /usr/libexec/ganesha/ganesha-ha.sh --setup'. In both scenarios, I still don't see the *hostname*-trigger_ip-1 resources get created.

        Below is my ganesha-ha.conf:

        HA_NAME="ganesha-ha-01"
        HA_VOL_SERVER="vm-fusion1"
        HA_CLUSTER_NODES="vm-fusion1,vm-fusion3"
        VIP_vm-fusion1="192.168.30.211"
        VIP_vm-fusion3="192.168.30.213"
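
A minimal sketch of a sanity check for a ganesha-ha.conf like the one above: it parses the KEY="value" lines and reports any node listed in HA_CLUSTER_NODES that has no matching VIP_<node> entry. The helper names `parse_ha_conf` and `nodes_missing_vip` are hypothetical, and the parser assumes simple one-per-line assignments; this is not how ganesha-ha.sh itself reads the file.

```python
def parse_ha_conf(text):
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def nodes_missing_vip(settings):
    """Return the cluster nodes that have no VIP_<node> entry."""
    raw = settings.get("HA_CLUSTER_NODES", "")
    nodes = [name.strip() for name in raw.split(",") if name.strip()]
    return [name for name in nodes if "VIP_" + name not in settings]


# The configuration quoted in this thread.
EXAMPLE = """
HA_NAME="ganesha-ha-01"
HA_VOL_SERVER="vm-fusion1"
HA_CLUSTER_NODES="vm-fusion1,vm-fusion3"
VIP_vm-fusion1="192.168.30.211"
VIP_vm-fusion3="192.168.30.213"
"""

if __name__ == "__main__":
    print(nodes_missing_vip(parse_ha_conf(EXAMPLE)))  # prints []
```

For the file above the list is empty, so every cluster node has a VIP defined. A check like this only catches a missing entry; it says nothing about whether the pacemaker resources were actually created.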

        _______________________________________________
        Gluster-users mailing list
        Gluster-users@xxxxxxxxxxx
        http://lists.gluster.org/mailman/listinfo/gluster-users
