Re: Mount problems when secondary node down

A F <alex@xxxxxxxxxx> · Mon, 17 Nov 2014 11:57:30 +0000



    Hi,

    
    So, can anyone try and reproduce this problem? I've downgraded to
    v3.5.2, which I'm using in prod, and I get the same behavior.

    Steps to reproduce:

    1. probe server2, create and start volume

    2. do not mount volume

    3. reboot/poweroff server2, or add an iptables rule on server2 that
    drops traffic from server1 (with -j DROP, not -j REJECT)

    4. on server1 (while server2 is rebooting or dropping traffic from
    server1): time mount -t glusterfs server1:/volume /some/path
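
The steps above can be sketched as a dry-run script that only prints each command together with the host it is meant to run on. The host names, volume name, brick path, and mount point are placeholders (assumptions, overridable through environment variables), and the replica 2 layout is borrowed from the original post quoted below rather than stated in these steps.

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps: prints commands, runs nothing.
# All values below are placeholders (assumptions), not taken from a real setup.
SERVER1=${SERVER1:-server1}
SERVER2=${SERVER2:-server2}
VOLUME=${VOLUME:-volume}
BRICK=${BRICK:-/bricks/volume}        # hypothetical brick path
MOUNTPOINT=${MOUNTPOINT:-/some/path}

# Print one line per step, prefixed with the host the command belongs to.
repro_steps() {
    echo "[$SERVER1] gluster peer probe $SERVER2"
    echo "[$SERVER1] gluster volume create $VOLUME replica 2 $SERVER1:$BRICK $SERVER2:$BRICK"
    echo "[$SERVER1] gluster volume start $VOLUME"
    # Step 3, firewall variant: silently drop everything coming from server1.
    echo "[$SERVER2] iptables -I INPUT -s $SERVER1 -j DROP"
    # Step 4: time the mount while server2 is unreachable.
    echo "[$SERVER1] time mount -t glusterfs $SERVER1:/$VOLUME $MOUNTPOINT"
}

repro_steps
```

Printing instead of executing keeps the sketch safe to run anywhere; the printed lines can then be pasted on the matching host.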

    
    PS: With -j REJECT it mounts instantly. With -j DROP it always waits
    2 min 7 s.
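
One plausible reading of the 2 minute 7 second wait with -j DROP (an assumption, not confirmed anywhere in this thread): a dropped SYN is never answered, so the kernel keeps retransmitting it with exponential backoff until it gives up, whereas REJECT makes the connection attempt fail at once. Assuming an initial 1 s timeout that doubles each time and net.ipv4.tcp_syn_retries = 6, the total comes to 127 s. The helper name below is hypothetical.

```shell
#!/bin/sh
# Total time a connect() waits when every SYN is silently dropped,
# assuming an initial 1 s timeout that doubles after each retransmission.
# $1 = number of SYN retries (the value of net.ipv4.tcp_syn_retries).
syn_wait_seconds() {
    retries=$1
    timeout=1
    total=0
    i=0
    # One wait for the initial SYN plus one per retry.
    while [ "$i" -le "$retries" ]; do
        total=$((total + timeout))
        timeout=$((timeout * 2))
        i=$((i + 1))
    done
    echo "$total"
}

syn_wait_seconds 6   # 1+2+4+8+16+32+64 = 127 seconds, i.e. 2m7s
```

The value in effect on a given host can be read with `sysctl net.ipv4.tcp_syn_retries`. If it differs from 6, this explanation would predict a different wait.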

    Thanks!

     
    On 11/11/2014 01:19, Pranith Kumar Karampuri wrote:

    
      On 11/10/2014 11:47 PM, A F wrote:

      
      Hello,

        
        I have two servers, 192.168.0.10 and 192.168.2.10. I'm using
        gluster 3.6.1 (installed from gluster repo) on AWS Linux. Both
        servers are completely reachable in LAN. 

        # rpm -qa|grep gluster 

        glusterfs-3.6.1-1.el6.x86_64 

        glusterfs-server-3.6.1-1.el6.x86_64 

        glusterfs-libs-3.6.1-1.el6.x86_64 

        glusterfs-api-3.6.1-1.el6.x86_64 

        glusterfs-cli-3.6.1-1.el6.x86_64 

        glusterfs-fuse-3.6.1-1.el6.x86_64 

        
        These are the commands I ran: 

        # gluster peer probe 192.168.2.10 

        # gluster volume create aloha replica 2 transport tcp 192.168.0.10:/var/aloha 192.168.2.10:/var/aloha force

        # gluster volume start aloha 

        # gluster volume set aloha network.ping-timeout 5 

        # gluster volume set aloha nfs.disable on 

        
        Problem number 1: 

        tail -f /var/log/glusterfs/etc-glusterfs-glusterd.vol.log shows
        log cluttering with: 

        [2014-11-10 17:41:26.328796] W [socket.c:611:__socket_rwv] 0-management: readv on /var/run/38c520c774793c9cdae8ace327512027.socket failed (Invalid argument)

        This happens every 3 seconds on both servers. It is related to
        NFS and probably rpcbind, but I absolutely want both disabled.
        As you can see, I've set nfs.disable to on for the volume, so
        why doesn't gluster keep quiet about it?
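
To gauge how much of the log this warning takes up, a small filter can count the failures per socket path. This is only a sketch: the function name is hypothetical, and the pattern assumes the message layout of the sample line above.

```shell
#!/bin/sh
# Count "readv on <socket> failed" warnings per socket path.
# Reads log text on stdin; the pattern is based on the sample line in this post.
count_readv_failures() {
    grep 'readv on' \
        | sed -n 's/.*readv on \([^ ]*\) failed.*/\1/p' \
        | sort \
        | uniq -c
}

# Example, using the log path mentioned in this post:
# count_readv_failures < /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
```

Each output line is a count followed by the socket path, which makes it easy to see whether a single socket accounts for all of the noise.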

        
        Problem number 2: 

        In fstab on server 192.168.0.10:

        192.168.0.10:/aloha  /var/www/hawaii      glusterfs       defaults,_netdev        0 0

        In fstab on server 192.168.2.10:

        192.168.2.10:/aloha  /var/www/hawaii      glusterfs       defaults,_netdev        0 0
        

        If I shut down one of the servers (192.168.2.10) and reboot
        the remaining one (192.168.0.10), it doesn't come up as fast as
        it should. It lags a few minutes waiting for gluster. After it
        eventually starts, the mount point is not mounted and the volume
        is stopped:

        # gluster volume status 

        Status of volume: aloha
        Gluster process                                         Port    Online  Pid
        ------------------------------------------------------------------------------
        Brick 192.168.0.10:/var/aloha                           N/A     N       N/A
        Self-heal Daemon on localhost                           N/A     N       N/A

        Task Status of Volume aloha
        ------------------------------------------------------------------------------
        There are no active volume tasks

        
        This didn't happen before, so fine, I first have to stop the
        volume and then start it again. It now shows as online: 

        Brick 192.168.0.10:/var/aloha                           49155   Y       3473
        Self-heal Daemon on localhost                           N/A     Y       3507

        
        # time mount -a 

        real    2m7.307s 

        
        # time mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii 

        real    2m7.365s 

        
        # strace mount -t glusterfs 192.168.0.10:/aloha /var/www/hawaii
        

        (attached) 

        
        # tail /var/log/glusterfs/* -f|grep -v readv 

        (attached) 

        
        I've done this setup before, so I'm amazed it doesn't work. I
        even have it in production at the moment, with the same options
        and setup, and there, for example, I'm not getting the readv
        errors. I can't test the mount part in production, though I
        believe I covered it back when I was testing the environment.

        Any help is kindly appreciated. 

      
      CC glusterd folks

      
      Pranith

      
        _______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users
      
      