On 01/22/2015 02:07 PM, A Ghoshal wrote:

Hi Pranith,

Yes, the very same (chalcogen_eg_oxygen@xxxxxxxxx). Justin Clift sent me a mail a while back telling me that it is better if we all use our business email addresses, so I made myself a new profile.

Glusterfs complains about /proc/sys/net/ipv4/ip_local_reserved_ports because we use a really old Linux kernel (2.6.34) in which this feature is not present. We keep planning to upgrade our Linux, but each time we are dissuaded by some compatibility issue or other. So we get this log every time, on both good volumes and bad ones. What bothered me was this (on serv1):
Basically, to make connections to the servers (i.e., the bricks), clients need to choose secure ports, i.e., ports less than 1024. Since this file is not present, the client is not binding to any port, as per the code I just checked. There is an option called client-bind-insecure which bypasses this check. I feel that is one way (probably the only way) to get around this. You have to set the "server.allow-insecure on" volume option along with the bind-insecure option; see the sketch below.

CC ndevos, who seems to have helped someone set the bind-insecure option correctly here: http://irclog.perlgeek.de/gluster/2014-04-09/text
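A rough sketch of that setup on 3.4.2 (volume name and mount point taken from this thread; how exactly the client-side bind-insecure option is passed varies between releases, so treat this as an outline rather than a verified recipe):

# Let the bricks accept connections from non-privileged (>1024) client ports
gluster volume set replicated_vol server.allow-insecure on

# Let glusterd itself accept insecure connections: add the line below to
# /etc/glusterfs/glusterd.vol on every server, then restart glusterd
#   option rpc-auth-allow-insecure on

# Make the FUSE client bind to a non-privileged port. One way is to pass the
# protocol/client option when starting the mount process (option name from the
# discussion above; whether 3.4.2 honours it this way is an assumption):
glusterfs --volfile-server=serv0 --volfile-id=replicated_vol \
    --xlator-option replicated_vol-client-0.client-bind-insecure=on \
    --xlator-option replicated_vol-client-1.client-bind-insecure=on \
    /mnt/replicated_vol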
Pranith
[2015-01-20 09:37:49.151744] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151824] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
When I write on the good server (serv1), we see that an RPC request is sent to both client-0 and client-1, whereas when I write on the bad server (serv0), the RPC request is sent only to client-0, so it is no wonder that the writes are not synced over to serv1. Somehow I could not make the daemon on serv0 understand that there are two up-children and not just one.
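For reference, one rough way to confirm which bricks a client process is actually connected to (a sketch only, assuming the 3.4 CLI's 'clients' status subcommand and the default log name for a mount at /mnt/replicated_vol):

# As seen by the bricks: which clients are connected to each brick
gluster volume status replicated_vol clients

# As seen by the FUSE client on serv0: connect/disconnect history of each
# protocol/client subvolume in the mount log
grep -iE "replicated_vol-client-[01].*(connect|disconnect)" \
    /var/log/glusterfs/mnt-replicated_vol.log | tail -n 20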
One additional detail: since we are using a kernel that is too old, we do not have the (Anand Avati's?) FUSE readdirplus patches, either. I've noticed that the fixes in the readdirplus version of glusterfs aren't always guaranteed to be present in the non-readdirplus version of the patches. I'd filed a bug around one such anomaly a while back, but never got around to writing a patch for it (sorry!). Here it is: https://bugzilla.redhat.com/show_bug.cgi?id=1062287
I don't think this has anything to do with readdirplus.
Maybe something along similar lines here?
Thanks,
Anirban

P.s. Please ignore the #Personal# in the subject line - we need to do that to push mails to the public domain past the email filter safely.
From: Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>
To: A Ghoshal <a.ghoshal@xxxxxxx>, gluster-users@xxxxxxxxxxx
Date: 01/22/2015 12:09 AM
Subject: Re: In a replica 2 server, file-updates on one server missing on the other server
Hi,
Responses inline.

PS: You are chalkogen_oxygen?

Pranith
On 01/20/2015 05:34 PM, A Ghoshal wrote:
Hello,
I am using the following replicated volume:
root@serv0:~> gluster v info replicated_vol
Volume Name: replicated_vol
Type: Replicate
Volume ID: 26d111e3-7e4c-479e-9355-91635ab7f1c2
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: serv0:/mnt/bricks/replicated_vol/brick
Brick2: serv1:/mnt/bricks/replicated_vol/brick
Options Reconfigured:
diagnostics.client-log-level: INFO
network.ping-timeout: 10
nfs.enable-ino32: on
cluster.self-heal-daemon: on
nfs.disable: off
replicated_vol is mounted at /mnt/replicated_vol on both serv0 and serv1.

If I do the following on serv0:

root@serv0:~>echo "cranberries" > /mnt/replicated_vol/testfile
root@serv0:~>echo "tangerines" >> /mnt/replicated_vol/testfile

and then check the state of the replicas on the bricks, I find that
root@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
root@serv0:~>
root@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
root@serv1:~>
As may be seen, the replica on serv1 is blank when I write into testfile from serv0 (even though the file is created on both bricks). Interestingly, if I write something to the file at serv1, the two replicas become identical.
root@serv1:~>echo "artichokes" >> /mnt/replicated_vol/testfile
root@serv1:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
root@serv1:~>
root@serv0:~>cat /mnt/bricks/replicated_vol/brick/testfile
cranberries
tangerines
artichokes
root@serv0:~>
So, I dabbled in the logs a little bit after upping the diagnostic level, and this is what I saw:
When I write on serv0 (bad case):
[2015-01-20 09:21:52.197704] T [fuse-bridge.c:546:fuse_lookup_resume] 0-glusterfs-fuse: 53027: LOOKUP /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:21:52.197959] D [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:21:52.198006] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-0: Auth Info: pid: 28151, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:21:52.198024] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:21:52.198108] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x78163x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198565] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x78163x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
[2015-01-20 09:21:52.198640] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:21:52.198669] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:21:52.198681] D [afr-self-heal-common.c:887:afr_mark_sources] 0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:21:52.198694] D [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:21:52.198705] D [afr-common.c:1380:afr_lookup_select_read_child] 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:21:52.198720] D [afr-common.c:1117:afr_lookup_build_response_params] 0-replicated_vol-replicate-0: Building lookup response from 0
[2015-01-20 09:21:52.198732] D [afr-common.c:1732:afr_lookup_perform_self_heal] 0-replicated_vol-replicate-0: Only 1 child up - do not attempt to detect self heal
When I write on serv1 (good case):
[2015-01-20 09:37:49.151506] T [fuse-bridge.c:546:fuse_lookup_resume] 0-glusterfs-fuse: 31212: LOOKUP /testfl(f0a76987-8a42-47a2-b027-a823254b736b)
[2015-01-20 09:37:49.151683] D [afr-common.c:131:afr_lookup_xattr_req_prepare] 0-replicated_vol-replicate-0: /testfl: failed to get the gfid from dict
[2015-01-20 09:37:49.151726] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-0: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151744] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151780] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.151810] T [rpc-clnt.c:1302:rpc_clnt_record] 0-replicated_vol-client-1: Auth Info: pid: 7599, uid: 0, gid: 0, owner: 0000000000000000
[2015-01-20 09:37:49.151824] T [rpc-clnt.c:1182:rpc_clnt_record_build_header] 0-rpc-clnt: Request fraglen 456, payload: 360, rpc hdr: 96
[2015-01-20 09:37:49.151889] T [rpc-clnt.c:1499:rpc_clnt_submit] 0-rpc-clnt: submitted request (XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) to rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152239] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-1: received rpc message (RPC XID: 0x39563x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-1)
[2015-01-20 09:37:49.152484] T [rpc-clnt.c:669:rpc_clnt_reply_init] 0-replicated_vol-client-0: received rpc message (RPC XID: 0x39620x Program: GlusterFS 3.3, ProgVers: 330, Proc: 27) from rpc-transport (replicated_vol-client-0)
[2015-01-20 09:37:49.152582] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 3 ]
[2015-01-20 09:37:49.152596] D [afr-self-heal-common.c:138:afr_sh_print_pending_matrix] 0-replicated_vol-replicate-0: pending_matrix: [ 0 0 ]
[2015-01-20 09:37:49.152621] D [afr-self-heal-common.c:887:afr_mark_sources] 0-replicated_vol-replicate-0: Number of sources: 1
[2015-01-20 09:37:49.152633] D [afr-self-heal-data.c:825:afr_lookup_select_read_child_by_txn_type] 0-replicated_vol-replicate-0: returning read_child: 0
[2015-01-20 09:37:49.152644] D [afr-common.c:1380:afr_lookup_select_read_child] 0-replicated_vol-replicate-0: Source selected as 0 for /testfl
[2015-01-20 09:37:49.152657] D [afr-common.c:1117:afr_lookup_build_response_params] 0-replicated_vol-replicate-0: Building lookup response from 0
We see that when we write on serv1, the RPC request is sent to both replicated_vol-client-0 and replicated_vol-client-1, while when we write on serv0, the request is sent only to replicated_vol-client-0; the FUSE client is unaware of the presence of client-1 in the latter case.
I checked a bit more in the logs. When I turned on trace, I found many instances of these logs on serv0 but NOT on serv1:
[2015-01-20 09:21:15.520784] T [fuse-bridge.c:681:fuse_attr_cbk] 0-glusterfs-fuse: 53011: LOOKUP() / => 1
[2015-01-20 09:21:17.683088] T [rpc-clnt.c:422:rpc_clnt_reconnect] 0-replicated_vol-client-1: attempting reconnect
[2015-01-20 09:21:17.683159] D [name.c:155:client_fill_address_family] 0-replicated_vol-client-1: address-family not specified, guessing it to be inet from (remote-host: serv1)
[2015-01-20 09:21:17.683178] T [name.c:225:af_inet_client_get_remote_sockaddr] 0-replicated_vol-client-1: option remote-port missing in volume replicated_vol-client-1. Defaulting to 24007
[2015-01-20 09:21:17.683191] T [common-utils.c:188:gf_resolve_ip6] 0-resolver: flushing DNS cache
[2015-01-20 09:21:17.683202] T [common-utils.c:195:gf_resolve_ip6] 0-resolver: DNS cache not present, freshly probing hostname: serv1
[2015-01-20 09:21:17.683814] D [common-utils.c:237:gf_resolve_ip6] 0-resolver: returning ip-192.168.24.81 (port-24007) for hostname: serv1 and port: 24007
[2015-01-20 09:21:17.684139] D [common-utils.c:257:gf_resolve_ip6] 0-resolver: next DNS query will return: ip-192.168.24.81 port-24007
[2015-01-20 09:21:17.684164] T [socket.c:731:__socket_nodelay] 0-replicated_vol-client-1: NODELAY enabled for socket 10
[2015-01-20 09:21:17.684177] T [socket.c:790:__socket_keepalive] 0-replicated_vol-client-1: Keep-alive enabled for socket 10, interval 2, idle: 20
[2015-01-20 09:21:17.684236] W [common-utils.c:2247:gf_get_reserved_ports] 0-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info (No such file or directory)
[2015-01-20 09:21:17.684253] W [common-utils.c:2280:gf_process_reserved_ports] 0-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
The logs above suggest that the mount process couldn't assign a reserved port because it couldn't find the file /proc/sys/net/ipv4/ip_local_reserved_ports. I guess a reboot of the machine fixed it. I wonder why it was not found in the first place.

Pranith.
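For reference, the ip_local_reserved_ports sysctl only exists on newer kernels (it was added around 2.6.35, so a 2.6.34 kernel will not have the file at all, which is consistent with the warning above). On a kernel that does have it, the check would look something like:

# Does this kernel expose the reserved-ports list at all?
ls -l /proc/sys/net/ipv4/ip_local_reserved_ports

# If so, show the currently reserved ports (empty by default) and, if desired,
# reserve a range so that other processes stay off it:
sysctl net.ipv4.ip_local_reserved_ports
sysctl -w net.ipv4.ip_local_reserved_ports=49152-49156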
[2015-01-20 09:21:17.684660] D [socket.c:605:__socket_shutdown] 0-replicated_vol-client-1: shutdown() returned -1. Transport endpoint is not connected
[2015-01-20 09:21:17.684699] T [rpc-clnt.c:519:rpc_clnt_connection_cleanup] 0-replicated_vol-client-1: cleaning up state in transport object 0x68a630
[2015-01-20 09:21:17.684731] D [socket.c:486:__socket_rwv] 0-replicated_vol-client-1: EOF on socket
[2015-01-20 09:21:17.684750] W [socket.c:514:__socket_rwv] 0-replicated_vol-client-1: readv failed (No data available)
[2015-01-20 09:21:17.684766] D [socket.c:1962:__socket_proto_state_machine] 0-replicated_vol-client-1: reading from socket failed. Error (No data available), peer (192.168.24.81:49198)
I could not find a 'remote-port' option in /var/lib/glusterd on either peer. Could somebody tell me where this configuration is read from?

Also, some time later I rebooted serv0, and that seemed to solve the problem. However, a stop+start of replicated_vol and a restart of /etc/init.d/glusterd did NOT solve the problem.
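For what it's worth, a glusterd restart does not restart the FUSE client process behind an existing mount, and a volume stop/start only restarts the brick-side processes, so if the client graph itself is stuck, remounting the volume on serv0 should be the lighter-weight equivalent of what the reboot did. A sketch, using the paths from this thread:

# On serv0: tear down and re-establish the FUSE client for this volume
umount /mnt/replicated_vol
mount -t glusterfs serv0:/replicated_vol /mnt/replicated_vol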
Ignore that log. If no port is given in that volfile, it picks 24007 as the port, which is the default port where glusterd 'listens'.
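If you want to see where that comes from, the client volfiles generated by glusterd live under /var/lib/glusterd/vols/<volname>/ (exact file names vary a bit by release); each protocol/client subvolume there carries a remote-host option, and remote-port is simply absent, hence the 24007 default. For example:

# Show the protocol/client subvolumes of the generated client volfile
grep -A 4 "type protocol/client" \
    /var/lib/glusterd/vols/replicated_vol/replicated_vol-fuse.vol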
Any help on this matter will be greatly appreciated, as I need to provide robustness assurances for our setup.

Thanks a lot,
Anirban
P.s. Additional details:
glusterfs version: 3.4.2
Linux kernel version: 2.6.34