On 02/05/2016 08:45 AM, songxin wrote:
Hi,
I use glusterfs (version 3.7.6) in replicate mode to sync between two boards in a node.
When one of the boards is locked, replaced with a new board, and restarted, we see that sync is lost between the two boards. The mounted glusterfs volume is not present on the replaced board.
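For context, a two-board replica 2 setup of this kind is normally created roughly as below; the brick paths are the ones from this volume, while the mount point is only illustrative:

# gluster peer probe 192.32.1.144
# gluster volume create c_glusterfs replica 2 \
      192.32.0.48:/opt/lvmdir/c2/brick 192.32.1.144:/opt/lvmdir/c2/brick force
      (force is only needed if the brick directory sits on the root filesystem)
# gluster volume start c_glusterfs
# mount -t glusterfs 127.0.0.1:/c_glusterfs /mnt/c_glusterfs   (example mount point)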
The output of some gluster commands on the replaced board is as below.
002500> gluster volume status c_glusterfs
Status of volume: c_glusterfs
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.32.0.48:/opt/lvmdir/c2/brick      49240     0          Y       1293

Task Status of Volume c_glusterfs
------------------------------------------------------------------------------
There are no active volume tasks
002500> gluster volume info

Volume Name: c_glusterfs
Type: Distribute
Volume ID: 3625f7ff-2b92-4ac4-9967-7abf966eceef
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 192.32.0.48:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on
In the status output we don't see the brick process of the replaced board. The brick process for 192.32.0.48:/opt/lvmdir/c2/brick belongs to the other board, the one that was not replaced.
The output of the commands on the other board is:
# gluster volume info

Volume Name: c_glusterfs
Type: Distribute
Volume ID: 3625f7ff-2b92-4ac4-9967-7abf966eceef
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 192.32.0.48:/opt/lvmdir/c2/brick
Options Reconfigured:
performance.readdir-ahead: on
network.ping-timeout: 4
nfs.disable: on
# gluster peer status
Number of Peers: 2

Hostname: 192.32.1.144
Uuid: bbe2a458-ad3d-406d-b233-b6027c12174e
State: Peer in Cluster (Connected)

Hostname: 192.32.1.144
Uuid: bbe2a458-ad3d-406d-b233-b6027c12174e
State: Peer in Cluster (Connected)
gluster peer status shows the same host twice, the brick process of that host is missing from the volume info, and the command gluster volume status c_glusterfs hangs.
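One way to inspect the duplicate-peer state (assuming the default glusterd working directory /var/lib/glusterd) is to compare the local UUID and the stored peer entries on each board:

# cat /var/lib/glusterd/glusterd.info     (UUID of the local glusterd)
# ls /var/lib/glusterd/peers/             (one file per known peer, named by the peer's UUID)
# gluster system:: uuid get               (should print the same local UUID)

If the replaced board came back with a copy of the old board's glusterd identity, or a stale peer file survived the replacement, the same host can show up twice as above.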
From the gluster logs at /var/log/glusterfs we observed some errors.
cmd_history.log:
volume add-brick c_glusterfs replica 2 192.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Locking failed on 192.32.1.144. Please check log file for details.

cli.log:
[2016-01-30 04:32:40.179381] I [cli.c:721:main] 0-cli: Started running gluster with version 3.7.6
[2016-01-30 04:32:40.191715] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2016-01-30 04:32:40.193246] I [socket.c:2355:socket_event_handler] 0-transport: disconnecting now
[2016-01-30 04:32:40.196551] I [cli-rpc-ops.c:2465:gf_cli_add_brick_cbk] 0-cli: Received resp to add brick
[2016-01-30 04:32:40.196684] I [input.c:36:cli_batch] 0-: Exiting with: -1
Can anyone help me analyze the reason?
I just replied on the bug you raised, but this mail has more info. It seems like the volume is a Distribute volume, which means it doesn't sync. I also don't understand how you ended up in a situation where two of the peers have the same UUID and hostname. What steps did you take to get into this situation? Which two bricks do you want to be in sync? Maybe we can help once you give us this information.
Pranith
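For reference, once the peer state is consistent again, converting the remaining one-brick Distribute volume back into a two-brick replica is normally done with an add-brick that raises the replica count; this is essentially the command that failed in cmd_history.log above. If the "Locking failed" error persists, restarting glusterd on the peer named in the error is a common first step (assuming glusterd runs as an ordinary service on these boards):

# systemctl restart glusterd              (or: /etc/init.d/glusterd restart on sysvinit boards)
# gluster volume add-brick c_glusterfs replica 2 192.32.1.144:/opt/lvmdir/c2/brick force
# gluster volume info c_glusterfs         (Type should then read Replicate instead of Distribute)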
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users