Re: One node goes offline, the other node loses its connection to its local Gluster volume

Joe Julian <joe@xxxxxxxxxxxxxxxx> · Tue, 11 Mar 2014 11:39:49 -0700



    On 03/06/2014 04:48 PM, Greg Scott wrote:

    
        In your real-life concern, the interconnect would not interfere with the
        existence of either machine's IP address, so after the ping-timeout,
        operations would resume in a split-brain configuration. As long as no
        changes were made to the same file on both volumes, when the connection
        is reestablished, the self-heal will do exactly what you expect.

      
      Except that's not what happens.  If I ifdown that interconnect NIC, I should see the file system after 42 seconds, right?

    
    No.

    
    Let's take a look at an imaginary volume:

    
    # gluster volume info foo
      Volume Name: foo
      Type: Replicate
      Volume ID: f8577cab-9ea9-411f-9b85-97c93b1ba7df
      Status: Started
      Number of Bricks: 1 x 2 = 2
      Transport-type: tcp
      Bricks:
      Brick1: server1:/mnt/1/brick
      Brick2: server2:/mnt/1/brick

    # ping -c1 server1 | grep server1
      PING server1.domain.dom (192.168.0.1) 56(84) bytes of data.
    # ping -c1 server2 | grep server2
      PING server2.domain.dom (192.168.0.2) 56(84) bytes of data.

      
    Each server mounts its volume from localhost using an fstab entry
    like "localhost:foo /mnt/foo glusterfs _netdev 0 0".

    
    What this actually does is contact glusterd on port 24007 at
    localhost and ask for the volume definition for foo. Upon receiving
    that, the client connects directly to the brick servers, at the IP
    address each hostname resolves to and on whatever port each brick
    has been assigned. In this scenario, the client will connect to both
    server1 and server2, at 192.168.0.1 and 192.168.0.2 respectively.
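
To see which addresses a client on a given server would end up connecting to, something like the following rough sketch may help. The function names brick_hosts and show_brick_addresses are made up for illustration, and the parsing assumes the "BrickN: host:/path" layout shown in the volume info output above (plain hostnames or IPv4 addresses, no IPv6 literals).

```shell
# Print the host part of every "BrickN: host:/path" line read on stdin.
# Lines such as "Bricks:" or "Number of Bricks: ..." are ignored.
brick_hosts() {
    sed -n 's/^[[:space:]]*Brick[0-9][0-9]*:[[:space:]]*\([^:]*\):.*$/\1/p'
}

# Resolve each brick host of the named volume with the system resolver.
# Needs a running glusterd, so it is only defined here, not called.
show_brick_addresses() {
    gluster volume info "$1" | brick_hosts | sort -u | while read -r host; do
        getent hosts "$host"
    done
}
```

Running "show_brick_addresses foo" on server1 should list the address each brick hostname resolves to there. If the local brick's hostname resolves to the address on the interconnect, that is the address the local mount depends on.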

    
    Now, on server1 we down the interface. 192.168.0.1 no longer
      exists! The route to 192.168.0.2 no longer exists. The client
      can now connect to neither server.

    
    This is different from someone pulling a plug. If someone pulls the
    plug, 192.168.0.1 will still exist! The client will still be
    able to access the mounted volume through that address even though
    it can no longer reach the replica at 192.168.0.2.
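
One way to tell the two cases apart on server1 is to check whether the brick address is still assigned to any local interface. This is only a sketch: has_addr is a made-up helper name, and it assumes the one-line-per-address output of "ip -o -4 addr show" from iproute2.

```shell
# Succeed if the IPv4 address given as $1 appears as an "inet" entry
# in `ip -o -4 addr show` output read on stdin; fail otherwise.
has_addr() {
    awk -v want="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "inet") {
                    split($(i + 1), a, "/")   # strip the /prefix length
                    if (a[1] == want) found = 1
                }
        }
        END { exit found ? 0 : 1 }'
}
```

Usage, on server1:

    ip -o -4 addr show | has_addr 192.168.0.1 && echo "address still assigned"

After pulling the cable the check should still succeed, since the address stays assigned to the interface. After an ifdown it should fail, which is the case where the client can reach neither brick.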

    
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users