For clarity, this is the "no return" I mean from 'gluster volume heal <volname> info':
# gluster volume heal vm-storage info
Brick 10.0.231.50:/mnt/lv-vm-storage/vm-storage
Number of entries: 0
Brick 10.0.231.51:/mnt/lv-vm-storage/vm-storage
Number of entries: 0
Brick 10.0.231.52:/mnt/lv-vm-storage/vm-storage
Number of entries: 0
On Thu, Feb 25, 2016 at 12:02 PM, Steve Dainard <sdainard@xxxxxxxx> wrote:
I haven't done anything more than peer thus far, so I'm a bit confused as to how the volume info fits in; can you expand on this a bit?
Failed commits? Is this split brain on the replica volumes? I don't get any return from 'gluster volume heal <volname> info' on any of the replica volumes, but if I try 'gluster volume heal <volname> full' I get: 'Launching heal operation to perform full self heal on volume <volname> has been unsuccessful'.
I have 5 volumes total. 'Replica 3' volumes running on gluster01/02/03:
vm-storage
iso-storage
export-domain-storage
env-modules
And one distribute-only volume, 'storage', whose info file is shown below.
From existing hosts gluster01/02:
type=0
count=4
status=1
sub_count=0
stripe_count=1
replica_count=1
disperse_count=0
redundancy_count=0
version=25
transport-type=0
volume-id=26d355cb-c486-481f-ac16-e25390e73775
username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
password=
op-version=3
client-op-version=3
quota-version=1
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
features.quota-deem-statfs=on
features.inode-quota=on
diagnostics.brick-log-level=WARNING
features.quota=on
performance.readdir-ahead=on
performance.cache-size=1GB
performance.stat-prefetch=on
brick-0=10.0.231.50:-mnt-raid6-storage-storage
brick-1=10.0.231.51:-mnt-raid6-storage-storage
brick-2=10.0.231.52:-mnt-raid6-storage-storage
brick-3=10.0.231.53:-mnt-raid6-storage-storage
From existing hosts gluster03/04:
type=0
count=4
status=1
sub_count=0
stripe_count=1
replica_count=1
disperse_count=0
redundancy_count=0
version=25
transport-type=0
volume-id=26d355cb-c486-481f-ac16-e25390e73775
username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
password=
op-version=3
client-op-version=3
quota-version=1
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
features.quota-deem-statfs=on
features.inode-quota=on
performance.stat-prefetch=on
performance.cache-size=1GB
performance.readdir-ahead=on
features.quota=on
diagnostics.brick-log-level=WARNING
brick-0=10.0.231.50:-mnt-raid6-storage-storage
brick-1=10.0.231.51:-mnt-raid6-storage-storage
brick-2=10.0.231.52:-mnt-raid6-storage-storage
brick-3=10.0.231.53:-mnt-raid6-storage-storage
So far the configs on gluster01/02 and gluster03/04 are the same, although the ordering of some of the options differs. On gluster05/06 the ordering is different again, and quota-version=0 instead of 1.
From new hosts gluster05/gluster06:
type=0
count=4
status=1
sub_count=0
stripe_count=1
replica_count=1
disperse_count=0
redundancy_count=0
version=25
transport-type=0
volume-id=26d355cb-c486-481f-ac16-e25390e73775
username=eb9e2063-6ba8-4d16-a54f-2c7cf7740c4c
password=
op-version=3
client-op-version=3
quota-version=0
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.stat-prefetch=on
performance.cache-size=1GB
performance.readdir-ahead=on
features.quota=on
diagnostics.brick-log-level=WARNING
features.inode-quota=on
features.quota-deem-statfs=on
brick-0=10.0.231.50:-mnt-raid6-storage-storage
brick-1=10.0.231.51:-mnt-raid6-storage-storage
brick-2=10.0.231.52:-mnt-raid6-storage-storage
brick-3=10.0.231.53:-mnt-raid6-storage-storage
Also, I forgot to mention that when I initially peered the two new hosts, glusterd crashed on gluster03 and had to be restarted (log attached), but it has been fine since.
Thanks,
Steve
On Thu, Feb 25, 2016 at 11:27 AM, Mohammed Rafi K C <rkavunga@xxxxxxxxxx> wrote:
On 02/25/2016 11:45 PM, Steve Dainard wrote:
Hello,
I upgraded from 3.6.6 to 3.7.6 a couple of weeks ago. I just peered two new nodes into a 4-node cluster, and gluster peer status is:
# gluster peer status <-- from node gluster01
Number of Peers: 5
Hostname: 10.0.231.51
Uuid: b01de59a-4428-486b-af49-cb486ab44a07
State: Peer in Cluster (Connected)
Hostname: 10.0.231.52
Uuid: 75143760-52a3-4583-82bb-a9920b283dac
State: Peer in Cluster (Connected)
Hostname: 10.0.231.53
Uuid: 2c0b8bb6-825a-4ddd-9958-d8b46e9a2411
State: Peer in Cluster (Connected)
Hostname: 10.0.231.54 <-- new node gluster05
Uuid: 408d88d6-0448-41e8-94a3-bf9f98255d9c
State: Peer Rejected (Connected)
Hostname: 10.0.231.55 <-- new node gluster06
Uuid: 9c155c8e-2cd1-4cfc-83af-47129b582fd3
State: Peer Rejected (Connected)
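(In case it is useful for comparing later: glusterd keeps one record per peer under /var/lib/glusterd/peers/, named by uuid and holding the peer's hostname and state, so something like the line below dumps them all. The path is the usual default and may differ on other layouts.)
# grep -H . /var/lib/glusterd/peers/*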
Looks like your configuration files are mismatching, i.e. the checksum calculation on these two nodes differs from the others.
Did you have any failed commits?
Compare /var/lib/glusterd/<volname>/info on a rejected node against a good one; most likely you will see some difference.
Can you paste the /var/lib/glusterd/<volname>/info files?
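To compare them, something along these lines would show any difference regardless of the order the options happen to be written in (just a rough sketch: it assumes the usual /var/lib/glusterd/vols/<volname>/ layout, root ssh between the peers, the 'storage' volume, and 10.0.231.54 as one of the rejected nodes):
# diff <(sort /var/lib/glusterd/vols/storage/info) \
       <(ssh 10.0.231.54 'sort /var/lib/glusterd/vols/storage/info')
# cat /var/lib/glusterd/vols/storage/cksum
# ssh 10.0.231.54 'cat /var/lib/glusterd/vols/storage/cksum'
The cksum file next to the info file should hold the checksum glusterd compares during the peer handshake, so a mismatch there points at the offending volume.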
Regards
Rafi KC
I followed the write-up here: http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected and the two new nodes peered properly, but after a reboot of the two new nodes I'm seeing the same Peer Rejected (Connected) state.
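For anyone who finds that page later, the procedure there boils down to roughly the following, run on each rejected node (a sketch only: adjust the service commands for your init system, and probe whichever peer is known good; 10.0.231.50 is just an example):
# service glusterd stop
# cd /var/lib/glusterd
# find . -mindepth 1 -maxdepth 1 ! -name glusterd.info -exec rm -rf {} +
# service glusterd start
# gluster peer probe 10.0.231.50
# service glusterd restart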
I've attached logs from an existing node, and the two new nodes.
Thanks for any suggestions,
Steve