On 06/08/13 21:25, Kaushal M wrote:
> Toby,
> What versions of gluster are on the peers? And does the cluster have
> just two peers or more?

Version 3.3.1. The cluster has/had two nodes; we're trying to replace one
of them with another one.

> On Tue, Aug 6, 2013 at 4:32 PM, Toby Corkindale
> <toby.corkindale at strategicdata.com.au> wrote:
>> ----- Original Message -----
>>> From: "Toby Corkindale" <toby.corkindale at strategicdata.com.au>
>>> To: gluster-users at gluster.org
>>> Sent: Tuesday, 6 August, 2013 6:26:59 PM
>>> Subject: Re: peer status rejected (connected)
>>>
>>> On 06/08/13 18:12, Toby Corkindale wrote:
>>>> Hi,
>>>> What does it mean when you use "peer probe" to add a new host, but then
>>>> afterwards the "peer status" is reported as "Rejected" yet "Connected"?
>>>> And of course -- how does one fix this?
>>>>
>>>> gluster> peer status
>>>> Number of Peers: 1
>>>>
>>>> Hostname: 192.168.10.32
>>>> Uuid: 32497846-6e02-4b68-b147-6f4b936b3373
>>>> State: Peer Rejected (Connected)
>>>
>>> It's worth noting that the attempt to probe the peer was reported as
>>> successful, though:
>>>
>>> gluster> peer probe mel-storage04
>>>
>>> Probe successful
>>> gluster> peer status
>>> Number of Peers: 1
>>>
>>> Hostname: mel-storage04
>>> Uuid: 6254c24d-29d4-4794-8159-3c2b03b34798
>>> State: Peer Rejected (Connected)
>>
>> After searching around some more, I saw that this issue is usually caused
>> when two peers join and one of them has a very out-of-date volume list.
>> And indeed, in the log files I see messages about checksums failing to
>> agree on the volumes being exchanged.
>>
>> The odd thing is that this is a fresh server, running the same version of
>> glusterfs. I tried stopping the services entirely, running
>> rm -rf /var/lib/glusterfs/*, and then starting up again and probing that
>> peer -- and received the same rejection. I'm confused as to how it could
>> possibly be getting a different volume checksum when it didn't even have
>> its own copy.
>>
>> Does the community have any suggestions about resolving this?
>>
>> See also my separate message about being unable to remove or replace
>> bricks -- which might be related, although those errors occur even when
>> run on the cluster without this problematic peer attached at all.
>>
>> -Toby
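
P.S. For reference, the procedure I've seen suggested elsewhere for clearing
a "Peer Rejected" state is roughly the following. I haven't verified it
here, and it assumes the glusterd state directory is /var/lib/glusterd
(not /var/lib/glusterfs, which is what I cleared above):

  # On the rejected peer (mel-storage04):
  service glusterd stop       # the service is "glusterfs-server" on Debian/Ubuntu
  cd /var/lib/glusterd
  # Remove everything except glusterd.info, so the peer keeps its own UUID
  find . -mindepth 1 ! -name glusterd.info -delete
  service glusterd start

  # From the existing, healthy node, probe the peer again:
  gluster peer probe mel-storage04
  gluster peer status

  # Then restart glusterd on the rejected peer once more so it picks up
  # the volume definitions from the cluster:
  service glusterd restart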