Re: [Gluster-devel] Messup with peer status!!

On Mon, Mar 14, 2016 at 12:12 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:


On 03/14/2016 10:52 AM, ABHISHEK PALIWAL wrote:
> Hi Team,
>
> I am facing an issue with peer status, and because of it remove-brick
> on a replica volume fails.
>
> Here is the scenario of what I am doing with gluster (a rough command
> sketch follows the steps):
>
> 1. I have two boards, A and B, and gluster is running on both.
> 2. I have created a replicated volume with one brick on each board.
> 3. Created one glusterfs mount point on each board where the volume is
> mounted.
> 4. Started the volume with nfs.disable=true.
> 5. Up to this point everything is in sync between the two bricks.
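>
> For reference, the setup looks roughly like this (the mount point
> /mnt/c_glusterfs is a placeholder; the addresses and the brick path
> are from my setup):
>
>   # on board A, after board B is reachable
>   gluster peer probe 10.32.1.144
>   gluster volume create c_glusterfs replica 2 \
>       10.32.0.48:/opt/lvmdir/c2/brick 10.32.1.144:/opt/lvmdir/c2/brick
>   gluster volume set c_glusterfs nfs.disable on
>   gluster volume set c_glusterfs network.ping-timeout 4
>   gluster volume start c_glusterfs
>   # on each board, mount the volume locally
>   mount -t glusterfs 127.0.0.1:/c_glusterfs /mnt/c_glusterfs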
>
> Now I manually pull board B out of its slot and plug it in again.
>
> 1. After board B boots up, I start glusterd on it.
>
> The following is some gluster command output on board B after step 1:
>
> # gluster peer status
> Number of Peers: 2
>
> Hostname: 10.32.0.48
> Uuid: f4ebe3c5-b6a4-4795-98e0-732337f76faf
> State: Accepted peer request (Connected)
>
> Hostname: 10.32.0.48
> Uuid: 4bf982c0-b21b-415c-b870-e72f36c7f2e7
> State: Peer is connected and Accepted (Connected)
>
> Why is peer status showing two peers with different UUIDs?
GlusterD doesn't generate a new UUID on init if it has already generated
one earlier. This clearly indicates that on reboot of board B the
contents of /var/lib/glusterd were wiped out. I've asked you this
question multiple times: is that the case?
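
(For reference: glusterd persists its identity in
/var/lib/glusterd/glusterd.info, so a quick sanity check across a
reboot is something like

  cat /var/lib/glusterd/glusterd.info
  UUID=<this-node's-uuid>
  operating-version=<op-version>

where the values shown are placeholders. If that file is gone after the
reboot, glusterd generates a fresh UUID on the next start, which would
explain the duplicate entry you see.)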

Yes, I am following the procedure described at this link:

http://www.gluster.org/community/documentation/index.php/Resolving_Peer_Rejected

but why is it showing two peer entries?
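
The steps I run on the rejected board are roughly these (per that page;
10.32.0.48 is board A in my setup):

  service glusterd stop
  cd /var/lib/glusterd
  # remove everything except glusterd.info, which holds this node's UUID
  ls | grep -v glusterd.info | xargs rm -rf
  service glusterd start
  gluster peer probe 10.32.0.48
  service glusterd restart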
>
> # gluster volume info
>
> Volume Name: c_glusterfs
> Type: Replicate
> Volume ID: c11f1f13-64a0-4aca-98b5-91d609a4a18d
> Status: Started
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: 10.32.0.48:/opt/lvmdir/c2/brick
> Brick2: 10.32.1.144:/opt/lvmdir/c2/brick
> Options Reconfigured:
> performance.readdir-ahead: on
> network.ping-timeout: 4
> nfs.disable: on
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
> # gluster volume status c_glusterfs
> Status of volume: c_glusterfs
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.32.1.144:/opt/lvmdir/c2/brick      N/A       N/A        N       N/A
> Self-heal Daemon on localhost               N/A       N/A        Y       3922
>
> Task Status of Volume c_glusterfs
> ------------------------------------------------------------------------------
>
> There are no active volume tasks
> --
>
> At the same time Board A have the following gluster commands outcome:
>
> # gluster peer status
> Number of Peers: 1
>
> Hostname: 10.32.1.144
> Uuid: c6b64e36-76da-4e98-a616-48e0e52c7006
> State: Peer in Cluster (Connected)
>
> Why is it showing the older UUID of host 10.32.1.144 when that UUID
> has changed and the new UUID is 267a92c3-fd28-4811-903c-c1d54854bda9?
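>
> (On board A the stale entry can be checked directly in glusterd's
> store; assuming the default working directory, something like
>
>   ls /var/lib/glusterd/peers/
>   cat /var/lib/glusterd/peers/c6b64e36-76da-4e98-a616-48e0e52c7006
>
> shows the uuid and hostname that board A still has on record for
> 10.32.1.144.)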
>
>
> # gluster volume heal c_glusterfs info
> c_glusterfs: Not able to fetch volfile from glusterd
> Volume heal failed.
> # gluster volume status c_glusterfs
> Status of volume: c_glusterfs
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick 10.32.0.48:/opt/lvmdir/c2/brick       49169     0          Y       2427
> Brick 10.32.1.144:/opt/lvmdir/c2/brick      N/A       N/A        N       N/A
> Self-heal Daemon on localhost               N/A       N/A        Y       3388
> Self-heal Daemon on 10.32.1.144             N/A       N/A        Y       3922
>
> Task Status of Volume c_glusterfs
> ------------------------------------------------------------------------------
>
> There are no active volume tasks
>
> As you can see, "gluster volume status" shows that brick
> "10.32.1.144:/opt/lvmdir/c2/brick" is offline, so we tried to remove
> it but got the error "volume remove-brick c_glusterfs replica 1
> 10.32.1.144:/opt/lvmdir/c2/brick force : FAILED : Incorrect brick
> 10.32.1.144:/opt/lvmdir/c2/brick for volume c_glusterfs" on board A.
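>
> (The brick argument to remove-brick has to match glusterd's record of
> the brick exactly; one way to cross-check, assuming the default
> working directory, is
>
>   gluster volume info c_glusterfs | grep Brick
>   ls /var/lib/glusterd/vols/c_glusterfs/bricks/
>
> where the bricks/ directory holds one file per brick as glusterd
> knows it.)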
>
> Please reply to this post, because I always get this error in this
> scenario.
>
> For more detail I am also attaching the logs of both boards, which
> include some manually created files containing the output of gluster
> commands from both boards.
>
> In the logs:
> 00030 is board A
> 00250 is board B.
This attachment doesn't help much. Could you attach the full glusterd
log files from both nodes?
>
Inside this attachment you will find the full glusterd log files under 00030/glusterd/ and 00250/glusterd/.
> Thanks in advance; waiting for your reply.
>
> Regards,
> Abhishek Paliwal
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-devel
>



--
Regards
Abhishek Paliwal
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
