Peer Rejected (Connected) and self-heal daemon not running, causing split-brain

Hi guys,

I managed to get Gluster running, but I'm having a couple of issues with my setup: 1) my peer status is Rejected (Connected), and 2) the self-heal daemon is not running on one server and I'm getting split-brain files.
My setup is a two-server replicated volume: servers gfs1 and gfs2, each contributing one brick.
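For context, the volume was created more or less like this (replica 2, one brick per server; the exact create command is from memory):

# gluster volume create gfsvolume replica 2 gfs1:/export/sda/brick gfs2:/export/sda/brick
# gluster volume start gfsvolume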

1) My peer status never goes to Peer in Cluster. Running gluster peer status gives me State: Peer Rejected (Connected). At this point, the brick on gfs2 does not come online and I get this output:

#gluster volume status
Status of volume: gfsvolume
Gluster process                   Port    Online  Pid
------------------------------------------------------------------------------
Brick gfs1:/export/sda/brick      49153   Y       15025
NFS Server on localhost           2049    Y       15039
Self-heal Daemon on localhost     N/A     Y       15044

Task Status of Volume gfsvolume
------------------------------------------------------------------------------
There are no active volume tasks


I have followed the method used in one of the threads and performed the following (roughly the commands shown after the output below):
   a) stop glusterd
   b) rm all files in /var/lib/glusterd/ except for glusterd.info
   c) start glusterd, probe gfs1 from gfs2, and run gluster peer status, which gives me:

# gluster peer status
Number of Peers: 1

Hostname: gfs1
Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
State: Sent and Received peer request (Connected)
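
These are roughly the commands for steps a) to c), run on gfs2 (service commands may differ depending on the distro/init system):

# service glusterd stop
# find /var/lib/glusterd -mindepth 1 ! -name glusterd.info -delete     (removes everything under /var/lib/glusterd except glusterd.info)
# service glusterd start
# gluster peer probe gfs1
# gluster peer status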

The same thread mentioned that changing the status of the peer in /var/lib/glusterd/peers/{UUID} from status=5 to status=3 fixes this, and on restart of gfs1 the peer status goes to:

#gluster peer status
Number of Peers: 1

Hostname: gfs1
Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
State: Peer in Cluster (Connected)
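
Concretely, that change amounts to something like this (the file lives on gfs2, since the UUID shown is gfs1's; I'm assuming the key inside the file is spelled state=):

# sed -i 's/state=5/state=3/' /var/lib/glusterd/peers/49acc9c2-4809-4da5-a6f0-6a3d48314070     (flip the peer state from 5 to 3)
# service glusterd restart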

This fixes the connection between the peers, and the volume status shows:

Status of volume: gfsvolume
Gluster process                   Port    Online  Pid
------------------------------------------------------------------------------
Brick gfs1:/export/sda/brick      49153   Y       10852
Brick gfs2:/export/sda/brick      49152   Y       17024
NFS Server on localhost           N/A     N       N/A
Self-heal Daemon on localhost     N/A     N       N/A
NFS Server on gfs2                N/A     N       N/A
Self-heal Daemon on gfs2          N/A     N       N/A

Task Status of Volume gfsvolume
------------------------------------------------------------------------------
There are no active volume tasks


Which brings me to problem 2.

2) My self-heal daemon is not running
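
In case it is relevant, this is how the daemon can be checked on each node (log path assumes the stock /var/log/glusterfs location):

# ps aux | grep glustershd                        (look for the self-heal daemon process)
# tail -n 50 /var/log/glusterfs/glustershd.log    (self-heal daemon log)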

I fixed the self-heal on gfs1 by running:

 #find <gluster-mount> -noleaf -print0 | xargs --null stat >/dev/null 2>/var/log/gluster/<gluster-mount>-selfheal.log

and running a volume status command now gives me:

# gluster volume status
Status of volume: gfsvolume
Gluster process                   Port    Online  Pid
------------------------------------------------------------------------------
Brick gfs1:/export/sda/brick      49152   Y       16660
Brick gfs2:/export/sda/brick      49152   Y       21582
NFS Server on localhost           2049    Y       16674
Self-heal Daemon on localhost     N/A     Y       16679
NFS Server on gfs2                N/A     N       21596
Self-heal Daemon on gfs2          N/A     N       21600

Task Status of Volume gfsvolume
------------------------------------------------------------------------------
There are no active volume tasks

 
However, running the same command on gfs2 does not bring its self-heal daemon back.
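
For what it's worth, my understanding is that a heal can also be driven from the gluster CLI with something like the commands below, but I assume they need the self-heal daemon to be up to actually do anything:

# gluster volume heal gfsvolume                        (heal entries that need healing)
# gluster volume heal gfsvolume full                   (heal all files)
# gluster volume heal gfsvolume info split-brain       (list the files in split-brain)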

Restarting the gfs2 server brings me back to problem 1, and the cycle continues.

Can anyone assist me with these issues? Thank you.

Thank You Kindly,
Kaamesh

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
