Re: Peer Rejected(Connected) and Self heal daemon is not running causing split brain


 



Hi Atin,

I tried flushing the iptables and this time I managed to get the peer into the cluster. However, the self-heal daemon is still offline and I'm unable to bring it back online on gfs2. Running a heal on either server gives me a successful output, but when I check the heal info I get many split-brain errors on gfs2.
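For reference, the heal CLI can list exactly which entries are pending and which are in split-brain; a minimal sketch, assuming the volume name gfsvolume used later in this thread:

```shell
# Inspect heal state on either server (volume name assumed: gfsvolume).
gluster volume status gfsvolume                  # is the Self-heal Daemon online?
gluster volume heal gfsvolume info               # entries still pending heal
gluster volume heal gfsvolume info split-brain   # entries in split-brain
```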

Thank You Kindly,
Kaamesh


On Thu, Feb 26, 2015 at 5:40 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
Could you check the network/firewall settings? Flush the iptables rules using
iptables -F and retry.
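Note that iptables -F drops every rule; a narrower alternative is to open only the ports Gluster needs. A sketch, assuming Gluster 3.4+ default port numbers (adjust to your version):

```shell
# Sketch: allow Gluster traffic instead of flushing all firewall rules.
# Port numbers assume Gluster 3.4+ defaults; verify against your install.
iptables -A INPUT -p tcp --dport 24007:24008 -j ACCEPT   # glusterd management
iptables -A INPUT -p tcp --dport 49152:49160 -j ACCEPT   # brick ports (one per brick)
iptables -A INPUT -p tcp --dport 38465:38467 -j ACCEPT   # Gluster NFS
```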

~Atin

On 02/26/2015 02:55 PM, Kaamesh Kamalaaharan wrote:
> Hi guys,
>
> I managed to get gluster running but I'm having a couple of issues with my
> setup: 1) my peer status is Rejected but Connected; 2) my self-heal daemon is
> not running on one server and I'm getting split-brain files.
> My setup is two gluster servers (gfs1 and gfs2) in replicate, each with one
> brick.
>
> 1) My peer status doesn't go into Peer in Cluster. Running a peer status
> command gives me State: Peer Rejected (Connected). At this point, the brick
> on gfs2 does not come online and I get this output:
>
>
> # gluster volume status
>
> Status of volume: gfsvolume
> Gluster process                          Port   Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick             49153  Y       15025
> NFS Server on localhost                  2049   Y       15039
> Self-heal Daemon on localhost            N/A    Y       15044
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
>
> I have followed the method used in one of the threads and performed the
> following:
>
>    a) stop glusterd
>    b) rm all files in /var/lib/glusterd/ except for glusterd.info
>    c) start glusterd and probe gfs1 from gfs2; peer status then gives me:
>
>
> # gluster peer status
>
> Number of Peers: 1
>
>
> Hostname: gfs1
>
> Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
>
> State: Sent and Received peer request (Connected)
>
>
> The same thread mentioned that changing the state of the peer in
> /var/lib/glusterd/peers/{UUID} from state=5 to state=3 fixes this, and on
> restart of gfs1 the peer status goes to:
>
> #gluster peer status
>
> Number of Peers: 1
>
>
> Hostname: gfs1
>
> Uuid: 49acc9c2-4809-4da5-a6f0-6a3d48314070
>
> State: Peer in Cluster (Connected)
>
> This fixes the connection between the peers and the volume status shows
>
>
> Status of volume: gfsvolume
>
> Gluster process                          Port   Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick             49153  Y       10852
> Brick gfs2:/export/sda/brick             49152  Y       17024
> NFS Server on localhost                  N/A    N       N/A
> Self-heal Daemon on localhost            N/A    N       N/A
> NFS Server on gfs2                       N/A    N       N/A
> Self-heal Daemon on gfs2                 N/A    N       N/A
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
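The recovery steps described above (stop glusterd, clear /var/lib/glusterd except glusterd.info, re-probe, fix the peer state) can be put together as one script. This is a sketch only, to be run on the rejected node (gfs2 here) after backing up /var/lib/glusterd:

```shell
#!/bin/sh
# Sketch of the peer-rejected recovery described above; run on gfs2.
# WARNING: this wipes the local volume definitions; they are re-synced
# from the good peer (gfs1) after the probe. Back up /var/lib/glusterd first.
set -e
service glusterd stop
cd /var/lib/glusterd
find . -mindepth 1 ! -name glusterd.info -delete   # keep only the node identity
service glusterd start
gluster peer probe gfs1        # re-sync peer and volume info from gfs1
gluster peer status            # should move toward "Peer in Cluster"
```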
>
>
> Which brings us to problem 2.
>
> 2) My self-heal daemon is not alive.
>
> I fixed the self-heal on gfs1 by running
>
>  # find <gluster-mount> -noleaf -print0 | xargs --null stat >/dev/null \
>      2>/var/log/gluster/<gluster-mount>-selfheal.log
>
> and running a volume status command gives me:
>
> # gluster volume status
>
> Status of volume: gfsvolume
>
> Gluster process                          Port   Online  Pid
> ------------------------------------------------------------------------------
> Brick gfs1:/export/sda/brick             49152  Y       16660
> Brick gfs2:/export/sda/brick             49152  Y       21582
> NFS Server on localhost                  2049   Y       16674
> Self-heal Daemon on localhost            N/A    Y       16679
> NFS Server on gfs2                       N/A    N       21596
> Self-heal Daemon on gfs2                 N/A    N       21600
>
> Task Status of Volume gfsvolume
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
>
>
> However, running this on gfs2 doesn't fix the daemon.
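A commonly suggested way to revive a dead self-heal daemon, rather than restarting the whole server, is to force-start the volume (which respawns the per-volume daemons) and then trigger a full heal. A sketch, assuming a reasonably recent 3.x Gluster:

```shell
# Sketch: respawn the volume daemons (incl. self-heal), then heal.
gluster volume start gfsvolume force   # restarts shd/NFS; safe on a started volume
gluster volume heal gfsvolume full     # crawl all files and heal
gluster volume heal gfsvolume info     # verify the pending list shrinks
```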
>
> Restarting the gfs2 server brings me back to problem 1, and the cycle
> continues.
>
> Can anyone assist me with these issues? Thank you.
>
> Thank You Kindly,
> Kaamesh
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-users
>

--
~Atin

