Hello,

On 03/23/2016 06:35 PM, Ravishankar N wrote:
> On 03/23/2016 09:53 PM, Marian Marinov wrote:
>>> > What version of gluster is this?
>> 3.7.6
>>
>>> > Do you observe the problem even when only the 4th 'non data' server
>>> comes up? In that case it is unlikely that self-heal is the issue.
>> No
>>
>>> > Are the clients using FUSE or NFS mounts?
>> FUSE
>>
>
> Okay, when you say the cluster stalls, I'm assuming the apps using
> files via the fuse mount are stalled. Does the mount log contain
> messages about completing selfheals on files when the mount eventually
> becomes responsive? If yes, you could try setting
> 'cluster.data-self-heal' to off.

Yes, we have many lines with similar entries in the logs:

[2016-03-22 11:10:23.398668] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-share-replicate-0: Completed data selfheal on b18c2b05-7186-4c22-ab34-24858b1153e5. source=0 sinks=2
[2016-03-23 13:11:54.110773] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-share-replicate-0: Completed metadata selfheal on 591d2bee-b55c-4dd6-a1bc-8b7fc5571caa. source=0 sinks=

We already tested setting cluster.self-heal-daemon to off, and we did not experience the issue in that case. We stopped one node, disabled the self-heal-daemon, started the node, and later re-enabled the self-heal-daemon. There was no "stalling" in this case.

We will try the suggested setting too.
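For reference, this is roughly what we ran during the earlier test and what we plan to run now (assuming the volume is named "share", as the "0-share-replicate-0" prefix in the log entries suggests):

    # earlier test: before bringing the stopped node back up
    gluster volume set share cluster.self-heal-daemon off
    # ... node started and came back online, then re-enabled
    gluster volume set share cluster.self-heal-daemon on

    # suggested setting: disable client-side data self-heal
    gluster volume set share cluster.data-self-heal off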
--
Dimitar Ianakiev
System Administrator
www.siteground.com