Hi Pranith,

On Tue, Jun 3, 2014 at 10:56 AM, Pranith Kumar Karampuri
<pkarampu@xxxxxxxxxx> wrote:
>
>
> ----- Original Message -----
>> From: "Andrew Lau" <andrew@xxxxxxxxxxxxxx>
>> To: "gluster-users@xxxxxxxxxxx List" <gluster-users@xxxxxxxxxxx>
>> Sent: Tuesday, June 3, 2014 4:10:25 AM
>> Subject: Brick on just one host constantly going offline
>>
>> Hi,
>>
>> Just a short post as I've since nuked the test environment.
>>
>> I've had this case where in a 2 node gluster replica, the brick of the
>> first host is constantly going offline.
>>
>> gluster volume status
>>
>> would report host 1's brick is offline. The quorum would kick in,
>> putting the whole cluster into a read only state. This has only
>> recently been happening w/ gluster 3.5 and it normally happens after
>> about 3-4 days of 500GB or so data transfer.
>
> Could you check mount logs to see if there are ping timer expiry messages for disconnects?
> If you see them, then it is very likely that you are hitting throttling problem fixed by http://review.gluster.org/7531
>

Ah, that makes sense as it was the only volume which had that ping
timeout setting. I also did see the timeout messages in the logs when
I was checking. So is this merged in 3.5.1?

> Pranith
>
>>
>> Has anyone noticed this before? The only way to bring it back was to:
>>
>> killall glusterfsd ; killall -9 glusterfsd ; killall glusterd ; glusterd
>>
>>
>> Thanks,
>> Andrew
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users@xxxxxxxxxxx
>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
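
For anyone wanting to check for the same symptom, the mount log check suggested in this thread could be sketched roughly as below. This is only a sketch under assumptions: the function name count_ping_timeouts is made up for illustration, and the match strings are an assumption based on the usual wording of the client-side ping timer message, not copied from the logs discussed here. Adjust the pattern if your logs read differently.

```shell
# Hypothetical helper: count lines in a client (mount) log that look like
# ping timer expiry disconnects.
# Assumption: such lines mention "rpc_clnt_ping_timer_expired" or contain
# "has not responded in the last"; neither string comes from this thread.
count_ping_timeouts() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are no matches, so "|| true" keeps the function from failing.
    grep -c -E 'rpc_clnt_ping_timer_expired|has not responded in the last' "$1" || true
}
```

It would be called with the path of the mount log on the client, typically a file under /var/log/glusterfs/ (the exact file name depends on the mount point). A non-zero count would be consistent with the ping timer disconnects described above, though it does not by itself confirm the throttling problem.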