After upgrade from 3.4.2 to 3.8.5 - High CPU usage resulting in disconnects and split-brain

Hi,

I upgraded an installation of GlusterFS on Ubuntu 14.04.3 from version 3.4.2 to 3.8.5.
A few hours after the upgrade, I noticed files in "split-brain" state. In months of operation with the old version, I never had a single split-brain file.

Using htop, I observed the "glusterfs" process jumping from 0% to 100+% CPU usage every now and then.
Using iostat, I confirmed that the local disks are not a bottleneck (util is well below 10%).

The log files show that clients are losing their connections to the servers quite often:

[2016-11-14 16:34:56.685349] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-1: server X.X.X.62:49152 has not responded in the last 42 seconds, disconnecting.
[2016-11-14 16:35:47.690348] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-8: server X.X.X.219:49153 has not responded in the last 42 seconds, disconnecting.
[2016-11-14 17:09:33.903096] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv0-client-7: server X.X.X.62:49153 has not responded in the last 42 seconds, disconnecting.
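
To see whether the timeouts cluster on particular bricks, the messages above can be tallied per server. The following is only a minimal sketch: the regular expression is modelled on the three lines quoted here and may need adjusting for other log formats, and count_ping_timeouts is a hypothetical helper name, not part of GlusterFS.

```python
import re
from collections import Counter

# Pattern modelled on the rpc_clnt_ping_timer_expired lines quoted above.
PING_TIMEOUT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] C "
    r"\[rpc-clnt-ping\.c:\d+:rpc_clnt_ping_timer_expired\] "
    r"(?P<client>\S+): server (?P<server>\S+) has not responded "
    r"in the last (?P<secs>\d+) seconds, disconnecting\.$"
)


def count_ping_timeouts(lines):
    """Return a Counter mapping 'host:port' to the number of ping timeouts."""
    counts = Counter()
    for line in lines:
        match = PING_TIMEOUT.match(line.strip())
        if match:
            counts[match.group("server")] += 1
    return counts


# The three log lines from this mail, used as sample input.
SAMPLE = [
    "[2016-11-14 16:34:56.685349] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] "
    "0-gv0-client-1: server X.X.X.62:49152 has not responded in the last 42 seconds, disconnecting.",
    "[2016-11-14 16:35:47.690348] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] "
    "0-gv0-client-8: server X.X.X.219:49153 has not responded in the last 42 seconds, disconnecting.",
    "[2016-11-14 17:09:33.903096] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] "
    "0-gv0-client-7: server X.X.X.62:49153 has not responded in the last 42 seconds, disconnecting.",
]

for server, hits in count_ping_timeouts(SAMPLE).most_common():
    print(server, hits)
```

To run it against a real log, pass an open file object instead of SAMPLE; lines that do not match the pattern are ignored.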

There are a total of 6 servers with 2 bricks each (Distribute-Replicate).

The result of a 60 second gluster volume profile can be seen here: http://pastebin.com/5WN5S63B

After upgrading, I set:

cluster.granular-entry-heal yes
cluster.locking-scheme granular

I have now reverted these to "no" and "full", respectively, to see whether files still end up in "split-brain".
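
For reference, the revert can be written out as gluster CLI calls. This is a sketch under assumptions: the volume name gv0 is inferred from the "0-gv0-client-1" prefix in the log lines, revert_commands is a hypothetical helper, and the commands are only printed, not executed. The last command lists files currently in split-brain so the effect of the revert can be checked.

```python
def revert_commands(volume):
    """Build the gluster CLI calls that undo the two post-upgrade settings.

    The volume name is supplied by the caller; nothing is executed here.
    """
    return [
        # Undo "cluster.granular-entry-heal yes".
        ["gluster", "volume", "set", volume, "cluster.granular-entry-heal", "no"],
        # Undo "cluster.locking-scheme granular".
        ["gluster", "volume", "set", volume, "cluster.locking-scheme", "full"],
        # List files that are currently in split-brain.
        ["gluster", "volume", "heal", volume, "info", "split-brain"],
    ]


# Assumption: the volume is called gv0, as the log prefix suggests.
for argv in revert_commands("gv0"):
    print(" ".join(argv))
```

Printing instead of executing keeps the sketch safe to run on any machine; the printed lines can be reviewed and then run by hand on one of the servers.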

Best regards