Re: poor performance during healing

Ravishankar N <ravishankar@xxxxxxxxxx> · Tue, 24 Feb 2015 07:11:08 +0530

On 02/24/2015 05:00 AM, Craig Yoshioka wrote:
I’m using Gluster 3.6 to host a volume with some KVM images.  I’d seen before that other people were having terrible performance while Gluster was auto-healing but that a rewrite in 3.6 had potentially solved this problem.

Well, it hasn’t (for me).  If my gluster volume starts to auto-heal, performance can get so bad that some of the VMs essentially lock up.  In top I can see the glusterfsd process sometimes hitting 700% of the CPU.  Is there anything I can do to prevent this by throttling the healing process?

For VM workloads, you could set the 'cluster.data-self-heal-algorithm'
option to 'full'. The checksum computation in the 'diff' algorithm can
be CPU-intensive, especially since VM images are large files.
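As a rough sketch of what that looks like on the command line: the volume name vm-images is taken from the volume info quoted in this thread, while the set_heal_algorithm helper and the GLUSTER override variable are hypothetical names added here only so the command can be dry-run. The underlying call is the ordinary `gluster volume set <VOLNAME> <OPTION> <VALUE>`.

```shell
# Sketch only. VOLUME comes from the volume info in this thread; the helper
# name and the GLUSTER variable are illustrative, not part of GlusterFS.
VOLUME="vm-images"
GLUSTER="${GLUSTER:-gluster}"   # set GLUSTER=echo to print instead of execute

set_heal_algorithm() {
    # $1 must be one of the two values the option accepts: "full" or "diff".
    case "$1" in
        full|diff)
            "$GLUSTER" volume set "$VOLUME" cluster.data-self-heal-algorithm "$1"
            ;;
        *)
            echo "usage: set_heal_algorithm full|diff" >&2
            return 1
            ;;
    esac
}
```

Running `set_heal_algorithm full` on one of the servers should be enough, since volume options apply cluster-wide. The option should then appear under "Options Reconfigured" in `gluster volume info vm-images`. Keep in mind that 'full' trades CPU for I/O: the whole image is copied from source to sink, so expect more network and disk traffic during a heal.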

[root@tuxpad glusterfs]# gluster v set help|grep algorithm
Option: cluster.data-self-heal-algorithm
Description: Select between "full", "diff". The "full" algorithm copies 
the entire file from source to sink. The "diff" algorithm copies to sink 
only those blocks whose checksums don't match with those of source. If 
no option is configured the option is chosen dynamically as follows: If 
the file does not exist on one of the sinks or empty file exists or if 
the source file size is about the same as page size the entire file will 
be read and written i.e "full" algo, otherwise "diff" algo is chosen.
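To make the default selection described in that help text concrete, here is a small sketch of the decision as I read it. It mirrors the description only and is not taken from the GlusterFS source. The function name, its arguments, and the interpretation of "about the same as page size" as "less than or equal to the page size" are all assumptions.

```shell
# Illustrative only: follows the wording of the help text, not the real code.
# Arguments (hypothetical):
#   $1  size of the file on the sink in bytes, or -1 if it does not exist
#   $2  size of the file on the source in bytes
#   $3  page size in bytes
choose_heal_algorithm() {
    sink_size=$1
    source_size=$2
    page_size=$3
    if [ "$sink_size" -lt 0 ] || [ "$sink_size" -eq 0 ]; then
        # File missing or empty on the sink: nothing to compare, copy it all.
        echo full
    elif [ "$source_size" -le "$page_size" ]; then
        # Assumed reading of "about the same as page size".
        echo full
    else
        # Otherwise compare block checksums and copy only mismatches.
        echo diff
    fi
}
```

Under this reading, a multi-gigabyte VM image that already exists on both bricks would land in the last branch, so with no option configured the heal would checksum the image block by block. That would be consistent with the high glusterfsd CPU usage reported above.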

Hope this helps.
Ravi

Here are my volume options:

Volume Name: vm-images
Type: Replicate
Volume ID: 5b38ddbe-a1ae-4e10-b0ad-dcd785a44493
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: vmhost-1:/gfs/brick-0
Brick2: vmhost-2:/gfs/brick-0
Options Reconfigured:
nfs.disable: on
cluster.quorum-count: 1
network.frame-timeout: 1800
network.ping-timeout: 15
server.allow-insecure: on
storage.owner-gid: 36
storage.owner-uid: 107
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: fixed
cluster.server-quorum-type: server
cluster.server-quorum-ratio: 51%

Thanks!
-Craig
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
