On 11/18/2014 01:17 PM, Lindsay Mathieson wrote:
On 18 November 2014 17:40, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx> wrote:
Sorry, didn't see this one. I believe the root cause is the 'diff' based
self-heal, which computes full file checksums. Could you execute 'gluster
volume set <volname> cluster.data-self-heal-algorithm full' to prevent this
issue in the future? Note that this option only takes effect for new
self-heals triggered after the command is run; any self-heals already in
progress will continue with the old algorithm.
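As a minimal sketch (replace <volname> with your actual volume name; the
second command is just a sanity check that the option was applied):

    gluster volume set <volname> cluster.data-self-heal-algorithm full
    # the option should now show up under 'Options Reconfigured'
    gluster volume info <volname>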
Thanks, makes sense.
However, given the files are tens of GB in size, won't it thrash my network?
Yes, you are right. I wonder why network thrashing has never been reported
until now.
+Joejulian, who has also been running VMs on gluster (for 5 years now?). He
uses this full self-heal option (that's what I saw in his bug reports).
I still need to think about how best to solve this problem.
Let me tell you a bit more about this issue:
There are two processes which heal the VM images:
1) The self-heal daemon. 2) The mount process.
The self-heal daemon heals one VM image at a time, but the mount process
triggers self-heals for all of the opened files (a VM image is nothing but an
opened file from the filesystem's perspective) when a brick goes down and
comes back up. So we need to come up with a scheme to throttle self-heals
on the mount point to prevent this issue. I will update you as soon as I
come up with a fix. This should not be hard to do; I need some time to
choose the best approach. Thanks a lot for bringing up this issue.
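As an aside, and only as an assumption about what your version exposes (this
is a workaround sketch, not the fix itself): the cluster.background-self-heal-count
volume option, where available, caps how many background self-heals the mount
process runs in parallel, which may soften the heal storm after a brick comes
back up. You can also watch which files still need healing:

    # cap concurrent background self-heals triggered from the mount
    # (the value 2 is just an illustrative choice)
    gluster volume set <volname> cluster.background-self-heal-count 2

    # list files that still need healing
    gluster volume heal <volname> info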
Pranith
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users