Well, after about 3 days in the healing state, that file finally went back to normal. All is good now, but I am still uncertain whether this behavior is normal. Thanks for the help; I thought you'd like to know.

Richard Klein
RSI
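For anyone who runs into this later, the checks discussed in the thread below boil down to roughly the following sketch. The <volname> and <pid> values are placeholders, and the brick/file path is the one from this particular setup, so adjust both for your own environment:

    # List files needing heal; this is where "Possibly undergoing heal" shows up
    gluster volume heal <volname> info

    # Inspect the replication (AFR) extended attributes directly on each brick
    getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687

    # Find the glusterfsd process serving the brick, then trigger a statedump
    ps aux | grep '[g]lusterfsd' | grep /data/brick0/gv0cl1
    kill -USR1 <pid>

    # The statedump files are written to the directory reported by
    gluster --print-statedumpdir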
> -----Original Message-----
> From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Richard Klein (RSI)
> Sent: Friday, May 06, 2016 1:51 PM
> To: gluster-users@xxxxxxxxxxx
> Subject: Re: Question about "Possibly undergoing heal" on a file being reported.
>
> I have performed the statedump as you requested on both replica pair hosts.
> The hosts are called n1c1cl1 and n1c2cl1. The file is on /data/brick0/gv0cl1
> on both hosts. The attached zip file contains the statedump results of brick0
> for both hosts. I identified the process ID by finding the glusterfsd process
> that had /data/brick0/gv0cl1 in the command line.
>
> Let me know if you need further information.
>
> P.S. The "Possibly undergoing heal" is still showing and the "trusted.afr.dirty"
> is still changing even after the VM has been off since yesterday.
>
> Richard Klein
> RSI
>
> > -----Original Message-----
> > From: Ravishankar N [mailto:ravishankar@xxxxxxxxxx]
> > Sent: Friday, May 06, 2016 12:05 AM
> > To: Richard Klein (RSI); gluster-users@xxxxxxxxxxx
> > Subject: Re: Question about "Possibly undergoing heal" on a file being reported.
> >
> > Thanks for the response. The heal info output shows 'Possibly undergoing
> > heal' only when the self-heal daemon is performing a heal, not when there
> > is I/O from the mount. Could you provide the statedump of the 2 bricks
> > (and of the mount too, if you know from which mount this VM image is being
> > accessed)?
> >
> > The command is `kill -USR1 <pid>`, where pid is the process id of the brick
> > or fuse mount. The statedump will be saved in the directory shown by
> > `gluster --print-statedumpdir`. I wanted to check whether there are any
> > stale locks being held on the bricks.
> >
> > Thanks,
> > Ravi
> >
> > On 05/06/2016 01:22 AM, Richard Klein (RSI) wrote:
> > > I agree there is activity, but it is very low I/O, like updating log files.
> > > It shouldn't be enough I/O to keep the file permanently in the "Possibly
> > > undergoing healing" state for days. But just to make sure, I powered off
> > > the VM; there is no activity at all now, and the "trusted.afr.dirty" is
> > > still changing. I will leave the VM in a powered-off state until tomorrow.
> > > I agree with you that it shouldn't, but that is my dilemma.
> > >
> > > Thanks for the insight,
> > >
> > > Richard Klein
> > > RSI
> > >
> > >> -----Original Message-----
> > >> From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Joe Julian
> > >> Sent: Thursday, May 05, 2016 1:44 PM
> > >> To: gluster-users@xxxxxxxxxxx
> > >> Subject: Re: Question about "Possibly undergoing heal" on a file being reported.
> > >>
> > >> FYI, that's not "no activity". The file is clearly changing. The dirty
> > >> state flipping back and forth between 1 and 0 is a byproduct of writes
> > >> occurring. The clients set the flag, do the write, then clear the flag.
> > >> My guess is that's why it's only "possibly" undergoing self-heal.
> > >> The write may have still been pending at the moment of the check.
> > >>
> > >> On 05/05/2016 10:22 AM, Richard Klein (RSI) wrote:
> > >>> There are 2 hosts involved and we have a replica value of 2. The hosts
> > >>> are called n1c1cl1 and n1c2cl1. Below is the info you requested. The
> > >>> file name in gluster is "/97f52c71-80bd-4c2b-8e47-3c8c77712687".
> > >>>
> > >>> -- From the n1c1cl1 brick --
> > >>>
> > >>> [root@n1c1cl1 ~]# ll -h /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>> -rwxr--r--. 2 root root 3.7G May 5 12:10 /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>>
> > >>> [root@n1c1cl1 ~]# getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>> getfattr: Removing leading '/' from absolute path names
> > >>> # file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
> > >>> trusted.afr.dirty=0xe68000000000000000000000
> > >>> trusted.bit-rot.version=0x020000000000000057196a8d000e1606
> > >>> trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f
> > >>>
> > >>> -- From the n1c2cl1 brick --
> > >>>
> > >>> [root@n1c2cl1 ~]# ll -h /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>> -rwxr--r--. 2 root root 3.7G May 5 12:16 /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>>
> > >>> [root@n1c2cl1 ~]# getfattr -d -m . -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>> getfattr: Removing leading '/' from absolute path names
> > >>> # file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
> > >>> security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
> > >>> trusted.afr.dirty=0xd38000000000000000000000
> > >>> trusted.bit-rot.version=0x020000000000000057196a8d000e20ae
> > >>> trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f
> > >>>
> > >>> --
> > >>>
> > >>> The "trusted.afr.dirty" is changing about 2 or 3 times a minute on both
> > >>> files. Let me know if you need further info, and thanks.
> > >>>
> > >>> Richard Klein
> > >>> RSI
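One way to watch the flag-flipping Joe describes above is to poll the attribute directly on a brick. This is only a sketch; the 20-second interval is arbitrary, and the path is the one from this thread:

    watch -n 20 getfattr -n trusted.afr.dirty -e hex /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687

If the value keeps changing while the clients are genuinely idle, that is the behavior being described here.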
> > >>> From: Ravishankar N [mailto:ravishankar@xxxxxxxxxx]
> > >>> Sent: Wednesday, May 04, 2016 8:52 PM
> > >>> To: Richard Klein (RSI); gluster-users@xxxxxxxxxxx
> > >>> Subject: Re: Question about "Possibly undergoing heal" on a file being reported.
> > >>>
> > >>> On 05/05/2016 01:50 AM, Richard Klein (RSI) wrote:
> > >>>> First-time e-mailer to the group; greetings all. We are using Gluster 3.7.6
> > >>>> in Cloudstack on CentOS 7 with KVM. Gluster is our primary storage. All is
> > >>>> going well, but we have a test VM QCOW2 volume that gets stuck in the
> > >>>> "Possibly undergoing heal" state. By stuck I mean it stays in that state
> > >>>> for over 24 hrs. This is a test VM with no activity on it, and we have
> > >>>> removed the swap file on the guest as well, thinking that may be causing
> > >>>> high I/O. All the tools show that the VM is basically idle, with low I/O.
> > >>>> The only way I can clear it up is to power the VM off, move the QCOW2
> > >>>> volume off the Gluster mount and then back (basically remove and recreate
> > >>>> it), then power the VM back on. Once I do this process all is well again,
> > >>>> but then it happened again on the same volume/file.
> > >>>>
> > >>>> One additional note: I have even powered off the VM completely and the
> > >>>> QCOW2 file still stays in this state.
> > >>>
> > >>> When this happens, can you share the output of the extended attributes of
> > >>> the file in question from all the bricks of the replica in which the file
> > >>> resides? `getfattr -d -m . -e hex /path/to/bricks/file-name`
> > >>>
> > >>> Also, what is the size of this VM image file?
> > >>>
> > >>> Thanks,
> > >>> Ravi
> > >>>
> > >>>> Is there a way to stop/abort or force the heal to finish? Any help with a
> > >>>> direction would be appreciated.
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Richard Klein
> > >>>> RSI

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users