On May 5, 2016 10:04:41 PM PDT, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
Thanks for the response. The heal info output shows 'Possibly undergoing
heal' only when the self-heal daemon is performing the heal, not when
there is I/O from the mount. Could you provide the statedump of the 2
bricks (and the mount too, if you know from which mount this VM image is
being accessed)?
The command is `kill -USR1 <pid>`, where pid is the process id of the
brick or fuse mount. The statedump will be saved in the directory shown by
`gluster --print-statedumpdir`.
Wanted to check if there are any stale locks being held on the bricks.
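In case it is useful, a rough sketch of the steps on each brick node (the
volume name gv0cl1 is assumed from your brick paths, and /var/run/gluster is
assumed as the default statedump directory; adjust both to your setup):

gluster volume heal gv0cl1 info        # confirm the file is still listed
gluster volume status gv0cl1           # note the Pid column for each brick
kill -USR1 <brick-pid>                 # write the statedump for that brick
gluster --print-statedumpdir           # shows where the dump files land
ls -lt /var/run/gluster | head         # the newest dump file is the one just taken
grep -i lock /var/run/gluster/<dump-file>   # quick look for lock entries

The fuse mount works the same way, just with the pid of the mount process.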
Thanks,
Ravi
On 05/06/2016 01:22 AM, Richard Klein (RSI) wrote:
I agree there is activity, but it's very low I/O, like updating log files. It shouldn't be high enough I/O to keep the file permanently in the "Possibly undergoing healing" state for days. But just to make sure, I powered off the VM and there is no activity now at all, and the "trusted.afr.dirty" is still changing. I will leave the VM in a powered-off state until tomorrow. I agree with you that it shouldn't, but that is my dilemma.
Thanks for the insight,
Richard Klein
RSI

-----Original Message-----
From: gluster-users-bounces@xxxxxxxxxxx [mailto:gluster-users-bounces@xxxxxxxxxxx] On Behalf Of Joe Julian
Sent: Thursday, May 05, 2016 1:44 PM
To: gluster-users@xxxxxxxxxxx
Subject: Re: Question about "Possibly undergoing heal" on a file being reported.
FYI, that's not "no activity". The file is clearly changing. The dirty state flipping
back and forth between 1 and 0 is a byproduct of writes occurring. The clients
set the flag, do the write, then clear the flag.
My guess is that's why it's only "possibly" undergoing self-heal. The write may
have still been pending at the moment of the check.
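If you want to watch that happen, a quick loop on either brick node will show
the flag toggling as writes land (path taken from the brick listing below;
just a sketch):

while true; do
  getfattr -n trusted.afr.dirty -e hex \
      /data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687 2>/dev/null \
      | grep trusted.afr.dirty
  sleep 5
done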
On 05/05/2016 10:22 AM, Richard Klein (RSI) wrote:
There are 2 hosts involved and we have a replica value of 2. The hosts are called n1c1cl1 and n1c2cl1. Below is the info you requested. The file name in gluster is "/97f52c71-80bd-4c2b-8e47-3c8c77712687".

-- From the n1c1cl1 brick --
[root@n1c1cl1 ~]# ll -h
/data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
-rwxr--r--. 2 root root 3.7G May 5 12:10
/data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
[root@n1c1cl1 ~]# getfattr -d -m . -e hex
/data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
getfattr: Removing leading '/' from absolute path names
# file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.dirty=0xe68000000000000000000000
trusted.bit-rot.version=0x020000000000000057196a8d000e1606
trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f
-- From the n1c2cl1 brick --
[root@n1c2cl1 ~]# ll -h
/data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
-rwxr--r--. 2 root root 3.7G May 5 12:16
/data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
[root@n1c2cl1 ~]# getfattr -d -m . -e hex
/data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
getfattr: Removing leading '/' from absolute path names
# file: data/brick0/gv0cl1/97f52c71-80bd-4c2b-8e47-3c8c77712687
security.selinux=0x73797374656d5f753a6f626a6563745f723a64656661756c745f743a733000
trusted.afr.dirty=0xd38000000000000000000000
trusted.bit-rot.version=0x020000000000000057196a8d000e20ae
trusted.gfid=0xb1a49bd1ea01479f9a8277992461e85f
--
The "trusted.afr.dirty" is changing about 2 or 3 times a minute on both files.Richard Kleinfile being reported.
RSI
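As an aside: if the usual AFR changelog layout applies to trusted.afr.dirty
(three big-endian 32-bit counters for data, metadata and entry operations;
that layout is my assumption, not something confirmed in this thread), the
values above split like this:

val=e68000000000000000000000    # trusted.afr.dirty from the n1c1cl1 brick
echo "data=0x${val:0:8} metadata=0x${val:8:8} entry=0x${val:16:8}"
# -> data=0xe6800000 metadata=0x00000000 entry=0x00000000

The two bricks differ only in that first (data) word, which fits Joe's point
about writes being in flight when the flag is sampled.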
From: Ravishankar N [mailto:ravishankar@xxxxxxxxxx]
Sent: Wednesday, May 04, 2016 8:52 PM
To: Richard Klein (RSI); gluster-users@xxxxxxxxxxx
Subject: Re: Question about "Possibly undergoing heal" on a file being reported.

On 05/05/2016 01:50 AM, Richard Klein (RSI) wrote:
First time e-mailer to the group, greetings all. We are using Gluster 3.7.6 in
Cloudstack on CentOS7 with KVM. Gluster is our primary storage. All is going
well, but we have a test VM QCOW2 volume that gets stuck in the "Possibly
undergoing healing" state. By stuck I mean it stays in that state for over 24
hrs. This is a test VM with no activity on it, and we have removed the swap
file on the guest as well, thinking that may be causing high I/O. All the
tools show that the VM is basically idle with low I/O. The only way I can
clear it up is to power the VM off, move the QCOW2 volume off the Gluster
mount and then back (basically remove and recreate it), then power the VM
back on. Once I do this process all is well again, but then it happened again
on the same volume/file. One additional note: I have even powered off the VM
completely and the QCOW2 file still stays in this state. Is there a way to
stop/abort or force the heal to finish? Any help with a direction would be
appreciated.
Thanks,
Richard Klein
RSI

When this happens, can you share the output of the extended attributes of the
file in question from all the bricks of the replica in which the file resides?
`getfattr -d -m . -e hex /path/to/bricks/file-name`
Also what is the size of this VM image file?
Thanks,
Ravi
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users