I just saw the following bug which was fixed in 3.8.15:
Is it possible that the problem I described in this post is related to that bug?
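For context, a quick way to confirm which release each node is actually running (execute it on every node; `gluster --version` reports the same information):

$ glusterfs --version | head -n1

Anything older than 3.8.15 would of course predate that fix.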
-------- Original Message --------
Subject: Re: [Gluster-users] self-heal not working
Local Time: August 22, 2017 11:51 AM
UTC Time: August 22, 2017 9:51 AM
From: ravishankar@xxxxxxxxxx
To: mabi <mabi@xxxxxxxxxxxxx>
Cc: Ben Turner <bturner@xxxxxxxxxx>, Gluster Users <gluster-users@xxxxxxxxxxx>
On 08/22/2017 02:30 PM, mabi wrote:
> Thanks for the additional hints, I have the following 2 questions first:
>
> - In order to launch the index heal is the following command correct:
>   gluster volume heal myvolume

Yes.

> - If I run a "volume start force" will it have any short disruptions on my
>   clients which mount the volume through FUSE? If yes, how long? This is a
>   production system, that's why I am asking.

No. You can actually create a test volume on your personal Linux box to try these kinds of things without needing multiple machines. This is how we develop and test our patches :)
`gluster volume create testvol replica 3 /home/mabi/bricks/brick{1..3} force` and so on.

HTH,
Ravi
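As a side note, here is a sketch of the full single-box test cycle behind that one-liner (hostname and paths are placeholders; `force` is needed because all three bricks sit on the same node):

$ mkdir -p /home/mabi/bricks/brick{1..3}
$ gluster volume create testvol replica 3 $(hostname):/home/mabi/bricks/brick{1..3} force
$ gluster volume start testvol
$ mkdir -p /mnt/testvol && mount -t glusterfs $(hostname):/testvol /mnt/testvol
$ gluster volume heal testvol    # now experiment freely, e.g. with "volume start force"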
-------- Original Message --------
Subject: Re: [Gluster-users] self-heal not working
Local Time: August 22, 2017 6:26 AM
UTC Time: August 22, 2017 4:26 AM
From: ravishankar@xxxxxxxxxx
To: Gluster Users <gluster-users@xxxxxxxxxxx>

Explore the following:

- Launch index heal and look at the glustershd logs of all bricks for possible errors.
- See if the glustershd in each node is connected to all bricks.
- If not, try to restart shd with `volume start force`.
- Launch index heal again and try.
- Try debugging the shd log by setting client-log-level to DEBUG temporarily (a rough command mapping for these steps follows below).
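A rough mapping of those steps onto commands, assuming the volume is called myvolume and logs live in the stock location (adjust both to your setup):

$ gluster volume heal myvolume                                    # launch index heal
$ grep -iE 'error|fail' /var/log/glusterfs/glustershd.log         # on each node: scan the shd log
$ gluster volume status myvolume                                  # Self-heal Daemon should show Online "Y" everywhere
$ gluster volume start myvolume force                             # restart shd if one is not connected
$ gluster volume set myvolume diagnostics.client-log-level DEBUG  # temporary verbose logging
$ gluster volume reset myvolume diagnostics.client-log-level      # back to the default afterwards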
On 08/22/2017 03:19 AM, mabi wrote:

Sure, it doesn't look like a split brain based on the output:

Brick node1.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

Brick node2.domain.tld:/data/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
Status: Connected
Number of entries in split-brain: 0

-------- Original Message --------
Subject: Re: [Gluster-users] self-heal not working
Local Time: August 21, 2017 11:35 PM
UTC Time: August 21, 2017 9:35 PM
From: bturner@xxxxxxxxxx
To: mabi <mabi@xxxxxxxxxxxxx>
Cc: Gluster Users <gluster-users@xxxxxxxxxxx>

Can you also provide:

gluster v heal <my vol> info split-brain

If it is split brain just delete the incorrect file from the brick and run heal again. I haven't tried this with arbiter but I assume the process is the same.

-b

----- Original Message -----
> From: "mabi" <mabi@xxxxxxxxxxxxx>
> To: "Ben Turner" <bturner@xxxxxxxxxx>
> Cc: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> Sent: Monday, August 21, 2017 4:55:59 PM
> Subject: Re: [Gluster-users] self-heal not working
>
> Hi Ben,
>
> So it is really a 0 kBytes file everywhere (all nodes including the arbiter
> and from the client).
> Here below you will find the output you requested. Hopefully that will help
> to find out why this specific file is not healing... Let me know if you need
> any more information. Btw node3 is my arbiter node.
>
> NODE1:
>
> STAT:
> File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
> Size: 0  Blocks: 38  IO Block: 131072  regular empty file
> Device: 24h/36d  Inode: 10033884  Links: 2
> Access: (0644/-rw-r--r--)  Uid: (33/www-data)  Gid: (33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.407404779 +0200
> Change: 2017-08-14 17:11:46.407404779 +0200
> Birth: -
>
> GETFATTR:
> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
> trusted.bit-rot.version=0sAgAAAAAAAABZhuknAAlJAg==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOyo=
>
> NODE2:
>
> STAT:
> File: ‘/data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png’
> Size: 0  Blocks: 38  IO Block: 131072  regular empty file
> Device: 26h/38d  Inode: 10031330  Links: 2
> Access: (0644/-rw-r--r--)  Uid: (33/www-data)  Gid: (33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.403704181 +0200
> Change: 2017-08-14 17:11:46.403704181 +0200
> Birth: -
>
> GETFATTR:
> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
> trusted.bit-rot.version=0sAgAAAAAAAABZhu6wAA8Hpw==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOVE=
>
> NODE3:
>
> STAT:
> File: /srv/glusterfs/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> Size: 0  Blocks: 0  IO Block: 4096  regular empty file
> Device: ca11h/51729d  Inode: 405208959  Links: 2
> Access: (0644/-rw-r--r--)  Uid: (33/www-data)  Gid: (33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:04:55.530681000 +0200
> Change: 2017-08-14 17:11:46.604380051 +0200
> Birth: -
>
> GETFATTR:
> trusted.afr.dirty=0sAAAAAQAAAAAAAAAA
> trusted.bit-rot.version=0sAgAAAAAAAABZe6ejAAKPAg==
> trusted.gfid=0sGYXiM9XuTj6lGs8LX58q6g==
> trusted.glusterfs.d99af2fa-439b-4a21-bf3a-38f3849f87ec.xtime=0sWZG9sgAGOc4=
>
> CLIENT GLUSTER MOUNT:
>
> STAT:
> File: "/mnt/myvolume/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png"
> Size: 0  Blocks: 0  IO Block: 131072  regular empty file
> Device: 1eh/30d  Inode: 11897049013408443114  Links: 1
> Access: (0644/-rw-r--r--)  Uid: (33/www-data)  Gid: (33/www-data)
> Access: 2017-08-14 17:04:55.530681000 +0200
> Modify: 2017-08-14 17:11:46.407404779 +0200
> Change: 2017-08-14 17:11:46.407404779 +0200
> Birth: -
>
> > -------- Original Message --------
> > Subject: Re: [Gluster-users] self-heal not working
> > Local Time: August 21, 2017 9:34 PM
> > UTC Time: August 21, 2017 7:34 PM
> > From: bturner@xxxxxxxxxx
> > To: mabi <mabi@xxxxxxxxxxxxx>
> > Cc: Gluster Users <gluster-users@xxxxxxxxxxx>
> >
> > ----- Original Message -----
> >> From: "mabi" <mabi@xxxxxxxxxxxxx>
> >> To: "Gluster Users" <gluster-users@xxxxxxxxxxx>
> >> Sent: Monday, August 21, 2017 9:28:24 AM
> >> Subject: [Gluster-users] self-heal not working
> >>
> >> Hi,
> >>
> >> I have a replica 2 with arbiter GlusterFS 3.8.11 cluster and there is
> >> currently one file listed to be healed as you can see below, but it never
> >> gets healed by the self-heal daemon:
> >>
> >> Brick node1.domain.tld:/data/myvolume/brick
> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> >> Status: Connected
> >> Number of entries: 1
> >>
> >> Brick node2.domain.tld:/data/myvolume/brick
> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> >> Status: Connected
> >> Number of entries: 1
> >>
> >> Brick node3.domain.tld:/srv/glusterfs/myvolume/brick
> >> /data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
> >> Status: Connected
> >> Number of entries: 1
> >>
> >> As once recommended on this mailing list I have mounted that glusterfs
> >> volume temporarily through fuse/glusterfs and ran a "stat" on the file
> >> listed above, but nothing happened.
> >>
> >> The file itself is available on all 3 nodes/bricks but on the last node
> >> it has a different date. By the way this file is 0 kBytes big. Is that
> >> maybe the reason why the self-heal does not work?
> >
> > Is the file actually 0 bytes or is it just 0 bytes on the arbiter (0 bytes
> > are expected on the arbiter, it just stores metadata)? Can you send us the
> > output from stat on all 3 nodes:
> >
> > $ stat <file on back end brick>
> > $ getfattr -d -m - <file on back end brick>
> > $ stat <file from gluster mount>
> >
> > Let's see what things look like on the back end, it should tell us why
> > healing is failing.
> >
> > -b
> >
> >> And how can I now make this file heal?
> >>
> >> Thanks,
> >> Mabi
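One detail worth pulling out of the getfattr output above: trusted.afr.dirty carries the identical value 0sAAAAAQAAAAAAAAAA on all three bricks. Dumped as hex it becomes readable; as I understand the AFR changelog layout (an assumption worth double-checking), the twelve bytes are three big-endian 32-bit counters for pending data, metadata and entry operations:

$ getfattr -n trusted.afr.dirty -e hex \
    /data/myvolume/brick/data/appdata_ocpom4nckwru/preview/1344699/64-64-crop.png
# trusted.afr.dirty=0x000000010000000000000000
#                     data=1, metadata=0, entry=0

If that reading is right, every brick has one pending data operation flagged on itself while no trusted.afr.<volname>-client-N xattr blames any other brick, leaving the self-heal daemon with no way to pick a source; that would fit the question about the 3.8.15 fix at the top of this thread.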
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users