Re: Help with reconnecting a faulty brick

On 11/17/2017 03:41 PM, Daniel Berteaud wrote:
On Thursday, November 16, 2017 at 13:07 CET, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 11/16/2017 12:54 PM, Daniel Berteaud wrote:
Any way in this situation to check which file will be healed from
which brick before reconnecting? Using some getfattr tricks?
Yes, there are afr xattrs that determine the heal direction for each
file. The good copy will have non-zero trusted.afr* xattrs that blame
the bad one and heal will happen from good to bad.  If both bricks have
attrs blaming the other, then the file is in split-brain.
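
(For anyone following along, the usual way to dump those xattrs is getfattr run directly against the brick path, not the fuse mount. A minimal sketch, reusing the brick path from this thread as a placeholder:

getfattr -d -m . -e hex /mnt/bricks/vmstore/prod/bilbao_sys.qcow2

Running "gluster volume heal vmstore info" should also list the entries that still need healing, assuming the volume name here is vmstore as the client-0/client-1 xattr names suggest.)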
Thanks.

So, say I have a file with this on the correct node:
# file: mnt/bricks/vmstore/prod/bilbao_sys.qcow2
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.vmstore-client-0=0x00050f7e0000000200000000
trusted.afr.vmstore-client-1=0x000000000000000100000000
trusted.gfid=0xe86c24e5fc6b4fc6bf2b896f3cc8537d

And this on the bad one:

# file: mnt/bricks/vmstore/prod/bilbao_sys.qcow2
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.vmstore-client-0=0x000000000000000000000000
trusted.afr.vmstore-client-1=0x000000000000000000000000
trusted.gfid=0xe86c24e5fc6b4fc6bf2b896f3cc8537d

Can I guarantee Gluster will heal from the correct one to the bad one? And in case both have non-null afr xattrs, can I manually (using setfattr) reset the afr attribute to a null value before reconnecting the faulty brick, so that it heals from the correct one?
With the above xattrs, the correct node will be used as the source. Ideally, the good node should have trusted.afr.vmstore-client-1 as non-zero (in other words, it blames the second brick, i.e. client-1) and the bad node should have all zeroes.
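
(For reference, each trusted.afr value is three big-endian 4-byte counters, recording pending data, metadata and entry operations respectively, so the non-zero value on the correct node decodes as:

trusted.afr.vmstore-client-0 = 0x 00050f7e 00000002 00000000
                                  data     metadata entry

i.e. 0x50f7e pending data operations and 2 pending metadata operations blamed on client-0. If you do end up clearing a counter by hand as suggested above, the setfattr invocation would be something like the following untested sketch, run on the brick path:

setfattr -n trusted.afr.vmstore-client-0 -v 0x000000000000000000000000 /mnt/bricks/vmstore/prod/bilbao_sys.qcow2
)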
And for files which have been deleted/renamed/created on the correct node while the bad one was offline, how are those handled? For example, I have /mnt/bricks/vmstore/prod/contis_sys.qcow2 on both bricks, but on the correct one the file was deleted and recreated while the bad one was offline, so the two copies no longer have the same gfid. How does gluster handle this?

Sorry for all those questions, I'm just a bit nervous :-)

The parent directory will have afr xattrs indicating the good and bad bricks. All gfids not present on the good brick will be deleted from the bad one if present, and all gfids present on the good brick will be created on the bad one if missing. https://github.com/gluster/glusterfs-specs/blob/master/done/Features/afr-v1.md can give you an idea of how replication works in glusterfs-3.5 (which you are using) and earlier.
-Ravi
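
(If it helps to double check, comparing the gfid on each node would show the mismatch directly; same caveat as above, run against the brick path on each node:

getfattr -n trusted.gfid -e hex /mnt/bricks/vmstore/prod/contis_sys.qcow2

With differing gfids, the entry self-heal Ravi describes should treat them as two distinct files: the parent directory's afr xattrs blame the bad brick, so the stale entry there is removed and the good one recreated.)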


_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users



