GlusterFS performs self-healing when a replace-brick command is issued to move a brick from one node to another. However, it is also possible to replace a brick with one that was attached to the replica pair earlier and still holds old data. In that case, files that were modified while the new brick was in place and are then moved back to the old brick are not healed to the old brick, although newly added files are healed immediately.
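For reference, the steps below use the standard GlusterFS CLI; a generic sketch of the command forms involved (VOLNAME and the brick paths here are placeholders, not our actual values):

# move a brick of a replicated volume to another host/path;
# self-heal from the surviving replica is expected to follow
gluster volume replace-brick VOLNAME OLDHOST:/path/old-brick NEWHOST:/path/new-brick commit force

# check which files still have pending heals
gluster volume heal VOLNAME info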
Replica setup across three nodes, as below:

Node 1: 10.132.10.50 (blr-7)
Node 2: 10.132.10.110 (blr-20)
Node 3: 10.132.10.56 (blr-12)

Step 1: Issue replace-brick to the new node (blr-12)

volume replace-brick: success: replace-brick commit force operation successful
[root@v3blr7 ~]#

Step 2: Healing started

[root@v3blr12 blr-12]# gluster volume heal vs info
Brick 10.132.10.50:/home/sameer/setup_2/blr-7
/heal_issue.0.0 - Possibly undergoing heal
/heal_issue.1.0
Number of entries: 2

Brick 10.132.10.56:/home/sameer/setup_2/blr-12
Number of entries: 0

After healing is finished:
[root@v3blr12 blr-12]# ls -lh
total 2.1G
-rw-r--r--. 2 root root 1.0G Dec 13 23:26 heal_issue.0.0
-rw-r--r--. 2 root root 1.0G Dec 13 23:26 heal_issue.1.0
[root@v3blr12 blr-12]#

Step 3: Run FIO to update the existing files and create two more files.
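The exact fio job is not reproduced here; a job roughly like the following (parameters and mount path are illustrative guesses, only the heal_issue name prefix matches the files) would overwrite the two existing files and create two more, giving the layout seen on blr-7 below:

# illustrative fio job, not the exact one used:
# four sequential-write jobs named heal_issue, 2 GiB each,
# run against the glusterfs mount point (path is a placeholder)
fio --directory=/mnt/vs \
    --name=heal_issue \
    --rw=write \
    --bs=1M \
    --size=2g \
    --numjobs=4 \
    --end_fsync=1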
[root@v3blr7 blr-7]# ls -lh
total 8.1G
-rw-r--r--. 2 root root 2.0G Dec 13 23:33 heal_issue.0.0
-rw-r--r--. 2 root root 2.0G Dec 13 23:33 heal_issue.1.0
-rw-r--r--. 2 root root 2.0G Dec 13 23:33 heal_issue.2.0
-rw-r--r--. 2 root root 2.0G Dec 13 23:33 heal_issue.3.0
[root@v3blr7 blr-7]#

Step 4: Issue replace-brick back to the old brick (from blr-12 to blr-20)

[root@v3blr7 ~]# gluster volume replace-brick vs 10.132.10.56:/home/sameer/setup_2/blr-12 10.132.10.110:/home/sameer/setup_2/blr-20 commit force
volume replace-brick: success: replace-brick commit force operation successful

Step 5: Healing reported as finished on the re-added node

[root@v3blr20 blr-20]# gluster v heal vs info
Brick 10.132.10.50:/home/sameer/setup_2/blr-7
Number of entries: 0

Brick 10.132.10.110:/home/sameer/setup_2/blr-20
Number of entries: 0

Step 6: File sizes on the re-added (old) brick

[root@v3blr20 blr-20]# ls -lh
total 6.1G
-rw-r--r--. 2 root root 1.0G Dec 14 10:00 heal_issue.0.0  <===== file size is 1 GB; did not heal
-rw-r--r--. 2 root root 1.0G Dec 14 10:00 heal_issue.1.0  <===== file size is 1 GB; did not heal
-rw-r--r--. 2 root root 2.0G Dec 14 10:03 heal_issue.2.0
-rw-r--r--. 2 root root 2.0G Dec 14 10:03 heal_issue.3.0
[root@v3blr20 blr-20]#

The old files that were modified on the other replica have not been synced here, and they do not show up as split-brain either.
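One way to dig further is to dump the AFR changelog xattrs on the good brick and to query split-brain explicitly; a sketch reusing the paths from above (the exact trusted.afr.* key names depend on the volume's client indices, so treat those as an assumption):

# on the good node (blr-7): dump the xattrs of one of the stale files;
# non-zero trusted.afr.<volname>-client-N values would indicate a pending heal
getfattr -d -m . -e hex /home/sameer/setup_2/blr-7/heal_issue.0.0

# ask gluster explicitly for split-brain entries
gluster volume heal vs info split-brain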
How will users be informed that modified files have not been synced? This looks like a real problem. Does GlusterFS not allow reusing (via replace-brick) an old brick that may still contain data? If not, the data has to be healed twice whenever a brick has a problem and is later fixed and added back, which does not seem like a good approach.
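If reusing a brick that still holds stale data really is unsupported, the workaround would presumably be to wipe the old brick before pointing replace-brick back at it, so that a full heal repopulates it from scratch; a rough sketch, assuming the blr-20 path above and that the stale copies can be discarded (not verified here):

# on blr-20, before re-adding the old brick:
# remove the stale data and gluster metadata so the brick looks new
rm -rf /home/sameer/setup_2/blr-20/.glusterfs
rm -rf /home/sameer/setup_2/blr-20/*
setfattr -x trusted.glusterfs.volume-id /home/sameer/setup_2/blr-20 2>/dev/null
setfattr -x trusted.gfid /home/sameer/setup_2/blr-20 2>/dev/null

# re-add the brick as in Step 4, then trigger a full self-heal
gluster volume heal vs full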
Has anyone else faced this issue? Or could this be a bug in GlusterFS itself?

Thanks and regards,
Samir Biswas
HGST