Re: Entries in heal pending

Hi Strahil,

Thanks for the input. It worked flawlessly!
I'm pasting the process here (it may be useful for someone).

### entering the "corrupt" gluster brick folder
[root@node1 data]# cd 7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757/
[root@node1 d7c11f2e-58e4-4fe1-8236-6ded0f4dd757]# ls -la
total 1080
drwxr-xr-x. 2 vdsm kvm    8192 Jan  1  1970 .
drwxr-xr-x. 4 vdsm kvm      94 Dec 18 11:11 ..
-rw-rw----. 2 vdsm kvm   30720 Dec 17 13:17 e566f230-df72-4073-aecf-7e5a8d6b569b
-rw-rw----. 2 vdsm kvm 1048576 Dec  2 14:55 e566f230-df72-4073-aecf-7e5a8d6b569b.lease
-rw-r--r--. 2 vdsm kvm     429 Dec 17 13:17 e566f230-df72-4073-aecf-7e5a8d6b569b.meta
### making a backup of the files (I did this on all the nodes)
[root@node1 d7c11f2e-58e4-4fe1-8236-6ded0f4dd757]# mkdir -p /root/save/gluster_bricks/data/data/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757
[root@node1 d7c11f2e-58e4-4fe1-8236-6ded0f4dd757]# cp * /root/save/gluster_bricks/data/data/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757/
### rsyncing the files from the selected source (I chose node2 as the source, and did this on node3 as well)
[root@node1 d7c11f2e-58e4-4fe1-8236-6ded0f4dd757]# rsync -avh root@node2.ovirt.local:/gluster_bricks/data/data/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757/e566f230-df72-4073-aecf-7e5a8d6b569b* .
receiving incremental file list

sent 20 bytes  received 129 bytes  99.33 bytes/sec
total size is 1.08M  speedup is 7,246.48
### started the healing
[root@node1 d7c11f2e-58e4-4fe1-8236-6ded0f4dd757]# gluster volume heal data
Launching heal operation to perform index self heal on volume data has been successful
Use heal info commands to check status.
### checking result
[root@node1 d7c11f2e-58e4-4fe1-8236-6ded0f4dd757]# gluster volume heal data info
Brick node1storage.ovirt.local:/gluster_bricks/data/data
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/079904a4-71af-492c-bb2f-b45a918e8a2e/fce4b64d-2444-4f11-b226-db75bb2960c2.meta
Status: Connected
Number of entries: 1

Brick node2storage.ovirt.local:/gluster_bricks/data/data
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/079904a4-71af-492c-bb2f-b45a918e8a2e/fce4b64d-2444-4f11-b226-db75bb2960c2.meta
Status: Connected
Number of entries: 1

Brick node3storage.ovirt.local:/gluster_bricks/data/data
Status: Connected
Number of entries: 0

### As you can see, only one pending entry is left. I applied the same fix to the other files, and they also healed successfully.
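Repeating the fix for each remaining file can be semi-automated by pulling the pending paths out of `gluster volume heal <volume> info`. This is only a sketch: `pending_entries` is a hypothetical helper name, and the parsing is demonstrated on a captured sample of the output shown above rather than on a live cluster.

```shell
# Sketch: extract the brick-relative paths of pending entries from `heal info` output.
pending_entries() {
    grep '^/'    # entry paths are the lines that start with '/'
}

# Captured sample of the `gluster volume heal data info` output above:
sample='Brick node1storage.ovirt.local:/gluster_bricks/data/data
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/079904a4-71af-492c-bb2f-b45a918e8a2e/fce4b64d-2444-4f11-b226-db75bb2960c2.meta
Status: Connected
Number of entries: 1'

# On a live node this would be: gluster volume heal data info | pending_entries | sort -u
echo "$sample" | pending_entries
```

Each extracted path can then be backed up and rsynced from the chosen source brick exactly as in the steps above.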

Best Regards,
  Balazs Szilagyi

On 2019.12.18. 20:07, Strahil Nikolov wrote:
You are the second person (excluding me) who has observed this behaviour.
The easiest way to resolve this is to:
1. Check which file is newest (there is a timestamp inside the file) for:
/gluster_bricks/data/data/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757/e566f230-df72-4073-aecf-7e5a8d6b569b.meta
and for:
/gluster_bricks/data/data/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/079904a4-71af-492c-bb2f-b45a918e8a2e/fce4b64d-2444-4f11-b226-db75bb2960c2.meta

Let's assume node3storage.ovirt.local has the newest data.

2. Then back up the files locally (just in case you change your mind) and rsync them from node3storage.ovirt.local (or whichever node actually has the newest timestamp in the file) to the other bricks.

3. Run a gluster heal just to notify gluster that the issue is resolved.
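Step 1 above can be scripted. The snippet below is a minimal sketch: on a real cluster you would run `stat` over ssh against each brick host (as the commented line shows), while the runnable demo uses local temp files so the comparison itself is easy to try. `newest_of` is a hypothetical helper name, not a Gluster command.

```shell
# Sketch for step 1: pick the copy with the newest modification time.
# On the cluster, gather the times per brick host, e.g.:
#   ssh root@node2storage.ovirt.local stat -c '%y' /path/to/file.meta

newest_of() {
    # Print the path whose mtime (%Y = seconds since epoch) is largest.
    stat -c '%Y %n' "$@" | sort -nr | head -n 1 | cut -d' ' -f2-
}

# Local demo with two fake brick copies of a .meta file:
tmp=$(mktemp -d)
echo old > "$tmp/node1.meta"; touch -d '2019-12-02 14:55' "$tmp/node1.meta"
echo new > "$tmp/node2.meta"; touch -d '2019-12-17 13:17' "$tmp/node2.meta"
newest_of "$tmp/node1.meta" "$tmp/node2.meta"   # prints the node2 copy's path
rm -r "$tmp"
```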

In my case, one of the nodes had a newer version of the file (I am using a replica 2 arbiter 1 volume) and the gfid was different, which prevented Gluster from healing it.

Usually, oVirt just updates the timestamp in the meta files, so even an older version is not a problem.

P.S.: What version of Gluster are you using? I suppose v6.5 or v6.6, right?

Best Regards,
Strahil Nikolov




On Wednesday, 18 December 2019, 16:18:12 GMT+2, Szilágyi Balázs <szilagyi.balazs@xxxxxxxxxx> wrote:


Dear Gluster Users,

I'm a newbie to Gluster storage, and during stability testing I rebooted a
node and got some heal issues afterwards that I'm unable to fix.
The VMs are nevertheless running fine from the storage, and I have not
discovered any data corruption.
The system is oVirt version 4.3.7, with 3 nodes and a replica 3 volume.
Please let me know what to do with the pending heals that are unable to
finish.
Also let me know if any further details are needed.

Thanks,
  Balazs

[root@node2 ~]# gluster volume status data
Status of volume: data
Gluster process                                           TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------------------
Brick node1storage.ovirt.local:/gluster_bricks/data/data  49152     0          Y       4187
Brick node2storage.ovirt.local:/gluster_bricks/data/data  49154     0          Y       6163
Brick node3storage.ovirt.local:/gluster_bricks/data/data  49154     0          Y       19439
Self-heal Daemon on localhost                             N/A       N/A        Y       3136
Self-heal Daemon on node3storage.ovirt.local              N/A       N/A        Y       5876
Self-heal Daemon on node1storage.ovirt.local              N/A       N/A        Y       15479

Task Status of Volume data
------------------------------------------------------------------------------
There are no active volume tasks

[root@node2 ~]# gluster volume heal data info summary
Brick node1storage.ovirt.local:/gluster_bricks/data/data
Status: Connected
Total Number of entries: 2
Number of entries in heal pending: 2
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick node2storage.ovirt.local:/gluster_bricks/data/data
Status: Connected
Total Number of entries: 2
Number of entries in heal pending: 2
Number of entries in split-brain: 0
Number of entries possibly healing: 0

Brick node3storage.ovirt.local:/gluster_bricks/data/data
Status: Connected
Total Number of entries: 0
Number of entries in heal pending: 0
Number of entries in split-brain: 0
Number of entries possibly healing: 0

[root@node2 ~]# gluster volume heal data info
Brick node1storage.ovirt.local:/gluster_bricks/data/data
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757/e566f230-df72-4073-aecf-7e5a8d6b569b.meta
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/079904a4-71af-492c-bb2f-b45a918e8a2e/fce4b64d-2444-4f11-b226-db75bb2960c2.meta
Status: Connected
Number of entries: 2

Brick node2storage.ovirt.local:/gluster_bricks/data/data
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/d7c11f2e-58e4-4fe1-8236-6ded0f4dd757/e566f230-df72-4073-aecf-7e5a8d6b569b.meta
/7ac28c32-947b-4ad5-8d69-213a205f06e8/images/079904a4-71af-492c-bb2f-b45a918e8a2e/fce4b64d-2444-4f11-b226-db75bb2960c2.meta
Status: Connected
Number of entries: 2

Brick node3storage.ovirt.local:/gluster_bricks/data/data
Status: Connected
Number of entries: 0

________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/441850968

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
