Gluster endless heal

Hi,

I have an issue with Gluster 3.8.14.
The cluster has 4 nodes with replica count 2. One of the nodes went offline for around 15 minutes. When it came back online, self-heal was triggered and has not stopped since: it has been running for 3 days now, maxing out brick utilization without actually healing anything.
The bricks are all SSDs, and the log on the source node is being spammed with the following messages:
 
[2018-01-17 18:37:11.815247] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-ovirt_imgs-replicate-0: Completed data selfheal on 450fb07a-e95d-48ef-a229-48917557c278. sources=[0]  sinks=1 
[2018-01-17 18:37:12.830887] I [MSGID: 108026] [afr-self-heal-metadata.c:51:__afr_selfheal_metadata_do] 0-ovirt_imgs-replicate-0: performing metadata selfheal on ce0f545d-635a-40c0-95eb-ccfc71971f78
[2018-01-17 18:37:12.845978] I [MSGID: 108026] [afr-self-heal-common.c:1254:afr_log_selfheal] 0-ovirt_imgs-replicate-0: Completed metadata selfheal on ce0f545d-635a-40c0-95eb-ccfc71971f78. sources=[0]  sinks=1
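
One way to check whether the same files are being healed over and over is to tally the "Completed ... selfheal" messages by GFID. The following is only a minimal sketch, assuming the log lines have exactly the format shown above; the function names count_completed_heals and repeated_heals are made up for illustration and are not part of Gluster.

```python
import re
from collections import Counter

# Matches the "Completed <kind> selfheal on <gfid>" messages shown above.
# Assumption: the GFID is always a 36-character lowercase UUID.
HEAL_RE = re.compile(
    r"Completed (?P<kind>\w+) selfheal on (?P<gfid>[0-9a-f-]{36})"
)


def count_completed_heals(lines):
    """Count completed self-heals per (kind, gfid) pair."""
    counts = Counter()
    for line in lines:
        match = HEAL_RE.search(line)
        if match:
            counts[(match.group("kind"), match.group("gfid"))] += 1
    return counts


def repeated_heals(counts, threshold=2):
    """Return only the entries healed at least `threshold` times."""
    return {key: n for key, n in counts.items() if n >= threshold}
```

Passing the lines of the self-heal log to count_completed_heals and then calling repeated_heals on the result would list any GFID that was reported as healed more than once, which might indicate a heal that keeps being retriggered rather than one that is making progress.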

---

I tried restarting glusterd and rebooting the node after about 24 hours of healing, but it did not help. Before the reboot several bricks were healing; after the reboot only 4 bricks are.

The volume is used as an oVirt storage domain, with sharding enabled.
There are no errors or warnings on either node, just info messages about AFR healing.

Any idea what's going on, or where I should start looking?

--

Respectfully
Mahdi A. Mahdi

