Self-heal never finishes

Ben <gravyfish@xxxxxxxxx> · Sun, 28 Feb 2021 21:41:43 -0500

I'm having a problem where once one of my volumes requires healing, it never finishes the process. I use a 3-node replica cluster (2 data nodes + arbiter) as oVirt storage for virtual machines. I'm using Gluster version 8.3.

When I patch my Gluster nodes, I try to keep the system online by rebooting them one at a time. However, I've found that once I reboot node 2 and it comes back up, self-heal begins on both node 1 and the arbiter and never finishes. I have let it run for weeks and still see entries in the output of gluster volume heal <volname> info. No heal entries are reported on the node that rebooted.
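
A minimal sketch of how the pending-heal count could be checked between reboots. It assumes the usual per-brick "Number of entries: N" lines in the output of gluster volume heal <volname> info. The function names count_pending_heals and wait_for_heal, and the 60-second polling interval, are hypothetical and not part of Gluster.

```shell
#!/bin/sh
# Sum the per-brick "Number of entries: N" lines read from stdin.
# Bricks that report a non-numeric count (for example "-" when a brick
# is not connected) are skipped rather than counted as zero pending.
count_pending_heals() {
    awk '/^Number of entries:/ && $NF ~ /^[0-9]+$/ { total += $NF }
         END { print total + 0 }'
}

# Hypothetical helper: block until the volume reports no pending heals.
# $1 is the volume name. Polls once a minute (an arbitrary choice).
wait_for_heal() {
    volname=$1
    while :; do
        pending=$(gluster volume heal "$volname" info | count_pending_heals)
        [ "$pending" -eq 0 ] && break
        echo "$volname: $pending entries still pending heal"
        sleep 60
    done
}
```

Something like wait_for_heal <volname> could be run before rebooting the next node. In the situation described here it would simply never return, since the count never drops to zero.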

I've set the volumes to the virt group (gluster volume set <volname> group virt) per the RHEV documentation, and the gluster nodes don't seem to be overly busy. I'm hoping someone can point me in the right direction -- since the volumes never heal, I'm basically running on one node. Let me know what additional info will be helpful for troubleshooting, and thank you in advance.