In addition, my self-heal split-brain info contains lots of repeated gfid strings. How should I deal with this situation? (A rough sketch of what I have in mind is at the end of this message, below the quoted thread.)

gluster> v heal staticvol info split-brain
Gathering Heal info on volume staticvol has been successful

Brick brick01:/exports/static
Number of entries: 108
at                    path on brick
-----------------------------------
2012-11-30 10:50:14 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 09:32:46 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 09:32:43 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 09:23:36 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 09:23:33 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 09:10:14 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 09:10:14 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 08:59:15 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
...

Brick brick02:/exports/static
Number of entries: 499
at                    path on brick
-----------------------------------
2012-11-30 11:54:47 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 10:39:55 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 10:39:54 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 10:29:55 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 10:29:54 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 10:19:55 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 10:19:54 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 10:09:56 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
2012-11-30 10:09:55 <gfid:f4a9368c-aa7b-4696-b17a-d09e3d26d5b9>
2012-11-30 09:59:55 <gfid:022fd9fc-d725-4066-acbd-1e2b224710b0>
...

On Fri, Nov 30, 2012 at 12:33 PM, ZHANG Cheng <czhang.oss at gmail.com> wrote:
> I have lots of lines like this in my log:
> [2012-11-30 12:27:22.203030] E
> [afr-self-heald.c:685:_link_inode_update_loc] 0-staticvol-replicate-0:
> inode link failed on the inode (00000000-0000-0000-0000-000000000000)
>
> I am running gluster 3.3.1.
>
> On Thu, Nov 29, 2012 at 6:58 PM, Jeff Darcy <jdarcy at redhat.com> wrote:
>> On 11/26/12 4:46 AM, ZHANG Cheng wrote:
>>> Early this morning our two-brick replicated cluster had an outage. The
>>> disk space on one of the brick servers (brick02) was used up. By the
>>> time we responded to the disk-full alert, the issue had already lasted
>>> a few hours. We reclaimed some disk space and rebooted the brick02
>>> server, expecting that once it came back it would self-heal.
>>>
>>> It did start self-healing, but after just a couple of minutes, access
>>> to the gluster filesystem froze. Tons of "nfs: server brick not
>>> responding, still trying" messages popped up in dmesg. The load average
>>> on the app server went up to around 200 from the usual 0.10. We had to
>>> shut down the brick02 server, or stop the gluster server process on it,
>>> to get the gluster cluster working again.
>>
>> Have you checked the glustershd logs (should be in /var/log/glusterfs)
>> on the bricks? If there's nothing useful there, a statedump would also
>> be useful. See the "gluster volume statedump" instructions in your
>> friendly local admin guide (section 10.4 for GlusterFS 3.3). Most
>> helpful of all would be a bug report with any of this information plus a
>> description of your configuration. You can either create a new one or
>> attach the info to an existing bug if one seems to fit. The following
>> seems like it might be related, even though it's for virtual machines.
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=881685
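
P.S. To make that split-brain list manageable, this is the rough approach I have in mind, assuming the standard .glusterfs layout on the brick; the grep pattern below is only illustrative, and it covers regular files (for a directory the .glusterfs entry is a symlink, so readlink would be needed instead of find):

  # Collapse the repeated heal-info entries down to the unique gfids:
  gluster volume heal staticvol info split-brain | grep -o '<gfid:[^>]*>' | sort -u

  # Map one gfid back to a real path on the brick; for regular files the
  # entry under .glusterfs is a hard link to the actual file:
  GFID=f4a9368c-aa7b-4696-b17a-d09e3d26d5b9
  find /exports/static -samefile "/exports/static/.glusterfs/f4/a9/$GFID" \
       -not -path '*/.glusterfs/*'

Once the two copies are identified, my understanding of the usual manual fix on 3.3 is to pick the bad copy on one brick, remove it along with its .glusterfs hard link, and let self-heal recreate it from the good brick.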
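
To follow up on Jeff's suggestion, this is what I plan to run next to gather more information (where the dump files land depends on the server.statedump-path setting, so the path below is only a guess):

  # Self-heal daemon log on each brick server:
  less /var/log/glusterfs/glustershd.log

  # Statedump of the brick processes for the volume:
  gluster volume statedump staticvol
  ls /tmp/*.dump*    # or wherever server.statedump-path points on this build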