On Tue, 2015-08-11 at 11:14 +0530, Atin Mukherjee wrote:
>
> On 08/11/2015 10:44 AM, Kingsley wrote:
> > On Tue, 2015-08-11 at 07:48 +0530, Atin Mukherjee wrote:
> >
> >> -Atin
> >> Sent from one plus one
> >> On Aug 10, 2015 11:58 PM, "Kingsley" <gluster@xxxxxxxxxxxxxxxxxxx> wrote:
> >>>
> >>>
> >>> On Mon, 2015-08-10 at 22:53 +0530, Atin Mukherjee wrote:
> >>> [snip]
> >>>>
> >>>>> stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
> >>>>
> >>>>> brk(0) = 0x8db000
> >>>>> brk(0x8fc000) = 0x8fc000
> >>>>> mkdir("test", 0777
> >>>> Can you also collect the statedump of all the brick processes when the command is hung?
> >>>>
> >>>> + Ravi, could you check this?
> >>>
> >>>
> >>> I ran the command but I could not find where it put the output:
> >
> > [snip]
> >
> >>> Where should I find the output of the statedump command?
> >> It should be there in var/run/gluster folder
> >
> >
> > Thanks - replied offlist.
> Could you forward the statedump details to Ravi as well? (In cc)

Hi,

It appears that the volume may have repaired itself, which is a pleasing
outcome.

The "strace mkdir test" command in the broken directory finally came
back (the output previously ended at 'mkdir("test", 0777' [without the
single quotes]), but I've now seen that it has completed (see below).
I've no idea what time it actually finished, but I suspect it was hours
later; the output finally ended:

mkdir("test", 0777) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
+++ exited with 0 +++

I just tested "mkdir test2" in the same directory and it worked
perfectly. What's more, the directories both exist as they should:

[root@voicemail1b-1 14391.broken]# ls -ld test*
drwxr-xr-x. 2 root root 10 Aug 11 05:46 test
drwxr-xr-x. 2 root root 10 Aug 11 09:03 test2
[root@voicemail1b-1 14391.broken]#

Volume heal no longer claims anything is happening:

[root@gluster1b-1 14391]# gluster volume heal callrec info
Brick gluster1a-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0

Brick gluster1b-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0

Brick gluster2a-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0

Brick gluster2b-1.dns99.co.uk:/data/brick/callrec/
Number of entries: 0

[root@gluster1b-1 14391]#

Because of the job backlog from yesterday, the system was very disk I/O
bound, which was slowing everything right down. Obviously this wouldn't
have helped a self heal, though I've no idea how long that would
normally take.

Cheers,
Kingsley.

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users