I am trying to review what may cause it; let me get back as soon as I have a fix.

Regards,
Amar

On Fri, Apr 25, 2008 at 6:31 PM, Christopher Hawkins <chawkins@xxxxxxxxxxxxxxxxxxxx> wrote:

> I am having the same issue. I'm working on a diskless
> node cluster and figured the issue was related to that
> since AFR seems to fail over nicely for everyone else...
> But it seems I am not alone, so what can I do to help troubleshoot?
>
> I have two servers exporting a brick each, and a client mounting
> them both with AFR and no unify. Transport timeout settings
> don't seem to make a difference - the client just hangs if I power off
> or just stop glusterfsd. There is nothing logged on the server side.
> I'll use a USB thumb drive for client-side logging since any logs in
> the ramdisk obviously disappear after the reboot which fixes the hang...
> If I get any insight from this I'll report it ASAP.
>
> Thanks,
> Chris
>
> > Real simple, two bricks on ext3 with user_xattr.
> > It is storage for a mailstore. The issue that I've been
> > battling is that when one of the machines crashes, the other
> > machine loses the mailstore: either the transport endpoint is
> > disconnected or the glusterfs filesystem is hung. You
> > cannot do anything with it. 'ls' it, 'df' it, ... nothing.
> > If I try to kill glusterfs/d it just gives me /glusterfsmount
> > busy. The only recovery at this point is to reboot the good
> > machine as well as the failed machine. Needing to do that
> > sort of defeats my purpose in creating this array. Is
> > there no way that glusterfs can recover from the crash such
> > that things are still good on the other bricks and mounts on
> > other machines?
> >
> > Thanks,
> > Gerry
> >
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> http://lists.nongnu.org/mailman/listinfo/gluster-devel
>

--
Amar Tumballi
Gluster/GlusterFS Hacker
[bulde on #gluster/irc.gnu.org]
http://www.zresearch.com - Commoditizing Super Storage!