Problems with running long jobs on a replicated volume.

peter.linder at fiberdirekt.se (Peter Linder) · Tue, 25 Oct 2011 14:06:26 +0200

I ran my find command for a few hours, and my mount point kept working, 
although I did get many "Stale NFS file handle" error messages. I do not 
know whether you have the same problem as I do, but it does at least 
sound similar.
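
For anyone who wants to repeat this kind of long walk and see which errors
show up and how often, here is a minimal sketch in Python. It is not part of
GlusterFS or of the bug report; the function name walk_and_count_errors and
the mount path in the usage note are hypothetical. It assumes a Linux client,
where a stale handle is reported as ESTALE and a dead mount as ENOTCONN.

```python
import errno
import os
from collections import Counter


def walk_and_count_errors(root):
    """Stat every entry under root and tally failures by errno name.

    Returns (visited, errors), where errors maps names such as
    "ESTALE" or "ENOTCONN" to the number of times they were seen.
    """
    errors = Counter()
    visited = 0

    def on_walk_error(exc):
        # os.walk calls this when listing a directory fails.
        errors[errno.errorcode.get(exc.errno, str(exc.errno))] += 1

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            try:
                os.lstat(os.path.join(dirpath, name))
                visited += 1
            except OSError as exc:
                errors[errno.errorcode.get(exc.errno, str(exc.errno))] += 1
                if exc.errno == errno.ENOTCONN:
                    # "Transport endpoint is not connected": stop early,
                    # the mount most likely needs to be remounted.
                    return visited, errors
    return visited, errors
```

Calling walk_and_count_errors("/mnt/gluster") (a placeholder path) and
printing the returned counter after each pass gives a rough picture of
whether ESTALE errors pile up before the mount drops with ENOTCONN.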

I believe work on investigating my bug report is ongoing, so we will 
see. Until then, mounting with the FUSE client should work. NFS has more 
aggressive client-side caching, though, which helps with many small 
files, so it would be nice to be able to switch :)

On 10/25/2011 2:02 PM, Tiago Carmona wrote:
> I'm still having this problem. Does anyone have any thoughts about this error?
>
> Thanks,
> Tiago Carmona
>
>
> On Mon, Oct 17, 2011 at 1:20 PM, Tiago Carmona 
> <carmona.tiago at gmail.com> wrote:
>
>     Peter,
>
>     It seems, at least on my volume, that the find command doesn't
>     break it, as I've successfully run a self-heal on it.
>
>     But yeah, the problem seems to be related. Has anyone else had a
>     problem like this?
>
>     Thanks,
>     Tiago Carmona
>
>
>
>     On Mon, Oct 17, 2011 at 1:04 PM, Peter Linder
>     <peter.linder at fiberdirekt.se> wrote:
>
>         Perhaps it is similar to the problem I have, see:
>         http://bugs.gluster.com/show_bug.cgi?id=3712
>
>         I will try, perhaps tonight, to leave my find command running
>         and see whether that eventually breaks the mount point.
>
>         On 10/17/2011 4:11 PM, Tiago Carmona wrote:
>>         First of all, hi guys. My name is Tiago Carmona and I'm a
>>         DevOps-to-be at Unicamp in Brazil. I started using GlusterFS
>>         not long ago, but I'm loving it. I would also like to
>>         say thanks for all the help I've received on IRC.
>>
>>         I'm having a problem with running long jobs on a replicated
>>         volume. When I run a long job (like a chmod -R on my mount
>>         root), I get many "Stale NFS file handle" errors, and after some
>>         time my mount point goes down with a "Transport endpoint is not
>>         connected" error, so I need to umount and mount it again. I
>>         think that my error is similar to the one reported on this list at
>>         http://gluster.org/pipermail/gluster-users/2011-April/007192.html
>>         Does anyone know what may be causing this?
>>
>>         I'm running glusterfs on two Gentoo machines. Version info
>>         below:
>>
>>         glusterfs 3.2.3 built on Sep  4 2011 10:12:37
>>         Repository revision: git://git.gluster.com/glusterfs.git
>>         Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
>>
>>         Many thanks for everything,
>>         Tiago Carmona
>>
>>
>>         _______________________________________________
>>         Gluster-users mailing list
>>         Gluster-users at gluster.org
>>         http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>
