On 05/17/2011 08:08 AM, Martin Schenker wrote:
> This is an inherited system, I guess it was set up by hand. I guess I can
> switch off these options, but the glusterd service will have to be
> restarted, right?!?

Yes.

> I'm also getting current error messages like these on the peer pair 3&5:
>
> Pserver3
> [2011-05-17 10:06:28.540355] E [rpc-clnt.c:199:call_bail]
> 0-storage0-client-2: bailing out frame type(GlusterFS 3.1) op(FINODELK(30))
> xid = 0x805809x sent = 2011-05-17 09:36:18.393519. timeout = 1800

Hmmm ... Looks like others have seen this before. The error message
suggests some sort of protocol error. It's in a code path in
rpc/rpc-lib/src/rpc-clnt.c, and the function is named "call_bail". This
code looks like it is part of a timeout callback (I am guessing it fires
when a response doesn't arrive in time, and the timer is hard coded to
10 seconds). There is a note there with a TODO about making that
configurable. If the machine is under tremendous load, it is possible
that a response is delayed by more than 10 seconds, so that this portion
of the code falls through to the timeout rather than processing an rpc
call.

>
> Pserver5
> [2011-05-17 10:02:23.738887] E [dht-common.c:1873:dht_getxattr]
> 0-storage0-dht: layout is NULL
> [2011-05-17 10:02:23.738909] W [fuse-bridge.c:2499:fuse_xattr_cbk]
> 0-glusterfs-fuse: 489090: GETXATTR()
> /images/2078/ebb83b05-3a83-9d18-ad8f-8542864da
> 6ef/hdd-images/21351 => -1 (No such file or directory)
> [2011-05-17 10:02:23.738954] W [fuse-bridge.c:660:fuse_setattr_cbk]
> 0-glusterfs-fuse: 489091: SETATTR()
> /images/2078/ebb83b05-3a83-9d18-ad8f-8542864da
> 6ef/hdd-images/21351 => -1 (Invalid argument)
>
> Best, Martin
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org
> [mailto:gluster-users-bounces at gluster.org] On Behalf Of Joe Landman
> Sent: Tuesday, May 17, 2011 1:54 PM
> To: gluster-users at gluster.org
> Subject: Re: Client and server file "view", different
> results?! Client can't see the right file.
>
> On 05/17/2011 01:43 AM, Martin Schenker wrote:
>> Yes, it is!
>>
>> Here's the volfile:
>>
>> cat /mnt/gluster/brick0/config/vols/storage0/storage0-fuse.vol:
>>
>> volume storage0-client-0
>>     type protocol/client
>>     option remote-host de-dc1-c1-pserver3
>>     option remote-subvolume /mnt/gluster/brick0/storage
>>     option transport-type rdma
>>     option ping-timeout 5
>> end-volume
>
> Hmmm ... did you create these by hand or using the CLI?
>
> I noticed quick-read and stat-cache on. We recommend turning both of
> them off. We experienced many issues with them on (from gluster 3.x.y)
>

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
       http://scalableinformatics.com/sicluster
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
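
A side note on the call_bail line from Pserver3 quoted earlier in this thread: the line carries its own log timestamp, the time the request was sent, and a timeout value, so one can check how long the frame was outstanding before it was bailed out. The following is only a rough sketch. It assumes the timeout field is in seconds, the regular expression and the helper name bail_out_delay are made up for illustration and written against this single line, and the wrapped quote has been re-joined onto one line.

```python
import re
from datetime import datetime

# Log line copied from the Pserver3 excerpt in this thread, re-joined
# from the wrapped quote onto a single line.
LOG_LINE = (
    "[2011-05-17 10:06:28.540355] E [rpc-clnt.c:199:call_bail] "
    "0-storage0-client-2: bailing out frame type(GlusterFS 3.1) "
    "op(FINODELK(30)) xid = 0x805809x sent = 2011-05-17 09:36:18.393519. "
    "timeout = 1800"
)

# Hypothetical pattern, written against this one line only; it is not a
# general parser for GlusterFS logs.
PATTERN = re.compile(
    r"^\[(?P<logged>[\d\- :.]+)\].*?"
    r"sent = (?P<sent>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\.\s*"
    r"timeout = (?P<timeout>\d+)"
)
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def bail_out_delay(line):
    """Return (seconds the frame was outstanding, timeout from the line)."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like the call_bail message")
    logged = datetime.strptime(match.group("logged"), TS_FORMAT)
    sent = datetime.strptime(match.group("sent"), TS_FORMAT)
    # Assumption: the timeout field is expressed in seconds.
    timeout = int(match.group("timeout"))
    return (logged - sent).total_seconds(), timeout


if __name__ == "__main__":
    outstanding, timeout = bail_out_delay(LOG_LINE)
    print(f"outstanding {outstanding:.1f}s, timeout {timeout}s")
    # prints: outstanding 1810.1s, timeout 1800s
```

For this line the request had been outstanding for about 1810 seconds when it was bailed out, roughly 10 seconds past the 1800 reported in the message itself.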