Thanks, guys, for answering.

GlusterFS 1.3.10 is used on both client and server. I could not get the patched FUSE to compile on Hardy (error below), so until that works I just took a binary .deb package, even though 1.3.12 was available.

FUSE error:
/root/dload/fuse-2.7.2glfs9/kernel/dir.c:1027: error: 'struct iattr' has no member named 'ia_file'

Linux mapreduce8 2.6.24-19-server #1 SMP Fri Jul 11 21:50:43 UTC 2008 x86_64 GNU/Linux
libfuse2 2.7.2-1ubuntu2

On the client which drops the mount point:
Linux tailsweep2 2.6.17-12-server #2 SMP Thu Jan 31 22:15:27 UTC 2008 i686 GNU/Linux
libfuse2 2.7.0-1ubuntu5

This one compiles the patched fuse 2.7.0...

What we are trying to achieve with the config is something similar to two replicas of each file, spread with afr over three nodes and aggregated with unify.

Config files:

Server:

volume home
  type storage/posix
  option directory /srv/export/home
end-volume

volume server
  type protocol/server
  subvolumes home
  option transport-type tcp/server # For TCP/IP transport
  option auth.ip.home.allow *
end-volume

Client:

volume v1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.30
  option remote-subvolume home
end-volume

volume v2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.31
  option remote-subvolume home
end-volume

volume v3
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.32
  option remote-subvolume home
end-volume

volume afr-1
  type cluster/afr
  subvolumes v1 v2
end-volume

volume afr-2
  type cluster/afr
  subvolumes v2 v3
end-volume

volume afr-3
  type cluster/afr
  subvolumes v3 v1
end-volume

volume ns1
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.30
  option remote-subvolume home-namespace
end-volume

volume ns2
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.31
  option remote-subvolume home-namespace
end-volume

volume ns3
  type protocol/client
  option transport-type tcp/client
  option remote-host 192.168.10.32
  option remote-subvolume home-namespace
end-volume

volume namespace
  type cluster/afr
  subvolumes ns1 ns2 ns3
end-volume

volume v
  type cluster/unify
  option scheduler rr
  option namespace namespace
  subvolumes afr-1 afr-2 afr-3
end-volume

I really hope we have misconfigured something, since that is the easiest fix :)

Kindly

//Marcus

On Sat, Sep 13, 2008 at 12:50 AM, Amar S. Tumballi <amar@xxxxxxxxxxxxx> wrote:

> Also which version of GlusterFS?
>
> ber@xxxxxxxxxxxxx>
>
>> may be configuration issue... let's start with config, what does your
>> config look like on client and server?
>>
>> Marcus Herou wrote:
>>
>>> Lots of these on server
>>> 2008-09-12 20:48:14 E [protocol.c:271:gf_block_unserialize_transport] server: EOF from peer (192.168.10.4:1007)
>>> ...
>>> 2008-09-12 20:50:12 E [server-protocol.c:4153:server_closedir] server: not getting enough data, returning EINVAL
>>> ...
>>> 2008-09-12 20:50:12 E [server-protocol.c:4148:server_closedir] server: unresolved fd 6
>>> ...
>>> 2008-09-12 20:51:47 E [protocol.c:271:gf_block_unserialize_transport] server: EOF from peer (192.168.10.10:1015)
>>>
>>> ...
>>>
>>> And lots of these on client
>>>
>>> 2008-09-12 19:54:45 E [afr.c:2201:afr_open] home-namespace: self heal failed, returning EIO
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3954: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3956: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3958: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3987: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3989: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3991: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:45 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3993: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 C [client-protocol.c:212:call_bail] home3: bailing transport
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3970: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3971: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3972: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 3974: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 4001: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 4002: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:54:54 E [client-protocol.c:4827:client_protocol_cleanup] home3: forced unwinding frame type(2) op(5) reply=@0x809abb0
>>> 2008-09-12 19:54:54 E [client-protocol.c:4239:client_lock_cbk] home3: no proper reply from server, returning ENOTCONN
>>> 2008-09-12 19:54:54 E [afr.c:1933:afr_selfheal_lock_cbk] home-afr-3: (path=/rsyncer/.ssh/authorized_keys2 child=home3) op_ret=-1 op_errno=107
>>> 2008-09-12 19:54:54 E [afr.c:2201:afr_open] home-afr-3: self heal failed, returning EIO
>>> 2008-09-12 19:54:54 E [fuse-bridge.c:715:fuse_fd_cbk] glusterfs-fuse: 4004: (12) /rsyncer/.ssh/authorized_keys2 => -1 (5)
>>> 2008-09-12 19:55:01 E [unify.c:335:unify_lookup] home: returning ESTALE for /rsyncer/.ssh/authorized_keys2: file count is 4
>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home: /rsyncer/.ssh/authorized_keys2: found on home-namespace
>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home: /rsyncer/.ssh/authorized_keys2: found on home-afr-2
>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home: /rsyncer/.ssh/authorized_keys2: found on home-afr-1
>>> 2008-09-12 19:55:01 E [unify.c:339:unify_lookup] home: /rsyncer/.ssh/authorized_keys2: found on home-afr-3
>>>
>>> Both server and client are spitting out tons of these. Thought "E" was
>>> Error level, seems like DEBUG?
>>>
>>> Kindly
>>>
>>> //Marcus
>>>
>>> On Fri, Sep 12, 2008 at 8:01 PM, Brian Taber <btaber@xxxxxxxxxxxxx> wrote:
>>>
>>> What do you see in your server and client logs for gluster?
>>>
>>> -------------------------
>>> Brian Taber
>>> Owner/IT Specialist
>>> Diverse Computer Group
>>> Office: 774-206-5592
>>> Cell: 508-496-9221
>>> btaber@xxxxxxxxxxxxx
>>>
>>> Marcus Herou wrote:
>>> > Hi.
>>> >
>>> > We have just recently installed a 3 node cluster with 16 SATA disks each.
>>> >
>>> > We are using Hardy and the glusterfs-1.3.10 Ubuntu package on both
>>> > client(s) and server.
>>> >
>>> > We have only created one export (/home) so far, since we want to test it
>>> > for a while before putting it into a live high performance environment.
>>> >
>>> > The problem is currently that the client loses /home once a day or so.
>>> > This is really bad, since this is a machine which all others connect to
>>> > with ssh keys, thus making them unable to log in.
>>> >
>>> > Anyone seen something similar?
>>> >
>>> > Kindly
>>> >
>>> > //Marcus
>>> > _______________________________________________
>>> > Gluster-devel mailing list
>>> > Gluster-devel@xxxxxxxxxx
>>> > http://lists.nongnu.org/mailman/listinfo/gluster-devel
>>>
>>> --
>>> Marcus Herou CTO and co-founder Tailsweep AB
>>> +46702561312
>>> marcus.herou@xxxxxxxxxxxxx
>>> http://www.tailsweep.com/
>>> http://blogg.tailsweep.com/
>>
>
> --
> Amar Tumballi
> Gluster/GlusterFS Hacker
> [bulde on #gluster/irc.gnu.org]
> http://www.zresearch.com - Commoditizing Super Storage!

--
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@xxxxxxxxxxxxx
http://www.tailsweep.com/
http://blogg.tailsweep.com/
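
P.S. One thing that stands out when rereading the config files: ns1, ns2 and ns3 on the client ask for a remote-subvolume named home-namespace, but the server file as pasted only defines and exports home. If that is not just something lost in the paste, the server side would presumably need something like the sketch below. This is an assumption, not a tested config: the directory /srv/export/home-namespace is a made-up path, and the option names simply mirror the ones already used in the server file.

    # Sketch only, untested. /srv/export/home-namespace is a hypothetical path.
    volume home
      type storage/posix
      option directory /srv/export/home
    end-volume

    # Separate export for the unify namespace that ns1..ns3 refer to
    volume home-namespace
      type storage/posix
      option directory /srv/export/home-namespace
    end-volume

    volume server
      type protocol/server
      option transport-type tcp/server
      subvolumes home home-namespace
      option auth.ip.home.allow *
      option auth.ip.home-namespace.allow *
    end-volume

The namespace gets its own directory here because, as far as I understand unify, it keeps its own view of the directory tree there and that should not be mixed with the data export.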