Hey gluster-users,

I just stumbled on a problem in our current test setup of gluster 3.3.2.

This is a simple replicated setup with 2 bricks (on XFS) in 1 volume, running glusterfs version 3.3.2qa3 on Ubuntu lucid. The client mounting this volume on /mnt/gfs sits on another machine and is using fuse (Version: 2.8.1-1.1ubuntu3.1).

On the glusterfs fuse client, the mount log shows:

[2013-06-02 21:23:26.677069] W [afr-common.c:1196:afr_detect_self_heal_by_iatt] 0-test-fs-cluster-1-replicate-0: /home/filesshared/README.txt.lock: gfid different on subvolume
[2013-06-02 21:23:26.677069] I [afr-self-heal-common.c:1970:afr_sh_post_nb_entrylk_gfid_sh_cbk] 0-test-fs-cluster-1-replicate-0: Non blocking entrylks failed.
[2013-06-02 21:23:26.697068] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-test-fs-cluster-1-client-0: remote operation failed: File exists. Path: /home/filesshared/README.txt.lock (00000000-0000-0000-0000-000000000000)
[2013-06-02 21:23:26.697068] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-test-fs-cluster-1-client-1: remote operation failed: File exists. Path: /home/filesshared/README.txt.lock (00000000-0000-0000-0000-000000000000)
[2013-06-02 21:23:26.697068] W [inode.c:914:inode_lookup] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/debug/io-stats.so(io_stats_lookup_cbk+0xff) [0x7fb16c310d8f] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/mount/fuse.so(+0xf248) [0x7fb16fa95248] (-->/usr/lib/glusterfs/3.3.2qa3/xlator/mount/fuse.so(+0xf0b1) [0x7fb16fa950b1]))) 0-fuse: inode not found

What the application side was doing when this happened:

1. Created /home/filesshared
2. Created /mnt/gfs/home/filesshared
3. Deleted /home/filesshared and replaced it with a symlink from /home/filesshared to /mnt/gfs/home/filesshared
4. Tried to write some files

Here's the log for that:

2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: deploying filesshared.prod
2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: creating directory: dir=/home/filesshared, user=0, group=filesshared, mode=0550
2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: creating directory: dir=/mnt/gfs/home/filesshared, user=filesshared, group=filesshared, mode=0700
2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: created /home/filesshared -> /mnt/gfs/home/filesshared
2013-06-02T21:23:26+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning: stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
2013-06-02T21:23:27+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning: stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
2013-06-02T21:23:27+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning: stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701
2013-06-02T21:23:28+00:00 daemon.notice web-14 f-c-w[4842]: PHP Warning: stat(): stat failed for /home/filesshared/README.txt.lock in /usr/ah/lib/ah-lib.php on line 701

What this resulted in:

The mount point became completely unresponsive. In PHP, file_exists('/mnt/gfs') returns false and stat() calls fail. In Ruby, File.directory?('/mnt/gfs') returns false.

This can be resolved by running "umount /mnt/gfs" and then remounting the share from fstab ("mount /mnt/gfs").

I could not find any relevant log entries on the bricks themselves. Sadly, I also wasn't able to come up with a test case to reproduce it.

It seems somewhat similar to http://gluster.org/pipermail/gluster-users/2013-March/035662.html

I initially thought that this could have been fixed in http://review.gluster.org/#/c/4689/ , but the qa branch we run already has this fix backported.

Any idea what could cause this behaviour?

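For anyone trying to reproduce this, the four application steps and the responsiveness check can be sketched roughly as below. This is a minimal sketch in Python, not the actual deploy code (which is PHP and not shown here). The names replay_deploy and mount_is_responsive are hypothetical, and the lock-file handling is an assumption based only on the README.txt.lock path in the logs. Both paths are parameters so the sequence can be run against a scratch directory first.

```python
import os
import tempfile


def replay_deploy(local_dir, shared_dir):
    """Replay the four deploy steps from the post (hypothetical helper)."""
    # 1. Create the local directory (mode 0550, as in the deploy log).
    os.makedirs(local_dir, mode=0o550, exist_ok=True)
    # 2. Create the directory on the shared mount (mode 0700).
    os.makedirs(shared_dir, mode=0o700, exist_ok=True)
    # 3. Delete the local directory and replace it with a symlink.
    os.rmdir(local_dir)
    os.symlink(shared_dir, local_dir)
    # 4. Write a file through the symlink. Guarding the write with an
    #    exclusively created .lock file is an ASSUMPTION about what the
    #    application does; the logs only show the .lock path.
    target = os.path.join(local_dir, "README.txt")
    lock = target + ".lock"
    fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        with open(target, "w") as fh:
            fh.write("test\n")
    finally:
        os.close(fd)
        os.unlink(lock)
    return target


def mount_is_responsive(path):
    """Return True if path can be stat()ed and is a directory."""
    try:
        os.stat(path)
    except OSError:
        return False
    return os.path.isdir(path)


if __name__ == "__main__":
    # Scratch directory standing in for / so nothing real is touched.
    scratch = tempfile.mkdtemp()
    local = os.path.join(scratch, "home", "filesshared")
    shared = os.path.join(scratch, "mnt", "gfs", "home", "filesshared")
    print(replay_deploy(local, shared))
    print(mount_is_responsive(shared))
```

Pointing shared_dir at a directory under the real gluster mount and calling replay_deploy in a loop from several clients might be one way to chase the race, though there is no evidence here that this alone triggers the hang.
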
Cheers,
Marc