Hi Peter,
Sorry for the delay in replying to this mail. I am able to reproduce the bug consistently. Disabling stat-prefetch reduced how often the errors occur, but it hasn't eliminated the issue.
The strace output was interesting. The failure always seems to come from a uid mismatch:
stat("/mnt/fuse1/test-target/test1409848960.3", {st_dev=makedev(0, 41), st_ino=12165775161408537538, st_mode=S_IFDIR|0550, st_nlink=2, *st_uid=0*, st_gid=9999, st_blksize=131072, st_blocks=1, st_size=6, st_atime=2014/09/04-22:12:40, st_mtime=2014/09/04-22:12:40, st_ctime=2014/09/04-22:12:40}) = 0
The uid comes back as 0 while the gid is 9999. If we do a stat after the run is over, it shows the correct values. I have yet to isolate the problem; I will keep you updated once I find something.
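To pick such mismatches out of a longer strace log, something like the following might help. It is only a minimal sketch: the function name and the expected uid/gid arguments are hypothetical, and it assumes stat() lines formatted like the one above (the optional `*` in the pattern only accounts for the emphasis markers added in this mail).

```python
import re

# Matches the path, st_uid and st_gid fields of a strace-formatted stat() line.
# The optional '*' tolerates the emphasis markers used in the mail.
STAT_RE = re.compile(
    r'stat\("(?P<path>[^"]+)",.*?\*?st_uid=(?P<uid>\d+)\*?,\s*\*?st_gid=(?P<gid>\d+)\*?'
)

def find_owner_mismatches(lines, expected_uid, expected_gid):
    """Return (path, uid, gid) for every stat() line whose owner differs."""
    mismatches = []
    for line in lines:
        m = STAT_RE.search(line)
        if m is None:
            continue  # not a stat() line in the assumed format
        uid, gid = int(m.group("uid")), int(m.group("gid"))
        if (uid, gid) != (expected_uid, expected_gid):
            mismatches.append((m.group("path"), uid, gid))
    return mismatches
```

It could be fed the lines of a saved strace log together with whatever uid and gid the test script is supposed to set.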
Pranith
On 08/22/2014 11:15 PM, Peter Drake wrote:
I have a replicated Gluster setup, 2 servers (fs-1 and fs-2) x 1 brick. I have two clients (also on fs-1 and fs-2) which mount the Gluster volume at /mnt/gfs (/mnt/gfs type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)). These clients have scripts which perform various file operations. One operation they perform looks like this (note this is pseudocode, the actual script is PHP):
1. @mkdir(/mnt/gfs/somedir, 0550);
2. chown(1234, /mnt/gfs/somedir);
3. chgrp(1234, /mnt/gfs/somedir);
Note that line 1 may fail on either client because the directory may have been created on the other client. These errors are suppressed/ignored. When this operation is performed simultaneously on both clients, it usually succeeds in creating a directory with the expected permissions and ownership. Intermittently, however, we see that these directories are not owned by the expected user and group.
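For readers who want to try the sequence outside PHP, the three steps can be sketched in Python roughly as follows. This is an illustration under assumptions, not the actual script: the function name is hypothetical, only the "already exists" error is ignored (the PHP `@` suppresses every mkdir error), and the final stat is added just to show the ownership the client sees right after the calls.

```python
import os

def ensure_owned_dir(path, uid, gid, mode=0o550):
    """Create path if needed, set owner and group, return the (uid, gid) seen afterwards."""
    try:
        os.mkdir(path, mode)          # step 1
    except FileExistsError:
        # The other client may have created the directory first; ignore.
        pass
    os.chown(path, uid, -1)           # step 2: change owner only
    os.chown(path, -1, gid)           # step 3: change group only
    st = os.stat(path)
    return st.st_uid, st.st_gid
```

Calling `ensure_owned_dir("/mnt/gfs/somedir", 1234, 1234)` at the same time on both clients and comparing the returned pair with `(1234, 1234)` would be one way to spot the intermittent mismatch; changing ownership to another user normally requires root.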
I've created a PHP script which can be run on two clients simultaneously to reproduce the error: https://gist.github.com/pdrakeweb/ae046b4c70a42309be43
The only log entry I can find that appears to be related is from fs-1's mnt-gfs.log file:
[2014-08-22 12:27:57.661778] I [dht-layout.c:640:dht_layout_normalize] 0-test-fs-cluster-1-dht: found anomalies in /test-target/test1408710477.7. holes=1 overlaps=0
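If it is useful to correlate such entries with the directories that end up with the wrong owner, the log could be scanned with a small helper like the one below. This is a sketch that assumes every relevant entry has exactly the layout of the line above; the function name and the returned dictionary keys are hypothetical.

```python
import re

# Pattern derived only from the single log line quoted above.
ANOMALY_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+\[[^\]]*dht_layout_normalize\]\s+'
    r'(?P<xlator>\S+):\s+found anomalies in (?P<path>\S+)\.\s+'
    r'holes=(?P<holes>\d+)\s+overlaps=(?P<overlaps>\d+)'
)

def find_layout_anomalies(lines):
    """Return one dict per dht_layout_normalize anomaly entry found in lines."""
    found = []
    for line in lines:
        m = ANOMALY_RE.search(line)
        if m is None:
            continue
        found.append({
            "timestamp": m.group("ts"),
            "path": m.group("path"),
            "holes": int(m.group("holes")),
            "overlaps": int(m.group("overlaps")),
        })
    return found
```

The paths it returns can then be compared against the directories whose ownership turned out wrong.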
This occurs in both Gluster 3.4.1 and 3.5.2 (the only two versions I have tested for this). I am unable to reproduce the problem on a local (non-gluster) filesystem. I'd appreciate any insight people might have into what is going on here and whether this is a bug in Gluster.
--
Peter Drake | Cloud Software Engineer | Acquia
E: peter.drake@xxxxxxxxxx | Skype: pdrakeweb
Address: 25 Corporate Drive 4th Floor, Burlington, MA 01803
Acquia named One of America’s Most Promising Companies by Forbes
Drupal Sites: http://drupalshowcase.com
Twitter http://www.twitter.com/Acquia
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users