Re: directory ownership bug in gluster 3.4 & 3.5


Hi Peter,
         Sorry for the delay in replying to this mail. I am able to reproduce the bug consistently. Disabling stat-prefetch reduced how often the errors occur, but it did not eliminate the issue.

Following the strace output was interesting. The failure always seems to occur because the uid does not match:
stat("/mnt/fuse1/test-target/test1409848960.3", {st_dev=makedev(0, 41), st_ino=12165775161408537538, st_mode=S_IFDIR|0550, st_nlink=2, *st_uid=0*, st_gid=9999, st_blksize=131072, st_blocks=1, st_size=6, st_atime=2014/09/04-22:12:40, st_mtime=2014/09/04-22:12:40, st_ctime=2014/09/04-22:12:40}) = 0

The uid comes back as 0 while the gid is 9999. If we run stat after the test run is over, it shows the correct ownership. I have yet to isolate the problem and will keep you updated once I find something.
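
For anyone who wants to check for the same symptom without reading strace output, here is a minimal sketch in Python. The helper name ownership_mismatch and its parameters are made up for illustration, and the expected ids must be supplied by the caller. It only compares what stat returns against the expected owner and group.

```python
import os


def ownership_mismatch(path, expected_uid, expected_gid):
    """Return the stat fields that differ from the expected ids.

    Hypothetical helper: maps field name -> (actual, expected).
    An empty dict means owner and group both match.
    """
    st = os.stat(path)
    mismatch = {}
    if st.st_uid != expected_uid:
        mismatch["st_uid"] = (st.st_uid, expected_uid)
    if st.st_gid != expected_gid:
        mismatch["st_gid"] = (st.st_gid, expected_gid)
    return mismatch
```

Calling this once right after the chown/chgrp and once more after the run has finished would show whether the wrong uid is only visible for a short time, as in the stat call above.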

Pranith

On 08/22/2014 11:15 PM, Peter Drake wrote:
I have a replicated Gluster setup, 2 servers (fs-1 and fs-2) x 1 brick.  I have two clients (also on fs-1 and fs-2) which mount the Gluster volume at /mnt/gfs (/mnt/gfs type fuse.glusterfs (rw,default_permissions,allow_other,max_read=131072)).  These clients have scripts which perform various file operations.  One operation they perform looks like this (note this is pseudocode, the actual script is PHP):

1. @mkdir('/mnt/gfs/somedir', 0550);
2. chown('/mnt/gfs/somedir', 1234);
3. chgrp('/mnt/gfs/somedir', 1234);

Note that line 1 may fail on either client because the directory may have been created on the other client.  These errors are suppressed/ignored.  When this operation is performed simultaneously on both clients, it usually succeeds in creating a directory with the expected permissions and ownership.  Intermittently however, we see that these directories are not owned by the expected user and group.
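
The three steps above can be sketched in Python roughly as follows. This is only an illustration of the sequence, not the actual PHP script. The function name ensure_owned_dir is hypothetical, and the sketch ignores only the "already exists" error, whereas the PHP @ operator suppresses every error from mkdir.

```python
import os
import tempfile


def ensure_owned_dir(path, uid, gid, mode=0o550):
    """Create path if needed, then set owner and group (steps 1-3)."""
    try:
        os.mkdir(path, mode)
    except FileExistsError:
        # Step 1 may fail because the other client created the directory first.
        pass
    os.chown(path, uid, -1)  # step 2: change the owner only
    os.chown(path, -1, gid)  # step 3: change the group only
    st = os.stat(path)
    return st.st_uid, st.st_gid


if __name__ == "__main__":
    # Demo on a local temporary directory with the current user's ids,
    # so it runs without root. The real case uses /mnt/gfs and id 1234.
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "somedir")
        print(ensure_owned_dir(target, os.geteuid(), os.getegid()))
```

On the Gluster mount, the returned pair is what intermittently differs from the requested ids when two clients run this at the same time.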

I've created a PHP script which can be run on two clients simultaneously to reproduce the error: https://gist.github.com/pdrakeweb/ae046b4c70a42309be43

The only log entry I can find that appears to be related is from fs-1's mnt-gfs.log file:

[2014-08-22 12:27:57.661778] I [dht-layout.c:640:dht_layout_normalize] 0-test-fs-cluster-1-dht: found anomalies in /test-target/test1408710477.7. holes=1 overlaps=0

This occurs in both Gluster 3.4.1 and 3.5.2 (the only two versions I have tested for this).  I am unable to reproduce the problem on a local (non-gluster) filesystem.  I'd appreciate any insight people might have into what is going on here and whether this is a bug in Gluster.

--
Peter Drake | Cloud Software Engineer |  Acquia

E: peter.drake@xxxxxxxxxx  |  Skype: pdrakeweb

W: http://www.acquia.com

Address: 25 Corporate Drive 4th Floor, Burlington, MA 01803

Acquia named One of America’s Most Promising Companies by Forbes

Drupal Sites: http://drupalshowcase.com








_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-users

