On 21 December 2014 06:37:44 CET, tbenzvi@xxxxxxxxxxxxxxx wrote:
>Hi Joe,
>
>Thanks for the reply. That worked; I probably forgot to do this as root
>last time. Yet, the files still show up twice in a directory listing on
>the mounted volume. And it seems to be random whether reading the file
>will succeed or not. I've tried with several files and it sometimes
>works and sometimes fails; I assume this depends on whether it locates
>the actual file on the brick or the link file. Let me know if you have
>any idea what's going on.

Does the brick filesystem happen to be ext4? I have had a similar problem with 3.6.x and ext4 (the 64-bit offset problem).

>
>Output of the command:
>
>$ getfattr -m . -d -e hex /data/glusterfs/safari/brick01/brick/rsc/tsx/montreal_smaller/sm_asc/stack/slc/20130210.slc.ras
>getfattr: Removing leading '/' from absolute path names
># file: data/glusterfs/safari/brick01/brick/rsc/tsx/montreal_smaller/sm_asc/stack/slc/20130210.slc.ras
>system.posix_acl_access=0x0200000001000600ffffffff04000600ffffffff10000600ffffffff20000400ffffffff
>trusted.SGI_ACL_FILE=0x0000000400000001ffffffff0006000000000004ffffffff0006000000000010ffffffff0006000000000020ffffffff00040000
>trusted.gfid=0x52c2aed77d09412d8bfd7ca70e87b196
>trusted.glusterfs.dht.linkto=0x7361666172692d636c69656e742d3200
>
>
>Cheers,
>Tom
>
>--------- Original Message ---------
>Subject: Re: Hundreds of duplicate files
>From: "Joe Julian" <joe@xxxxxxxxxxxxxxxx>
>Date: 12/20/14 8:53 pm
>To: gluster-users@xxxxxxxxxxx
>
>Try 'getfattr -m . -d -e hex' (dot instead of dash) and, of course, do
>that as root.
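For reference, the trusted.glusterfs.dht.linkto value in the getfattr output above is just hex-encoded, NUL-terminated ASCII, so it can be decoded to see where the link file points. A minimal sketch in Python; the helper name decode_xattr_hex is made up for illustration and is not part of any Gluster tool:

```python
def decode_xattr_hex(value: str) -> str:
    """Decode a value printed by 'getfattr -e hex' (e.g. 0x7361...) to text."""
    # Drop the leading "0x" that getfattr prints in hex mode.
    hex_digits = value[2:] if value.lower().startswith("0x") else value
    raw = bytes.fromhex(hex_digits)
    # The linkto value carries a trailing NUL byte; strip it before decoding.
    return raw.rstrip(b"\x00").decode("ascii")


if __name__ == "__main__":
    # Value copied from the getfattr output quoted in this thread.
    print(decode_xattr_hex("0x7361666172692d636c69656e742d3200"))
    # prints: safari-client-2
```

So this file on brick01 is a link file pointing at the subvolume safari-client-2. If the client subvolumes are numbered from zero in the order shown by the volume info (an assumption worth checking against the volfile), that would be Brick3, i.e. brick02 on server2.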
>
>On 12/20/2014 06:02 PM, tbenzvi@xxxxxxxxxxxxxxx wrote:
>> Hi everyone,
>>
>> We have a distributed Gluster volume on five bricks over two servers
>> (first server running gluster 3.4.2, second server running gluster
>> 3.5.1, both running Fedora 20).
>>
>> Starting last week, doing a file listing on the mounted volume shows
>> many files with the same name appearing twice (and they are listed with
>> the same inode). Doing a search for these files, I have found 290,000
>> of them!!
>>
>> If I do a listing of these files on the bricks themselves, it looks
>> like most are link files (du will show the file on the first server as
>> 0 bytes, and the sticky bit set). The file is fine on the second
>> server. Unfortunately, running "getfattr -m - -e hex -d" on the file
>> shows NO gluster-related attributes and I believe this is why both
>> files appear in the listing. The files cannot be read by any programs
>> because they end up reading the link file. I assume the metadata became
>> corrupted. This is a production server so we really need to know:
>>
>> 1. How did this happen, and how can we prevent it going forward? There
>>    was a server crash a week ago and I believe that was the cause.
>> 2. How can we heal the Gluster volume/bricks and link files? If there
>>    is some straightforward way of restoring the link file pointer I can
>>    write a script to do it; obviously doing this manually will be
>>    impossible.
>>
>> Thanks very much for any and all help - much appreciated!
>>
>> Regards,
>> Tom
>>
>>
>> On Wed, Dec 17, 2014 at 4:07 AM, <tbenzvi@xxxxxxxxxxxxxxx> wrote:
>>> Hi everyone, we have noticed some extremely odd behaviour with our
>>> distributed Gluster volume where duplicate files (same name, same or
>>> different content) are being created and stored on multiple bricks.
>>> The only consistent clue is that one of the duplicate files has the
>>> sticky bit set. I am hoping someone will be able to shed some light on
>>> why this is happening and how we can restore the volume, as there
>>> appear to be hundreds of such files. I will try to provide as much
>>> pertinent information as I can.
>>>
>>> We have a 130TB Gluster volume consisting of two 20TB bricks on
>>> server1, and three 40TB bricks on server2 which were added at a later
>>> date (and rebalancing was done). The volume is mounted on server1, and
>>> accessed only through this server but by many users. Both servers went
>>> down due to power loss several days ago, after which this problem was
>>> first noticed. We ran a rebalance command on the volume; this has not
>>> fixed the problem.
>>>
>>>
>>> Gluster volume info:
>>> Volume Name: safari
>>> Type: Distribute
>>> Volume ID: d48d0e6b-4389-4c2c-8fd1-cd2854121eda
>>> Status: Started
>>> Number of Bricks: 5
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: server1:/data/glusterfs/safari/brick00/brick
>>> Brick2: server1:/data/glusterfs/safari/brick01/brick
>>> Brick3: server2:/data/glusterfs/safari/brick02/brick
>>> Brick4: server2:/data/glusterfs/safari/brick03/brick
>>> Brick5: server2:/data/glusterfs/safari/brick04/brick
>>>
>>>
>>> Size information:
>>> /dev/sdc          37T   16T   22T  42% /data/glusterfs/safari/brick02
>>> /dev/sdd          37T   16T   22T  42% /data/glusterfs/safari/brick03
>>> /dev/sde          37T   17T   21T  45% /data/glusterfs/safari/brick04
>>> /dev/md126        11T  7.7T  2.8T  74% /data/glusterfs/safari/brick00
>>> /dev/md124        11T  8.0T  2.5T  77% /data/glusterfs/safari/brick01
>>> server2:/safari  130T   63T   68T  48% /sar
>>>
>>>
>>> Example 1:
>>> -Two files with the same name exist in one directory
>>> -They have different contents and attributes
>>> -A file listing on the mounted volume shows the same inode
>>> -The newer file has sticky bit set
>>> -Neither file is corrupted, they can both be viewed by using the
>>>  absolute path (on the bricks)
>>>
>>> File listing on the mounted volume:
>>> 13036730497538635177 -rw-rw-r-T 1 jon users 924 Dec 15 10:42 RSLC_tab
>>> 13036730497538635177 -rw-rw-r-- 1 jon users 418 Mar 18  2013 RSLC_tab
>>>
>>> Listing of the files on the bricks:
>>> 8925798411  -rw-rw-r-T+ 2 jon  users 924 Dec 15 10:42 /data/glusterfs/safari/brick00/brick/complete/shm/rs2/ottawa/mf6_asc/stack_org/RSLC_tab
>>> 51541886672 -rw-rw-r--+ 2 1002 users 418 Mar 18  2013 /data/glusterfs/safari/brick02/brick/complete/shm/rs2/ottawa/mf6_asc/stack_org/RSLC_tab
>>>
>>>
>>> Example 2:
>>> -Two files with the same name exist in one directory
>>> -They have the same content and attributes
>>> -No sticky bit is set when looking at file listing on the mounted
>>>  volume
>>> -Sticky bit is set for one file when looking at file listing on the
>>>  bricks
>>> -Files are corrupted
>>>
>>> File listing on the mounted volume:
>>> 13012555852904096080 -rw-rw-r-- 1 tom users 2393848 Dec  8  2013 ifg_lr/20130226_20130813.diff.phi.ras
>>> 13012555852904096080 -rw-rw-r-- 1 tom users 2393848 Dec  8  2013 ifg_lr/20130226_20130813.diff.phi.ras
>>>
>>> Listing of the files on the bricks:
>>> 17058578    -rw-rw-r-T+ 2 tom  users 2393848 Dec 13 17:11 /data/glusterfs/safari/brick00/brick/rsc/rs2/calgary/u22_dsc/stack_org/ifg_lr/20130226_20130813.diff.phi.ras
>>> 57986922129 -rw-rw-r--+ 2 1010 users 2393848 Dec  8  2013 /data/glusterfs/safari/brick02/brick/rsc/rs2/calgary/u22_dsc/stack_org/ifg_lr/20130226_20130813.diff.phi.ras
>>>
>>>
>>> Additionally, only some files in this directory are duplicated. The
>>> duplicated files are corrupted (cannot be viewed as raster images,
>>> the original file type).
>>> The files which are not duplicated are not corrupted.
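Since the one consistent clue in both examples is the sticky bit on the brick copy (the -rw-rw-r-T entries), it may help to list every such file on a brick together with its size and whatever trusted.glusterfs.dht.linkto holds. Below is a rough Python sketch for that; the names sticky_files and read_linkto are invented here, it only reads and never modifies anything, and it needs to run as root on Linux to see trusted.* attributes:

```python
import os
import stat
from typing import Callable, Iterator, Optional, Tuple

LINKTO_XATTR = "trusted.glusterfs.dht.linkto"


def read_linkto(path: str) -> Optional[str]:
    """Return the decoded linkto xattr, or None if it is unset or unreadable."""
    try:
        raw = os.getxattr(path, LINKTO_XATTR)  # Linux only; trusted.* needs root
    except OSError:
        return None
    return raw.rstrip(b"\x00").decode("ascii", errors="replace")


def sticky_files(
    brick_root: str,
    get_linkto: Callable[[str], Optional[str]] = read_linkto,
) -> Iterator[Tuple[str, int, Optional[str]]]:
    """Yield (path, size, linkto) for regular files with the sticky bit set."""
    for dirpath, dirnames, filenames in os.walk(brick_root):
        # Do not descend into the brick's internal .glusterfs directory.
        dirnames[:] = [d for d in dirnames if d != ".glusterfs"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_ISVTX:
                yield path, st.st_size, get_linkto(path)
```

Calling sticky_files("/data/glusterfs/safari/brick00/brick") and printing each tuple would show which sticky-bit files are zero-length with a linkto value and which, like the two examples above, carry real data. What to do with the second group is a separate question this sketch does not answer.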
>>>
>>> File command: (notice duplicate and singleton files)
>>> ifg_lr/20091021_20100218.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
>>> ifg_lr/20091021_20101016.diff.phi.ras: data
>>> ifg_lr/20091021_20101016.diff.phi.ras: data
>>> ifg_lr/20091021_20101109.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
>>> ifg_lr/20091021_20101203.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
>>> ifg_lr/20091021_20101227.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
>>> ifg_lr/20091021_20110120.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap
>>> ifg_lr/20091021_20110213.diff.phi.ras: data
>>> ifg_lr/20091021_20110213.diff.phi.ras: data
>>> ifg_lr/20091021_20110309.diff.phi.ras: data
>>> ifg_lr/20091021_20110309.diff.phi.ras: sticky data
>>> ifg_lr/20091021_20110402.diff.phi.ras: Sun raster image data, 1208 x 1981, 8-bit, RGB colormap

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users