Fw: Re[2]: missing files


Forwarding to devel list as recommended by Justin...

David


------ Forwarded Message ------
From: "David F. Robinson" <david.robinson@xxxxxxxxxxxxx>
To: "Justin Clift" <justin@xxxxxxxxxxx>
Sent: 2/10/2015 9:49:09 AM
Subject: Re[2]:  missing files

Bad news... I don't think it is the old linkto files. Bad because if that were the issue, cleaning up all of the bad linkto files would have fixed it. It seems like the system just gets slower as you add data.

First, I set up a new, clean volume (test2brick) on the same system as the old one (homegfs_bkp). See 'gluster v info' below. I ran my simple tar extraction test on the new volume and it took 58-seconds to complete (which, BTW, is 10-seconds faster than my old non-gluster system, so kudos). The time on homegfs_bkp is 19-minutes.
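For reference, the test itself is just a timed untar run against each fuse mount (a minimal sketch; the path to boost.tar is a placeholder):

 cd /test2brick && time tar -xPf /path/to/boost.tar        # new, empty volume: ~58 seconds
 cd /backup/homegfs && time tar -xPf /path/to/boost.tar    # homegfs_bkp: ~19 minutes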

Next, I copied 10-terabytes of data over to test2brick and re-ran the test which then took 7-minutes. I created a test3brick and ran the test and it took 53-seconds.

To confirm all of this, I deleted all of the data from test2brick and re-ran the test. It took 51-seconds!!!

BTW, I also checked .glusterfs for stale linkto files (find . -type f -size 0 -perm 1000 -exec ls -al {} \;). There are many, many thousands of these on the old volume and none on the new one; but since the brand-new volume slowed down anyway once data was added, I don't think the linkto files are related to the performance issue.
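A quick way to tally them per volume is to run the same find from each brick's .glusterfs directory and just count the matches (a minimal sketch):

 find . -type f -size 0 -perm 1000 | wc -l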

Let me know how I should proceed. Send this to devel list? Pranith? others? Thanks...

[root@gfs01bkp .glusterfs]# gluster volume info homegfs_bkp
Volume Name: homegfs_bkp
Type: Distribute
Volume ID: 96de8872-d957-4205-bf5a-076e3f35b294
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp

[root@gfs01bkp .glusterfs]# gluster volume info test2brick
Volume Name: test2brick
Type: Distribute
Volume ID: 123259b2-3c61-4277-a7e8-27c7ec15e550
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test2brick
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test2brick

[root@gfs01bkp glusterfs]# gluster volume info test3brick
Volume Name: test3brick
Type: Distribute
Volume ID: 9b1613fc-f7e5-4325-8f94-e3611a5c3701
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test3brick
Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test3brick


From homegfs_bkp:
# find . -type f -size 0 -perm 1000 -exec ls -al {} \;
---------T 2 gmathur pme_ics 0 Jan 9 16:59 ./00/16/00169a69-1a7a-44c9-b2d8-991671ee87c4
---------T 3 jcowan users 0 Jan 9 17:51 ./00/16/0016a0a0-fd22-4fb5-b6fb-5d7f9024ab74
---------T 2 morourke sbir 0 Jan 9 18:17 ./00/16/0016b36f-32fc-4f2c-accd-e36be2f6c602
---------T 2 carpentr irl 0 Jan 9 18:52 ./00/16/00163faf-741c-4e40-8081-784786b3cc71
---------T 3 601 raven 0 Jan 9 22:49 ./00/16/00163385-a332-4050-8104-1b1af6cd8249
---------T 3 bangell sbir 0 Jan 9 22:56 ./00/16/00167803-0244-46de-8246-d9c382dd3083
---------T 2 morourke sbir 0 Jan 9 23:17 ./00/16/00167bc5-fc56-42ee-9e3f-1e238f3828f4
---------T 3 morourke sbir 0 Jan 9 23:34 ./00/16/0016a71e-89cf-4a86-9575-49c7e9d216c6
---------T 2 gmathur users 0 Jan 9 23:47 ./00/16/00168aa2-d069-4a77-8790-e36431324ca5
---------T 2 bangell users 0 Jan 22 09:24 ./00/16/0016e720-a190-4e43-962f-aa3e4216e5f5
---------T 2 root root 0 Jan 22 09:26 ./00/16/00169e95-64b7-455c-82dc-d9940ee7fe43
---------T 2 dfrobins users 0 Jan 22 09:27 ./00/16/00161b04-1612-4fba-99a4-2a2b54062fdb
---------T 2 mdick users 0 Jan 22 09:27 ./00/16/0016ba60-310a-4bee-968a-36eb290e8c9e
---------T 2 dfrobins users 0 Jan 22 09:43 ./00/16/00160315-1533-4290-8c1a-72e2fbb1962a
From test2brick:
# find . -type f -size 0 -perm 1000 -exec ls -al {} \;
(no output; no stale linkto files on test2brick)





------ Original Message ------
From: "Justin Clift" <justin@xxxxxxxxxxx>
To: "David F. Robinson" <david.robinson@xxxxxxxxxxxxx>
Sent: 2/9/2015 11:33:54 PM
Subject: Re:  missing files

Interesting. (I'm 1/2 asleep atm and really need sleep soon, so take this
with a grain of salt... ;>)

As a curiosity question, does the homegfs_bkp volume have a bunch of outdated metadata still in it? e.g. left-over extended attributes or something?

I'm remembering a question you asked earlier, er... today/yesterday, about old extended attribute entries and whether they hang around forever. I don't know the answer to that, but if the old volume still has thousands (or more) of entries around, perhaps there's some lookup problem that's killing lookup times for file operations.

On a side note, I can probably set up my test lab stuff here again tomorrow and try this out myself to see if I can replicate the problem (if that could potentially be useful?).

+ Justin



On 9 Feb 2015, at 22:56, David F. Robinson <david.robinson@xxxxxxxxxxxxx> wrote:
 Justin,

Hoping you can help point this to the right people once again. Maybe all of these issues are related.

You can look at the email traffic below, but the summary is that I was working with Ben to figure out why my GFS system was 20x slower than my old storage system. During my tracing of this issue, I determined that if I create a new volume on my storage system, this slowness goes away. So, either it is faster because it doesn't have any data on this new volume (I hope this isn't the case), or the older partitions somehow became corrupted during the upgrades or have some deprecated parameters set that slow them down.

Very strange and hoping you can once again help... Thanks in advance...

 David


 ------ Forwarded Message ------
 From: "David F. Robinson" <david.robinson@xxxxxxxxxxxxx>
 To: "Benjamin Turner" <bennyturns@xxxxxxxxx>
 Sent: 2/9/2015 5:52:00 PM
 Subject: Re[5]:  missing files

 Ben,

I cleared the logs and rebooted the machine. Same issue. homegfs_bkp takes 19-minutes and test2brick (the new volume) takes 1-minute.

Is it possible that some old parameters are still set for homegfs_bkp that are no longer in use? I tried a gluster volume reset for homegfs_bkp, but it didn't have any effect.
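For reference, the reset and the follow-up check were along these lines (a minimal sketch; whatever is still set shows up in the 'Options Reconfigured' section of the volume info):

 gluster volume reset homegfs_bkp
 gluster volume info homegfs_bkp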

 I have attached the full logs.

 David


 ------ Original Message ------
 From: "David F. Robinson" <david.robinson@xxxxxxxxxxxxx>
 To: "Benjamin Turner" <bennyturns@xxxxxxxxx>
 Sent: 2/9/2015 5:39:18 PM
 Subject: Re[4]:  missing files

 Ben,

I have traced this out to a point where I can rule out many issues. I was hoping you could help me from here. I went with "tar -xPf boost.tar" as my test case, which on my old storage system took about 1-minute to extract. On my backup system and my primary storage (both gluster), it takes roughly 19-minutes.

First step was to create a new storage system (striped RAID, two sets of 3-drives). All was good here with a gluster extraction time of 1-minute. I then went to my backup system and created another partition using only one of the two bricks on that system. Still 1-minute. I went to a two brick setup and it stayed at 1-minute.

At this point, I have created a test2brick volume using the same parameters, so it should be identical to my homegfs_bkp volume. Everything is the same, including how I mounted the volume. The only difference is that homegfs_bkp has 30-TB of data and test2brick is blank. I didn't think that performance would be affected by putting data on the volume.

Can you help? Do you have any suggestions? Do you think upgrading gluster from 3.5 to 3.6.1 to 3.6.2 somehow messed up homegfs_bkp? My layout is shown below. These should give identical speeds.
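One quick way to confirm the two layouts really are identical apart from name and UUID (a minimal sketch using bash process substitution):

 diff <(gluster volume info homegfs_bkp) <(gluster volume info test2brick)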

 [root@gfs01bkp test2brick]# gluster volume info homegfs_bkp
 Volume Name: homegfs_bkp
 Type: Distribute
 Volume ID: 96de8872-d957-4205-bf5a-076e3f35b294
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp
 [root@gfs01bkp test2brick]# gluster volume info test2brick

 Volume Name: test2brick
 Type: Distribute
 Volume ID: 123259b2-3c61-4277-a7e8-27c7ec15e550
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/test2brick
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/test2brick


 [root@gfs01bkp brick02bkp]# mount | grep test2brick
gfsib01bkp.corvidtec.com:/test2brick.tcp on /test2brick type fuse.glusterfs (rw,allow_other,max_read=131072)
 [root@gfs01bkp brick02bkp]# mount | grep homegfs_bkp
gfsib01bkp.corvidtec.com:/homegfs_bkp.tcp on /backup/homegfs type fuse.glusterfs (rw,allow_other,max_read=131072)

 [root@gfs01bkp brick02bkp]# df -h
 Filesystem Size Used Avail Use% Mounted on
 /dev/mapper/vg00-lv_root 20G 1.7G 18G 9% /
 tmpfs 16G 0 16G 0% /dev/shm
 /dev/md126p1 1008M 110M 848M 12% /boot
 /dev/mapper/vg00-lv_opt 5.0G 220M 4.5G 5% /opt
 /dev/mapper/vg00-lv_tmp 5.0G 139M 4.6G 3% /tmp
 /dev/mapper/vg00-lv_usr 20G 2.7G 17G 15% /usr
 /dev/mapper/vg00-lv_var 40G 4.4G 34G 12% /var
 /dev/mapper/vg01-lvol1 88T 22T 67T 25% /data/brick01bkp
 /dev/mapper/vg02-lvol1 88T 22T 67T 25% /data/brick02bkp
 gfsib01bkp.corvidtec.com:/homegfs_bkp.tcp 175T 43T 133T 25% /backup/homegfs
 gfsib01bkp.corvidtec.com:/test2brick.tcp 175T 43T 133T 25% /test2brick


 ------ Original Message ------
 From: "Benjamin Turner" <bennyturns@xxxxxxxxx>
 To: "David F. Robinson" <david.robinson@xxxxxxxxxxxxx>
 Sent: 2/6/2015 12:52:58 PM
 Subject: Re: Re[2]:  missing files

Hi David. Let's start with the basics and go from there. IIRC you are using LVM with thick provisioning; let's verify the following:

1. You have everything properly aligned for your RAID stripe size, etc. I have attached the script we package with RHS that I am in the process of updating. I want to double-check that you created the PV / VG / LV with the proper variables. Have a look at the create_pv, create_vg, and create_lv(old) functions. You will need to know the stripe size of your RAID and the number of stripe elements (data disks, not hot spares). Also make sure you mkfs.xfs with:

echo "mkfs -t xfs -f -K -i size=$inode_size -d sw=$stripe_elements,su=$stripesize -n size=$fs_block_size /dev/$vgname/$lvname"

We use 512-byte inodes because some workloads use more than the default inode size and you don't want xattrs spilling out of the inodes.
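As a filled-in example, assuming a 12-disk RAID6 with 10 data disks, a 128k stripe element size, 512-byte inodes, and an 8192-byte directory block size (the device path is a placeholder; plug in your own values):

 mkfs -t xfs -f -K -i size=512 -d sw=10,su=128k -n size=8192 /dev/vgname/lvname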

2. Are you running RHEL or CentOS? If so, I would recommend tuned_profile=rhs-high-throughput. If you don't have that tuned profile, I'll get you everything it sets.

 3. For small files we recommend the following (a filled-in pvcreate/lvcreate sketch follows the variable block below):

 # RAID related variables.
 # stripesize - RAID controller stripe unit size
 # stripe_elements - the number of data disks
# The --dataalignment option is used while creating the physical volume to
 # align I/O at the LVM layer
 # dataalign -
 # RAID6 is recommended when the workload has predominantly larger
 # files ie not in kilobytes.
 # For RAID6 with 12 disks and 128K stripe element size.
 stripesize=128k
 stripe_elements=10
 dataalign=1280k

# RAID10 is recommended when the workload has predominantly smaller files
 # i.e in kilobytes.
# For RAID10 with 12 disks and 256K stripe element size, uncomment the
 # lines below.
 # stripesize=256k
 # stripe_elements=6
 # dataalign=1536k
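A filled-in sketch of what create_pv / create_vg / create_lv(old) do with the RAID6 values above (the device, VG, and LV names are placeholders):

 pvcreate --dataalignment 1280k /dev/sdb
 vgcreate vg_bricks /dev/sdb
 lvcreate -l 100%FREE -n lv_brick01 vg_bricks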

4. Jumbo frames everywhere! Check out the effect of jumbo frames, make sure they are set up properly on your switch, and add MTU=9000 to your ifcfg files (unless you have it already):

https://rhsummit.files.wordpress.com/2013/07/england_th_0450_rhs_perf_practices-4_neependra.pdf (see the jumbo frames section here, the whole thing is a good read)
https://rhsummit.files.wordpress.com/2014/04/bengland_h_1100_rhs_performance.pdf (this is updated for 2014)
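A minimal sketch of enabling and verifying jumbo frames on one node (eth0 and the peer IP are placeholders; repeat on every server and client, and make sure the switch ports allow it):

 echo "MTU=9000" >> /etc/sysconfig/network-scripts/ifcfg-eth0
 ifdown eth0 && ifup eth0
 ip link show eth0 | grep mtu       # should now report mtu 9000
 ping -M do -s 8972 <peer-ip>       # 8972 bytes of payload + 28 bytes of headers = 9000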

5. There is a smallfile enhancement that just landed in master that is showing me a 60% improvement in writes. This is called multi-threaded epoll and it is looking VERY promising WRT smallfile performance. Here is a summary:

Hi all. I see a lot of discussion on $subject and I wanted to take a minute to talk about it and what we can do to test / observe the effects of it. Let's start with a bit of background:

 **Background**

 -Currently epoll is single threaded on both clients and servers.
   *This leads to a "hot thread" which consumes 100% of a CPU core.
   *This can be observed by running BenE's smallfile benchmark to create files, running top (on both clients and servers), and pressing H to show threads.
   *You will be able to see a single glusterfs thread eating 100% of the CPU:

  2871 root 20 0 746m 24m 3004 S 100.0 0.1 14:35.89 glusterfsd
  4522 root 20 0 747m 24m 3004 S 5.3 0.1 0:02.25 glusterfsd
  4507 root 20 0 747m 24m 3004 S 5.0 0.1 0:05.91 glusterfsd
 21200 root 20 0 747m 24m 3004 S 4.6 0.1 0:21.16 glusterfsd

-Single threaded epoll is a bottleneck for high IOP / low metadata workloads (think smallfile). With single threaded epoll we are CPU bound by the single thread pegging out a CPU.
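To see the hot thread yourself, something like this on a server while the create workload is running (a minimal sketch):

 top -b -H -n 1 -p "$(pidof glusterfsd | tr ' ' ',')" | head -20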

So the proposed solution to this problem is to make epoll multi threaded on both servers and clients. Here is a link to the upstream proposal:

http://www.gluster.org/community/documentation/index.php/Features/Feature_Smallfile_Perf#multi-thread-epoll

Status: [ http://review.gluster.org/#/c/3842/ based on Anand Avati's patch ]

Why: remove single-thread-per-brick barrier to higher CPU utilization by servers

 Use case: multi-client and multi-thread applications

Improvement: measured 40% with 2 epoll threads and 100% with 4 epoll threads for small file creates to an SSD

Disadvantage: conflicts with support for SSL sockets, may require significant code change to support both.

Note: this enhancement also helps high-IOPS applications such as databases and virtualization which are not metadata-intensive. This has been measured already using a Fusion I/O SSD performing random reads and writes -- it was necessary to define multiple bricks per SSD device to get Gluster to the same order of magnitude IOPS as a local filesystem. But this workaround is problematic for users, because storage space is not properly measured when there are multiple bricks on the same filesystem.

Multi threaded epoll is part of a larger page that talks about smallfile performance enhancements, proposed and happening.

Goal: if successful, the throughput bottleneck should be either the network or the brick filesystem!
What it doesn't do: multi-thread-epoll does not solve the excessive-round-trip protocol problems that Gluster has.
What it should do: allow Gluster to exploit the mostly untapped CPU resources on the Gluster servers and clients.
How it does it: allow multiple threads to read protocol messages and process them at the same time.
How to observe: multi-thread-epoll should be configurable (how to configure? gluster command?); with thread count 1 it should be the same as RHS 3.0, with thread count 2-4 it should show significantly more CPU utilization (threads visible with "top -H"), resulting in higher throughput.

 **How to observe**

Here are the commands needed to set up an environment to test in on RHS 3.0.3:
 rpm -e glusterfs-api glusterfs glusterfs-libs glusterfs-fuse glusterfs-geo-replication glusterfs-rdma glusterfs-server glusterfs-cli gluster-nagios-common samba-glusterfs vdsm-gluster --nodeps
 rhn_register
 yum groupinstall "Development tools"
 git clone https://github.com/gluster/glusterfs.git
 git branch test
 git checkout test
 git fetch http://review.gluster.org/glusterfs refs/changes/42/3842/17 && git cherry-pick FETCH_HEAD
 git fetch http://review.gluster.org/glusterfs refs/changes/88/9488/2 && git cherry-pick FETCH_HEAD
 yum install openssl openssl-devel
 wget ftp://fr2.rpmfind.net/linux/epel/6/x86_64/cmockery2-1.3.8-2.el6.x86_64.rpm
 wget ftp://fr2.rpmfind.net/linux/epel/6/x86_64/cmockery2-devel-1.3.8-2.el6.x86_64.rpm
 yum install cmockery2-1.3.8-2.el6.x86_64.rpm cmockery2-devel-1.3.8-2.el6.x86_64.rpm libxml2-devel
 ./autogen.sh
 ./configure
 make
 make install

 Verify you are using the upstream with:

 # gluster --version

 To enable multi-threaded epoll, run the following commands:

 From the patch:
         { .key = "client.event-threads", 839
           .voltype = "protocol/client", 840
           .op_version = GD_OP_VERSION_3_7_0, 841
                                 },
         { .key = "server.event-threads", 946
           .voltype = "protocol/server", 947
           .op_version = GD_OP_VERSION_3_7_0, 948
         },

 # gluster v set <volname> server.event-threads 4
 # gluster v set <volname> client.event-threads 4
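 And to confirm the options took effect (a minimal sketch; they should show up under 'Options Reconfigured'):

 # gluster v info <volname> | grep event-threads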

 Also grab smallfile:

 https://github.com/bengland2/smallfile

 After git cloning smallfile, run:

python /small-files/smallfile/smallfile_cli.py --operation create --threads 8 --file-size 64 --files 10000 --top /gluster-mount --pause 1000 --host-set "client1 client2"

Again we will be looking at top + show threads (press H). With 4 threads on both clients and servers you should see something similar to this (this isn't exact, I copied and pasted):

  2871 root 20 0 746m 24m 3004 S 35.0 0.1 14:35.89 glusterfsd
  2872 root 20 0 746m 24m 3004 S 51.0 0.1 14:35.89 glusterfsd
  2873 root 20 0 746m 24m 3004 S 43.0 0.1 14:35.89 glusterfsd
  2874 root 20 0 746m 24m 3004 S 65.0 0.1 14:35.89 glusterfsd
  4522 root 20 0 747m 24m 3004 S 5.3 0.1 0:02.25 glusterfsd
  4507 root 20 0 747m 24m 3004 S 5.0 0.1 0:05.91 glusterfsd
 21200 root 20 0 747m 24m 3004 S 4.6 0.1 0:21.16 glusterfsd

If you have a test env I would be interested to see how multi-threaded epoll performs, but I am 100% sure it's not ready for production yet. RH will be supporting it with our 3.0.4 (the next one) release unless we find show-stopping bugs. My testing looks very promising though.

Smallfile performance enhancements are one of the key focuses for our 3.1 release this summer; we are working very hard to improve this, as it is the use case for the majority of people.


On Fri, Feb 6, 2015 at 11:59 AM, David F. Robinson <david.robinson@xxxxxxxxxxxxx> wrote:
 Ben,

I was hoping you might be able to help with two performance questions. I was doing some testing of my rsync where I am backing up my primary gluster system (distributed + replicated) to my backup gluster system (distributed). I tried three tests where I rsynced from one of my primary systems (gfsib02b) to my backup machine. The test directory contains roughly 5500 files, most of which are small. The script I ran is shown below, which repeats the tests 3x for each section to check variability in timing.

1) Writing to the local disk is drastically faster than writing to gluster. So, my writes to the backup gluster system are what is slowing me down, which makes sense.

2) When I write to the backup gluster system (/backup/homegfs), the timing goes from 35-seconds to 1min40seconds. The question here is whether you could recommend any settings for this volume that would improve performance for small file writes? I have included the output of 'gluster volume info' below.

3) When I did the same tests on the Source_bkp volume, it is almost 3x as slow as the homegfs_bkp volume. However, these are just different volumes on the same storage system. The volume parameters are identical (see below). The performance of these two should be identical. Any idea why they wouldn't be? And any suggestions for how to fix this? The only thing that I see different between the two is the order of the "Options Reconfigured" section. I assume order of options doesn't matter.

 Backup to local hard disk (no gluster writes)
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /temp1
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /temp2
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /temp3

         real 0m35.579s
         user 0m31.290s
         sys 0m12.282s

         real 0m38.035s
         user 0m31.622s
         sys 0m10.907s
         real 0m38.313s
         user 0m31.458s
         sys 0m10.891s
 Backup to gluster backup system on volume homegfs_bkp
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /backup/homegfs/temp1
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /backup/homegfs/temp2
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /backup/homegfs/temp3

         real 1m42.026s
         user 0m32.604s
         sys 0m9.967s

         real 1m45.480s
         user 0m32.577s
         sys 0m11.994s

         real 1m40.436s
         user 0m32.521s
         sys 0m11.240s

 Backup to gluster backup system on volume Source_bkp
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /backup/Source/temp1
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /backup/Source/temp2
time /usr/local/bin/rsync -av --numeric-ids --delete --block-size=131072 -e "ssh -T -c arcfour -o Compression=no -x" gfsib02b:/homegfs/test /backup/Source/temp3

         real 3m30.491s
         user 0m32.676s
         sys 0m10.776s

         real 3m26.076s
         user 0m32.588s
         sys 0m11.048s
         real 3m7.460s
         user 0m32.763s
         sys 0m11.687s


 Volume Name: Source_bkp
 Type: Distribute
 Volume ID: 1d4c210d-a731-4d39-a0c5-ea0546592c1d
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/Source_bkp
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/Source_bkp
 Options Reconfigured:
 performance.cache-size: 128MB
 performance.io-thread-count: 32
 server.allow-insecure: on
 network.ping-timeout: 10
 storage.owner-gid: 100
 performance.write-behind-window-size: 128MB
 server.manage-gids: on
 changelog.rollover-time: 15
 changelog.fsync-interval: 3

 Volume Name: homegfs_bkp
 Type: Distribute
 Volume ID: 96de8872-d957-4205-bf5a-076e3f35b294
 Status: Started
 Number of Bricks: 2
 Transport-type: tcp
 Bricks:
 Brick1: gfsib01bkp.corvidtec.com:/data/brick01bkp/homegfs_bkp
 Brick2: gfsib01bkp.corvidtec.com:/data/brick02bkp/homegfs_bkp
 Options Reconfigured:
 storage.owner-gid: 100
 performance.io-thread-count: 32
 server.allow-insecure: on
 network.ping-timeout: 10
 performance.cache-size: 128MB
 performance.write-behind-window-size: 128MB
 server.manage-gids: on
 changelog.rollover-time: 15
 changelog.fsync-interval: 3









 ------ Original Message ------
 From: "Benjamin Turner" <bennyturns@xxxxxxxxx>
 To: "David F. Robinson" <david.robinson@xxxxxxxxxxxxx>
Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>; "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>
 Sent: 2/3/2015 7:12:34 PM
 Subject: Re:  missing files

It sounds to me like the files were only copied to one replica, weren't there for the initial ls (which triggered a self-heal), and were there for the last ls because they were healed. Is there any chance that one of the replicas was down during the rsync? It could be that you lost a brick during the copy or something like that. To confirm, I would look for disconnects in the brick logs as well as check glustershd.log to verify the missing files were actually healed.
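A minimal sketch of those checks, assuming the default log locations (substitute your volume name):

 grep -i disconnect /var/log/glusterfs/bricks/*.log
 grep -i heal /var/log/glusterfs/glustershd.log
 gluster volume heal <volname> info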

 -b

On Tue, Feb 3, 2015 at 5:37 PM, David F. Robinson <david.robinson@xxxxxxxxxxxxx> wrote:
 I rsync'd 20-TB over to my gluster system and noticed that I had some directories missing even though the rsync completed normally.
 The rsync logs showed that the missing files were transferred.

I went to the bricks and did an 'ls -al /data/brick*/homegfs/dir/*', and the files were on the bricks. After I did this 'ls', the files then showed up on the FUSE mounts.

 1) Why are the files hidden on the fuse mount?
 2) Why does the ls make them show up on the FUSE mount?
 3) How can I prevent this from happening again?

Note, I also mounted the gluster volume using NFS and saw the same behavior. The files/directories were not shown until I did the "ls" on the bricks.

 David



 ===============================
 David F. Robinson, Ph.D.
 President - Corvid Technologies
 704.799.6944 x101 [office]
 704.252.1310 [cell]
 704.799.7974 [fax]
 David.Robinson@xxxxxxxxxxxxx
 http://www.corvidtechnologies.com/



 _______________________________________________
 Gluster-devel mailing list
 Gluster-devel@xxxxxxxxxxx
 http://www.gluster.org/mailman/listinfo/gluster-devel



 <glusterfs.tgz>

--
GlusterFS - http://www.gluster.org

An open source, distributed file system scaling to several
petabytes, and handling thousands of clients.

My personal twitter: twitter.com/realjustinclift


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
