Maybe we could try something to confirm or rule out my theory:
what about asking rsync to ignore anything that could differ
between bricks in a replicated pair? A couple of options I see
are:
--size-only means that rsync will skip files that match in size,
even if the timestamps differ. This means it will synchronise
fewer files than the default behaviour, but it will miss any file
whose changes don't affect the overall file size.
--ignore-times means that rsync will checksum every file, even if
the timestamps and file sizes match. This means it will
synchronise more files than the default behaviour. It will pick
up changes to files even where the file size is the same and the
modification date/time has been reset to the original value
(resetting the date/time is unlikely to be done in practice, but
it could happen). Example invocations for both are sketched
below.
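
Here's roughly how I'd run each; the source and mount paths are
just placeholders, and -n (dry run) plus --stats let us compare
how many files each variant would consider without writing
anything:

# dry run with --size-only: skips anything whose size already matches
rsync -avn --stats --size-only /source/dir/ /mnt/glustervol/dir/
# dry run with --ignore-times: checks files even when size and mtime match
rsync -avn --stats --ignore-times /source/dir/ /mnt/glustervol/dir/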
These may also help, but they look to be more for recovering from
brick failures:
I'll try some stuff in the lab and see if I can come up with an
RCA or something that helps.
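
If the theory holds, the timestamps each brick reports for the
same directory should disagree. Something like this would show it
(the brick paths and the second hostname here are made up for
illustration):

# print atime and mtime (epoch seconds) straight from each brick's backend
stat -c '%X %Y %n' /data/brick1/gromm/Maildir
ssh gluster2 stat -c '%X %Y %n' /data/brick2/gromm/Maildir

If those numbers differ, then depending on which brick answers a
given lookup the client could see different times on each pass,
which would explain rsync deciding a directory has changed when
it hasn't.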
------ Original Message ------
Sent: 4/27/2015 4:52:35 PM
Subject: Re: Disastrous performance with rsync to mounted Gluster volume.
>----- Original Message -----
>> Sent: Monday, April 27, 2015 4:24:56 PM
>> Subject: Re: Disastrous performance with rsync to mounted Gluster volume.
>>
>> On 2015-04-24 11:43, Joe Julian wrote:
>>
>> >> This should get you where you need to be. Before you start to
>> >> migrate the data, maybe do a couple of DDs and send me the output
>> >> so we can get an idea of how your cluster performs:
>> >>
>> >> time `dd if=/dev/zero of=<gluster-mount>/myfile bs=1024k count=1000; sync`
>> >> echo 3 > /proc/sys/vm/drop_caches
>> >> dd if=<gluster-mount>/myfile of=/dev/null bs=1024k count=1000
>> >>
>> >> If you are using gigabit and glusterfs mounts with replica 2 you
>> >> should get ~55 MB/sec writes and ~110 MB/sec reads. With NFS you
>> >> will take a bit of a hit since NFS doesn't know where files live
>> >> like glusterfs does.
>>
>> After copying our data and doing a couple of very slow rsyncs, I did
>> your speed test and came back with these results:
>>
>> 1048576 bytes (1.0 MB) copied, 0.0307951 s, 34.1 MB/s
>> root@backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=1024 bs=1024; sync
>> 1024+0 records in
>> 1024+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0298592 s, 35.1 MB/s
>> root@backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=1024 bs=1024; sync
>> 1024+0 records in
>> 1024+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0501495 s, 20.9 MB/s
>> root@backup:/home/webmailbak# echo 3 > /proc/sys/vm/drop_caches
>> root@backup:/home/webmailbak# dd if=/mnt/testfile of=/dev/null bs=1024k count=1000
>> 1+0 records in
>> 1+0 records out
>> 1048576 bytes (1.0 MB) copied, 0.0124498 s, 84.2 MB/s
>>
>>
>> Keep in mind that this is an NFS share over the network.
>>
>> I've also noticed that if I increase the count of those writes, the
>> transfer speed increases as well:
>>
>> 2097152 bytes (2.1 MB) copied, 0.036291 s, 57.8 MB/s
>> root@backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=2048 bs=1024; sync
>> 2048+0 records in
>> 2048+0 records out
>> 2097152 bytes (2.1 MB) copied, 0.0362724 s, 57.8 MB/s
>> root@backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=2048 bs=1024; sync
>> 2048+0 records in
>> 2048+0 records out
>> 2097152 bytes (2.1 MB) copied, 0.0360319 s, 58.2 MB/s
>> root@backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=10240 bs=1024; sync
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.127219 s, 82.4 MB/s
>> root@backup:/home/webmailbak# dd if=/dev/zero of=/mnt/testfile count=10240 bs=1024; sync
>> 10240+0 records in
>> 10240+0 records out
>> 10485760 bytes (10 MB) copied, 0.128671 s, 81.5 MB/s
>
>This is correct; there is per-file overhead, and the smaller the file
>the less throughput you get. That said, since the files are smaller
>you should get more files/second but fewer MB/second. I have found
>that below 16k, file size doesn't matter: you will get the same number
>of files per second with 16k files as with 1k files.
>
>>
>>
>> However, the biggest stumbling block for rsync seems to be changes
>> to directories. I'm unsure about what exactly it's doing (probably
>> changing last access times?) but these minor writes seem to take a
>> very long time when normally they would not. Actual file copies (as
>> in the very files that are actually new within those same
>> directories) appear to take quite a lot less time than the directory
>> updates.
>
>Dragons be here! Access time is not kept in sync across the replicas
>(IIRC, someone correct me if I am wrong!) and each time a dir is read
>from a different brick I bet the access time is different.
>
>>
>> For example:
>>
>> # time rsync -av --inplace --whole-file --ignore-existing --delete-after gromm/* /mnt/gromm/
>> building file list ... done
>> Maildir/                    ## This part takes a long time.
>> Maildir/.INBOX.Trash/
>> Maildir/.INBOX.Trash/cur/
>> Maildir/.INBOX.Trash/cur/1429836077.H817602P21531.pop.lightspeed.ca:2,S
>> Maildir/.INBOX.Trash/tmp/   ## The previous three lines took nearly no time at all.
>> Maildir/cur/                ## This takes a long time.
>> Maildir/cur/1430160436.H952679P13870.pop.lightspeed.ca:2,S
>> Maildir/new/
>> Maildir/tmp/                ## The previous lines again take no time at all.
>> deleting Maildir/cur/1429836077.H817602P21531.pop.lightspeed.ca:2,S
>> ## This delete did take a while.
>>
>> sent 1327634 bytes  received 75 bytes  59009.29 bytes/sec
>> total size is 624491648  speedup is 470.35
>>
>> real    0m26.110s
>> user    0m0.140s
>> sys     0m1.596s
>>
>>
>> So, rsync reports that it sent 1327634 bytes at 59 kBytes/sec, and
>> the whole operation took 26 seconds, all to write 2 files of around
>> 20-30 kBytes each and delete 1.
>>
>> The last rsync took around 56 minutes, when normally such an rsync
>> would have taken 5-10 minutes, writing over the network via ssh.
>
>It may have something to do with the access times not being in sync
>across replicated pairs. Maybe someone has experience with this; could
>this be tripping up rsync?
>
>-b
>