Very slow rsync to gluster volume UNLESS `ls` or `find` scan dir on gluster volume first

Hi,

I have been setting up a 4-way replicated gluster volume (4 bricks, gluster 3.13.2) holding over a million files (~250GB total), and I've seen some really odd behavior, even after trying to optimize for small files.

It took almost 2 days to rsync the data from the local drive to the gluster volume, and now I'm running a 2nd rsync that just looks for changes in case more files have been written. I'd like to concentrate this email on a very specific and odd issue.


The dir structure is:
YYYY/
    MM/
        10k+ files in each month folder


Rsyncing each month folder cold can take 2+ minutes.

However, if I ls the destination folder first, or use find (both of which finish within 5 seconds), the rsync is almost instant.


Here's a log with time calls that shows what happens:

box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/08/ 08/ 
sending incremental file list
^Crsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(637) [sender=3.1.0]

real    1m39.848s
user    0m0.010s
sys     0m0.030s
box:/mnt/gluster/uploads/2017 # time find 08 | wc -l  
14254

real    0m0.726s
user    0m0.013s
sys     0m0.033s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/08/ 08/
sending incremental file list

real    0m0.562s
user    0m0.057s
sys     0m0.137s
box:/mnt/gluster/uploads/2017 # time find 07 | wc -l 
10103

real    0m4.550s
user    0m0.010s
sys     0m0.033s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/07/ 07/ 
sending incremental file list

real    0m0.428s
user    0m0.030s
sys     0m0.083s
box:/mnt/gluster/uploads/2017 # time ls 06 | wc -l       
11890

real    0m1.850s
user    0m0.077s
sys     0m0.040s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/06/ 06/ 
sending incremental file list

real    0m0.627s
user    0m0.073s
sys     0m0.107s
box:/mnt/gluster/uploads/2017 # time rsync -aPr /srv/www/htdocs/uploads/2017/05/ 05/ 
sending incremental file list

real    2m24.382s
user    0m0.127s
sys     0m0.357s

Note how if I precede the rsync call with ls or find, the rsync completes in less than a second (finding no files to sync because they've already been synced). Otherwise, it takes over 2 minutes (I interrupted the first call before the 2 minutes because it was already taking too long).
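
For anyone who wants to reproduce the priming step, here is a minimal sketch of it as a shell helper. It only mirrors what the log above does by hand (a `find` over the destination, then the same `rsync -aPr` call). The function names `prime_dir` and `sync_month` are made up for illustration, and it is only an assumption that priming would help on other setups; it does not explain the slowdown.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not from any tool).

# prime_dir DIR: walk DIR once, as `find DIR | wc -l` does in the log,
# and print the number of entries found (DIR itself is counted).
prime_dir() {
    find "$1" | wc -l | tr -d ' '
}

# sync_month SRC DEST: prime DEST first, then run the same rsync
# invocation used in the log. Assumes rsync is installed.
sync_month() {
    src=$1
    dest=$2
    prime_dir "$dest" >/dev/null
    rsync -aPr "$src"/ "$dest"/
}
```

With the paths from the log, a call would look like `sync_month /srv/www/htdocs/uploads/2017/05 /mnt/gluster/uploads/2017/05`.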

What could be causing rsync to work so slowly unless the dir is primed?

Volume config:
Volume Name: gluster
Type: Replicate
Volume ID: XXXXXXXXXXXXXXXXXXXXXXXXX
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: server1:/mnt/server1_block4/gluster
Brick2: server2:/mnt/server2_block4/gluster
Brick3: server3:/mnt/server3_block4/gluster
Brick4: server4:/mnt/server4_block4/gluster
Options Reconfigured:
performance.parallel-readdir: off
transport.address-family: inet
nfs.disable: on
cluster.self-heal-daemon: enable
performance.cache-size: 1GB
network.ping-timeout: 5
cluster.quorum-type: fixed
cluster.quorum-count: 1
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 500000
performance.rda-cache-limit: 256MB
performance.read-ahead: off
client.event-threads: 4
server.event-threads: 4


Thank you for any insight.

Sincerely,
Artem
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
