Re: network copy performance is poor (rsync) - debugging suggestions?

On 28-01-2015 21:39, Gordon Messmer wrote:
On 01/23/2015 01:44 AM, Götz Reinicke - IT Koordinator wrote:
I have two CentOS 6.6 servers. With a "performance optimized" rsync I
get a speed of 15 - 20 MB/s

That *is* pretty slow for sustained writes.  Does the same rate hold
true for individual large files as it does for lots of small ones?  What
filesystem are you using on each side?

rsync -aHAXxv --numeric-ids --progress -e "ssh -T -c arcfour -o Compression=no -x"

It's worth noting that -X and -A make rsync perform filesystem IO that
you don't see with an SMB copy, because SMB isn't going to preserve/set
ACLs and extended attributes (IIRC).  So, one possibility is that you're
seeing a difference in rate because you're copying lots of small files
and the per-file filesystem operations are relatively slow.

You might drop those two options and see how that affects the rate.  If
you determine that those are the cause of the performance difference,
you can turn them back on, understanding that there's a cost associated
with preserving that data.
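
A rough way to run that comparison, as a sketch only: the function name, the RSYNC override variable and the example paths are all made up for illustration, and each run writes to its own fresh destination so the second one has real work to do.

```shell
#!/bin/sh
# Sketch: time the same transfer with and without -A (ACLs) and -X (xattrs).
# compare_acl_xattr_cost, RSYNC and the paths in the usage line are
# hypothetical names, not something from the original setup.
compare_acl_xattr_cost() {
    src=$1     # source directory, e.g. /data/sample/
    dest=$2    # destination prefix, e.g. backuphost:/tmp/rsync-test
    for opts in -aHAXx -aHx; do
        start=$(date +%s)
        # Separate destination per option set, so neither run is a no-op.
        ${RSYNC:-rsync} "$opts" --numeric-ids "$src" "$dest$opts/"
        end=$(date +%s)
        echo "rsync $opts: $((end - start)) s"
    done
}

# Usage (hypothetical paths):
#   compare_acl_xattr_cost /data/sample/ backuphost:/tmp/rsync-test
```

The second run may benefit from the source files already sitting in the page cache, so it is worth repeating the test in the opposite order before drawing conclusions.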

Both servers have plenty of memory and cpu usage looks low.

Define low.  If you're using top and press '1' to expand the CPU lines,
you'll probably see one cpu with higher "us" percentage, which is SSH
encrypting the data.  What percentage is that?  Is there a large value
in "sy" or "hi" on any CPU?  Probably not since you see good rates using
'dd' and smb copies, but I've seen systems where interrupt processing
was a major bottleneck, so I make it a standard check.
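
If you'd rather capture those numbers non-interactively, something like the following can stand in for top's per-CPU view. It is a minimal sketch that assumes a Linux /proc/stat; the function names are made up, and it only reports the us, sy, hi and si shares over a short interval.

```shell
#!/bin/sh
# Sketch: per-CPU user/system/irq/softirq percentages from /proc/stat.
# cpu_sample and per_cpu_usage are hypothetical helper names.
cpu_sample() { grep '^cpu[0-9]' /proc/stat; }

per_cpu_usage() {
    interval=${1:-1}           # seconds between the two samples
    before=$(cpu_sample)
    sleep "$interval"
    after=$(cpu_sample)
    # Fields: $2 user, $3 nice, $4 system, $5 idle, $6 iowait, $7 irq, $8 softirq
    printf '%s\n---\n%s\n' "$before" "$after" | awk '
        $1 == "---" { second = 1; next }
        !second { for (i = 2; i <= 8; i++) prev[$1, i] = $i; next }
        {
            total = 0
            for (i = 2; i <= 8; i++) { d[i] = $i - prev[$1, i]; total += d[i] }
            if (total == 0) total = 1   # avoid dividing by zero
            printf "%-6s us=%5.1f%% sy=%5.1f%% hi=%5.1f%% si=%5.1f%%\n",
                $1, 100 * d[2] / total, 100 * d[4] / total,
                100 * d[7] / total, 100 * d[8] / total
        }'
}

# Usage: run this on both servers while the rsync is in progress, e.g.
#   per_cpu_usage 5
```

One CPU with a high "us" value while the others sit idle would point at the single ssh process; high "hi" or "si" values would point at interrupt handling instead.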

+1 on all above.

Also, it's likely that your ssh process is what limits the transfer. If you instead mount the share remotely (CIFS/NFS) and run rsync on top of it, you may get line rate, but you may also end up transferring data that wouldn't otherwise have to be transferred (especially if you use rsync's -c option). It will also send millions of syscalls over the network just to read mtimes, so if your sync was only going to transfer 5% of the total payload, it may take longer than doing it via ssh.

If that's the case (ssh limiting), I would simply consider splitting this process into several rsyncs. Spawn one for each subdir, for example (and maybe limit it to 4 or 8 simultaneous processes). That should scale well if your storage doesn't complain. It's an easy shell script to write and that's what rsync ends up doing anyway.
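
A minimal sketch of such a script, assuming GNU find and xargs (for -mindepth/-maxdepth and -P). The function name, the RSYNC override variable and the example paths are invented for illustration; the ssh transport options can be supplied through rsync's RSYNC_RSH environment variable rather than -e.

```shell
#!/bin/sh
# Sketch: one rsync per top-level subdirectory, at most JOBS at a time.
# parallel_rsync, RSYNC and the example paths are hypothetical.
parallel_rsync() {
    src=$1          # local source directory, e.g. /data
    dest=$2         # rsync destination, e.g. backuphost:/data
    jobs=${3:-4}    # number of simultaneous rsync processes

    # NUL-separated names survive spaces and newlines in directory names;
    # xargs -P caps how many rsyncs run at once.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P "$jobs" -I{} \
            ${RSYNC:-rsync} -aHAXx --numeric-ids {} "$dest/"
}

# Usage (hypothetical host and paths):
#   export RSYNC_RSH='ssh -T -o Compression=no -x'
#   parallel_rsync /data backuphost:/data 4
```

Note that this only covers subdirectories: files sitting directly in the top-level directory need one extra rsync pass, and hard links that span two subdirectories will not be preserved across separate rsync runs.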

I have never used it, but I found this while searching for "parallel rsync", and it may be useful: http://moo.nac.uci.edu/~hjm/parsync/

  Marcelo
