With a long-distance link, I would definitely look into switching to BBR for congestion control as your first step. Well, your _first_ step is to run iperf and establish a baseline....

A quick search turned up this link, which seems to explain it reasonably well:
https://www.cyberciti.biz/cloud-computing/increase-your-linux-server-internet-speed-with-tcp-bbr-congestion-control/

We have used it before with great success for long-distance, high-throughput transfers.

-paul

--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
o:(585) 475-3245 | pfmeec@xxxxxxx

________________________________________
From: Nicolas Moal <nicolas.moal@xxxxxxxxxxx>
Sent: Thursday, October 8, 2020 10:36 AM
To: ceph-users
Subject: Multisite replication speed

Hello everybody,

We have two Ceph object clusters replicating over a very long-distance WAN link. Our version of Ceph is 14.2.10. Currently, replication speed seems to be capped at around 70 MiB/s even though there is a 10Gb WAN link between the two clusters. The clusters themselves don't seem to suffer from any performance issue.

The replication traffic leverages HAProxy VIPs, which means there's a single endpoint (the HAProxy VIP) in the multisite replication configuration.

So, my questions are:

- Is it possible to improve replication speed by adding more endpoints in the multisite replication configuration?
  The issue we are facing is that the secondary cluster is way behind the master cluster because of the relatively slow speed.
- Is there anything else I can do to optimize replication speed?

Thanks for your comments!

Nicolas
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
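
For anyone reading this thread later, here is a rough back-of-the-envelope sketch of why a long-distance link can sit far below line rate, which is what an iperf baseline would help confirm. It computes the bandwidth-delay product (how much data must be in flight to fill the link) and the window that a given throughput implies. The 150 ms round-trip time is purely an assumption for illustration, since the thread never states the real RTT; the function names are made up for this example, and "10Gb" is read as 10 Gb/s. The arithmetic treats the traffic as a single window-limited TCP flow, which may not match how the replication traffic is actually spread across connections, and it says nothing about BBR itself.

```python
MIB = 1024 * 1024


def bdp_bytes(link_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return link_bits_per_s / 8 * rtt_s


def window_limited_throughput(window_bytes: float, rtt_s: float) -> float:
    """Upper bound in bytes/s for one flow that sends one window per round trip."""
    return window_bytes / rtt_s


def implied_window_bytes(throughput_bytes_per_s: float, rtt_s: float) -> float:
    """In-flight data that a given throughput corresponds to at this RTT."""
    return throughput_bytes_per_s * rtt_s


if __name__ == "__main__":
    link = 10e9          # 10 Gb/s WAN link, as described in the thread
    observed = 70 * MIB  # observed replication speed from the thread
    rtt = 0.150          # ASSUMPTION: 150 ms RTT, not stated in the thread

    print(f"BDP at {rtt * 1000:.0f} ms: {bdp_bytes(link, rtt) / MIB:.1f} MiB in flight")
    print(f"Window implied by 70 MiB/s: {implied_window_bytes(observed, rtt) / MIB:.1f} MiB")
```

Plugging in the RTT measured with ping between the two sites gives a quick sense of how far the effective in-flight data is from what the link needs, which is useful context when comparing iperf runs before and after a congestion-control change.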