FWIW, when using rbd-mirror to migrate volumes between SATA SSD clusters, I found that

rbd_mirror_journal_max_fetch_bytes:
  section: "client"
  value: "33554432"
rbd_journal_max_payload_bytes:
  section: "client"
  value: "8388608"

made a world of difference in expediting journal replay on Luminous 12.2.2. With the defaults, some active volumes would take hours to converge, and a couple were falling even further behind. This was mirroring 1 to 2 volumes at a time. YMMV.
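For anyone who wants that in plain ceph.conf form: the section/value syntax above
looks like a config-management override (ceph-ansible style), so the snippet below
is only my reading of it, not something lifted from that setup:

    [client]
    # rbd-mirror side: fetch up to 32 MiB of journal data per request
    rbd_mirror_journal_max_fetch_bytes = 33554432
    # writer side: allow up to 8 MiB per journal entry payload
    rbd_journal_max_payload_bytes = 8388608

As far as I understand it, the fetch-bytes option is read by the rbd-mirror daemon
on the destination cluster, while the payload-bytes option takes effect on the
librbd clients writing to the primary images, so both sides need the setting (and
a restart of the daemon/clients) to get the full benefit.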
> On Mar 10, 2020, at 7:36 AM, Ml Ml <mliebherr99@xxxxxxxxxxxxxx> wrote:
>
> Hello Jason,
>
> thanks for that fast reply.
>
> This is now my /etc/ceph/ceph.conf:
>
> [client]
> rbd_mirror_journal_max_fetch_bytes = 4194304
>
> I stopped and started my rbd-mirror manually with:
> rbd-mirror -d -c /etc/ceph/ceph.conf
>
> Still the same result: slow speed shown by iftop, and entries_behind_master
> keeps increasing a lot if I produce 20 MB/sec of traffic on that
> replicated image.
>
> The latency is like:
> --- 10.10.50.1 ping statistics ---
> 100 packets transmitted, 100 received, 0% packet loss, time 20199ms
> rtt min/avg/max/mdev = 0.067/0.286/1.418/0.215 ms
>
> iperf from the source node to the destination node (where the
> rbd-mirror runs): 8.92 Gbits/sec
>
> Any other idea?
>
> Thanks,
> Michael
>
>
> On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
>>
>> On Tue, Mar 10, 2020 at 6:47 AM Ml Ml <mliebherr99@xxxxxxxxxxxxxx> wrote:
>>>
>>> Hello List,
>>>
>>> when I initially enable journal/mirror on an image, it gets
>>> bootstrapped to my site-b pretty quickly at 250 MB/sec, which is about
>>> the IO write limit.
>>>
>>> Once it's up to date, the replay is very slow, about 15 KB/sec, and the
>>> entries_behind_master is just running away:
>>>
>>> root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 --verbose
>>> health: OK
>>> images: 3 total
>>>     3 replaying
>>>
>>> ...
>>>
>>> vm-112-disk-0:
>>>   global_id:   60a795c3-9f5d-4be3-b9bd-3df971e531fa
>>>   state:       up+replaying
>>>   description: replaying, master_position=[object_number=623,
>>>     tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
>>>     tag_tid=3, entry_tid=18371], entries_behind_master=327196
>>>   last_update: 2020-03-10 11:36:44
>>>
>>> ...
>>>
>>> Write traffic on the source is about 20-25 MB/sec.
>>>
>>> On the source I run 14.2.6 and on the destination 12.2.13.
>>>
>>> Any idea why the replaying is so slow?
>>
>> What is the latency between the two clusters?
>>
>> I would recommend increasing the "rbd_mirror_journal_max_fetch_bytes"
>> config setting (defaults to 32KiB) on your destination cluster, i.e.
>> try adding "rbd_mirror_journal_max_fetch_bytes = 4194304" to the
>> "[client]" section of your Ceph configuration file on the node where
>> the "rbd-mirror" daemon is running, and restart it. It defaults to a very
>> small read size from the remote cluster in a primitive attempt to
>> reduce the potential memory usage of the rbd-mirror daemon, but it has
>> the side effect of slowing down mirroring for links with higher
>> latencies.
>>
>>> Thanks,
>>> Michael
>>
>> --
>> Jason
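P.S. for anyone finding this thread later: a quick way to confirm that the
fetch-size change Jason describes actually reached the running rbd-mirror daemon
is to ask it over its admin socket and then watch entries_behind_master. The
socket name below is a guess for a typical deployment (it varies with cluster
name, client id and pid), so adjust it to whatever actually exists on the mirror
node:

    # find the daemon's admin socket
    ls /var/run/ceph/*rbd-mirror*.asok

    # ask the running daemon which fetch size it is really using
    ceph --admin-daemon /var/run/ceph/<that-socket>.asok \
        config get rbd_mirror_journal_max_fetch_bytes

    # then watch the lag counter over time
    watch -n 10 'rbd --cluster backup mirror pool status rbd-cluster6 --verbose'

If the daemon still reports the 32 KiB default, it is reading a different
ceph.conf (or was not restarted); if the value is right but
entries_behind_master keeps growing, the bottleneck is somewhere else
(destination write throughput, journal payload size on the primary, network
latency, etc.).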