On Tue, Apr 9, 2019 at 11:08 AM Magnus Grönlund <magnus@xxxxxxxxxxx> wrote:
>
> >On Tue, Apr 9, 2019 at 10:40 AM Magnus Grönlund <magnus@xxxxxxxxxxx> wrote:
> >>
> >> Hi,
> >> We have configured one-way replication of pools between a production cluster and a backup cluster. But unfortunately the rbd-mirror or the backup cluster is unable to keep up with the production cluster, so the replication fails to reach the replaying state.
> >
> >Hmm, it's odd that they don't at least reach the replaying state. Are
> >they still performing the initial sync?
>
> There are three pools we try to mirror (glance, cinder, and nova, no points for guessing what the cluster is used for :) ).
> The glance and cinder pools are smaller and see limited write activity, and the mirroring works. The nova pool, which is the largest and has 90% of the write activity, never leaves the "unknown" state.
>
> # rbd mirror pool status cinder
> health: OK
> images: 892 total
>     890 replaying
>     2 stopped
> #
> # rbd mirror pool status nova
> health: WARNING
> images: 2479 total
>     2479 unknown
> #
> The production cluster has 5k writes/s on average and the backup cluster has 1-2k writes/s on average. The production cluster is bigger and has better specs. I thought that the backup cluster would be able to keep up, but it looks like I was wrong.

The fact that they are in the unknown state just means that the remote
"rbd-mirror" daemon hasn't started any journal replayers against the
images. If it couldn't keep up, it would still report a status of
"up+replaying". What Ceph release are you running on your backup
cluster?

> >> And the journals on the rbd volumes keep growing...
> >>
> >> Is it enough to simply disable the mirroring of the pool (rbd mirror pool disable <pool>) and that will remove the lagging reader from the journals and shrink them, or is there anything else that has to be done?
> >
> >You can either disable the journaling feature on the image(s) since
> >there is no point to leave it on if you aren't using mirroring, or run
> >"rbd mirror pool disable <pool>" to purge the journals.
>
> Thanks for the confirmation.
> I will stop the mirror of the nova pool and try to figure out if there is anything we can do to get the backup cluster to keep up.
>
> >> Best regards
> >> /Magnus
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >--
> >Jason

--
Jason
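
For anyone who wants to spot this condition across several pools, here is a minimal sketch that parses the plain-text output of "rbd mirror pool status <pool>" as pasted earlier in this thread and flags a pool where every image is still "unknown". Per the explanation above, that means no journal replayers have been started, not that replay is lagging. The function names (parse_pool_status, all_unknown) are made up for illustration, and the parser assumes only the three-part layout shown in this thread (health line, images total, per-state counts). Other Ceph releases may print additional lines, which this sketch simply ignores.

```python
import re

# Sample outputs copied from the thread above.
CINDER = """\
health: OK
images: 892 total
    890 replaying
    2 stopped
"""

NOVA = """\
health: WARNING
images: 2479 total
    2479 unknown
"""


def parse_pool_status(text):
    """Parse `rbd mirror pool status` text (layout as seen in this thread)."""
    status = {"health": None, "total": None, "states": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("health:"):
            status["health"] = line.split(":", 1)[1].strip()
        elif line.startswith("images:"):
            m = re.search(r"(\d+)\s+total", line)
            if m:
                status["total"] = int(m.group(1))
        else:
            # Per-state lines look like "890 replaying" or "2479 unknown".
            m = re.fullmatch(r"(\d+)\s+(\S+)", line)
            if m:
                status["states"][m.group(2)] = int(m.group(1))
    return status


def all_unknown(status):
    """True when the pool reports images and every one of them is 'unknown'."""
    states = status["states"]
    return bool(states) and set(states) == {"unknown"}


if __name__ == "__main__":
    for pool, text in (("cinder", CINDER), ("nova", NOVA)):
        s = parse_pool_status(text)
        if all_unknown(s):
            note = "no journal replayers started; check the remote rbd-mirror daemon"
        else:
            note = "at least some images report a replay state"
        print(pool, s["health"], s["states"], note)
```

In practice the text would come from running the command on the backup cluster rather than from hard-coded strings. The samples are inlined only so the sketch runs on its own.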