Re: rbd-mirror stops replaying journal on primary cluster


 



Hi,

Actually, the test case was even simpler than that. A misaligned
discard (discard_granularity_bytes=4096, offset=0, length=4096+512)
made the journal stop replaying entries. This is now well covered by
tests, including e2e tests.
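To illustrate why that discard is problematic, here is a rough sketch of how granularity alignment trims a discard request (this is my own illustration, not librbd's actual code; the helper name and rounding rules are assumptions): with a 4096-byte granularity, a discard at offset=0 with length=4096+512 covers only one full granularity unit, and the trailing 512 bytes fall outside any aligned block.

```python
def align_discard(offset, length, granularity):
    """Trim a discard request to granularity-aligned boundaries.

    Returns (aligned_offset, aligned_length), or None if nothing
    aligned remains after trimming.
    """
    # round the start up and the end down to granularity boundaries
    begin = (offset + granularity - 1) // granularity * granularity
    end = (offset + length) // granularity * granularity
    if end <= begin:
        return None
    return begin, end - begin

# the case from this thread: granularity 4096, offset 0, length 4096+512
print(align_discard(0, 4096 + 512, 4096))  # -> (0, 4096): the last 512 bytes are trimmed
```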

The workaround is quite easy: set `rbd_discard_granularity_bytes = 0`
in the client's ceph.conf, and all discards will be applied to the rbd
image unmodified. The fix should hopefully be backported to stable
releases.
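For reference, a sketch of the workaround as a ceph.conf fragment on the client side (placing it in the `[client]` section is an assumption; adjust to how your clients are configured):

```ini
[client]
# pass discards through to the rbd image without granularity alignment
rbd_discard_granularity_bytes = 0
```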

Thanks to the Ceph team for reviewing this.

If anyone can confirm that this indeed solves the problem, please
let me know.

Regards
Josef

On Thu, Dec 8, 2022 at 11:15 AM Josef Johansson <josef86@xxxxxxxxx> wrote:
>
> Hi,
>
> Running a simple
> `echo 1 > a; sync; rm a; sync; fstrim --all`
> triggers the problem. There's no need to have the mount point mounted
> with discard.
>
> On Thu, Dec 8, 2022 at 12:33 AM Josef Johansson <josef86@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > I've updated https://tracker.ceph.com/issues/57396 with some more
> > info. It seems that disabling discard within a guest solves the
> > problem (or switching from virtio-scsi-single to virtio-blk on older
> > kernels). I'm testing two different VMs on the same hypervisor with
> > identical configs; one works, the other doesn't.
> >
> > Not sure what to make of it; it seems that kernels around 4.18+ are
> > sending a weird discard.
> >
> > On Tue, Aug 30, 2022 at 8:43 AM Josef Johansson <josef86@xxxxxxxxx> wrote:
> > >
> > > Hi,
> > >
> > > There's nothing special in the cluster when it stops replaying. It
> > > seems there's a journal entry that the local replayer doesn't
> > > handle, so it just stops. Since it's the local replayer that stops,
> > > there are no logs in rbd-mirror. The odd part is that rbd-mirror
> > > handles this just fine and is the one syncing correctly.
> > >
> > > What's worse is that this is reported as HEALTHY in the status
> > > information, even though restarting that VM will stall it until
> > > replay is complete. The replay function inside the rbd client seems
> > > to handle the journal fine, but only at VM start. I will try to
> > > open a ticket on tracker.ceph.com as soon as my account is
> > > approved.
> > >
> > > I have tried to see what component is responsible for local replay but
> > > I have not been successful yet.
> > >
> > > Thanks for answering :)
> > >
> > > On Mon, Aug 22, 2022 at 11:05 AM Eugen Block <eblock@xxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > IIRC the rbd mirror journals will grow if the sync stops
> > > > working, which seems to be the case here. Does the primary
> > > > cluster experience any high load when the replay stops? How is
> > > > the connection between the two sites, and is the link saturated?
> > > > Does the rbd-mirror log reveal anything useful (maybe also in
> > > > debug mode)?
> > > >
> > > > Regards,
> > > > Eugen
> > > >
> > > > Zitat von Josef Johansson <josef@xxxxxxxxxxx>:
> > > >
> > > > > Hi,
> > > > >
> > > > > I'm running ceph octopus 15.2.16 and I'm trying out two-way mirroring.
> > > > >
> > > > > Everything seems to be running fine, except that sometimes the
> > > > > replay stops on the primary cluster.
> > > > >
> > > > > This means that VMs will not start properly until all journal
> > > > > entries are replayed, and also that the journal grows over time.
> > > > >
> > > > > I am trying to find out why this occurs, and where to look for more
> > > > > information.
> > > > >
> > > > > I am currently using `rbd --pool <pool> --image <image> journal
> > > > > status` to see whether the clients are in sync.
> > > > >
> > > > > Example output from when things went sideways:
> > > > >
> > > > > minimum_set: 0
> > > > > active_set: 2
> > > > > registered clients:
> > > > > [id=, commit_position=[positions=[[object_number=0, tag_tid=1,
> > > > > entry_tid=4592], [object_number=3, tag_tid=1, entry_tid=4591],
> > > > > [object_number=2, tag_tid=1, entry_tid=4590], [object_number=1,
> > > > > tag_tid=1, entry_tid=4589]]], state=connected]
> > > > > [id=bdde9b90-df26-4e3d-84b3-66605dc45608,
> > > > > commit_position=[positions=[[object_number=5, tag_tid=1,
> > > > > entry_tid=19913], [object_number=4, tag_tid=1, entry_tid=19912],
> > > > > [object_number=7, tag_tid=1, entry_tid=19911], [object_number=6,
> > > > > tag_tid=1, entry_tid=19910]]], state=disconnected]
> > > > >
> > > > > Right now I'm trying to catch it red-handed in the primary OSD
> > > > > logs, but I'm not even sure that's the process replaying the
> > > > > journal...
> > > > >
> > > > > Regards
> > > > > Josef
> > > > > _______________________________________________
> > > > > ceph-users mailing list -- ceph-users@xxxxxxx
> > > > > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> > > >
> > > >
> > > >


