Re: RBD-mirror instabilities

Dear Cephalopodians,

For those on the list also fighting rbd-mirror process instabilities: with 14.2.7 (though it may also have been present before, since it does not happen often),
I very rarely encounter a case in which neither of the two hacks I described below works anymore, because "ceph daemon /var/run/ceph/ceph-client.rbd_mirror...." just hangs forever.

So I am now using the following three cronjobs (a rough sketch of how they are wired up in cron follows after the third snippet):

1) Restart the RBD mirror if any image enters the unknown or stopped state. This is probably not needed, since (2) should catch everything:
----------------------
rbd --id=${rbd_id} mirror pool status | egrep -q 'unknown|stopped' && systemctl -q is-active ceph-rbd-mirror@${rbd_id}.service && systemctl restart ceph-rbd-mirror@${rbd_id}.service
----------------------

2) Restart the RBD mirror if the RBD client is not replaying anything anymore (this happens randomly for us due to client blacklisting when many OSDs restart, and the daemon never recovers from that situation on its own):
----------------------
ceph daemon /var/run/ceph/ceph-client.${rbd_id}.$(systemctl show --property MainPID ceph-rbd-mirror@${rbd_id}.service | sed 's/MainPID=//').*.asok rbd mirror status | grep -q Replaying || (systemctl -q is-active ceph-rbd-mirror@${rbd_id}.service && systemctl restart ceph-rbd-mirror@${rbd_id}.service)
----------------------

3) Restart the RBD mirror if its admin socket becomes unresponsive. This catches the cases where (2) just hangs forever and (1) does not fire because at least one functional RBD mirror is left (we have three):
----------------------
timeout 10 ceph daemon /var/run/ceph/ceph-client.${rbd_id}.$(systemctl show --property MainPID ceph-rbd-mirror@${rbd_id}.service | sed 's/MainPID=//').*.asok help > /dev/null || { [ $? -eq 124 ] && systemctl -q is-active ceph-rbd-mirror@${rbd_id}.service && systemctl restart ceph-rbd-mirror@${rbd_id}.service; }
----------------------
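
For reference, this is roughly how the three checks are wired up in cron. A minimal sketch only, assuming each one-liner sits in a small wrapper script that sets rbd_id=rbd_mirror_backup at the top; all file names and paths below are made up, adapt them to your setup:
----------------------
# /etc/cron.d/rbd-mirror-watchdog (hypothetical file)
# Run the three checks hourly, staggered so that a restart triggered by one
# check has time to settle before the next check fires.
0  * * * * root /usr/local/sbin/rbd-mirror-check-pool-status.sh
20 * * * * root /usr/local/sbin/rbd-mirror-check-replaying.sh
40 * * * * root /usr/local/sbin/rbd-mirror-check-asok.sh
----------------------
Staggering them also avoids all three firing in the same minute and restarting the daemon several times in a row.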

With this, we are running quite stably: with three RBD mirrors, there is never a real outage (and in any case, all issues seem to correlate clearly with restarts or short "hangs" of many OSD processes).

Cheers and hope this helps somebody with similar issues,
	Oliver

On 27.12.19 at 02:43, Oliver Freyermuth wrote:
Dear Cephalopodians,

for those following along through the holiday season, here is my "quick hack" for now, since our rbd mirrors keep going into the "blacklisted" state whenever a bunch of OSDs restart in the cluster.

For those not following along, nice holidays to you and hopefully some calm days off :-).

To re-summarize: once our rbd-mirrors are in that "blacklisted" state, they do not recover by themselves, so I think what is missing is an automatic restart / reconnect after blacklisting
(and, of course, an idea of why the daemons' clients get blacklisted when OSDs restart). Let me know if I should open a tracker issue on that,
or if I can provide more information (it happens every few nights for us).

Since I was looking to restart them only in case of failure, I came up with some lengthy commands.

I have now set up two cronjobs on the rbd-mirror daemon nodes. The first works "whatever happens" and restarts an rbd mirror if any image sync is broken:

  rbd --id=rbd_mirror_backup mirror pool status | egrep -q 'unknown|stopped' && systemctl -q is-active ceph-rbd-mirror@rbd_mirror_backup.service && systemctl restart ceph-rbd-mirror@rbd_mirror_backup.service

I run this hourly. With multiple rbd mirrors, it does not catch everything, though: if we enter the failure state (blacklisted rbd-mirror clients), it only ensures that at least one client recovers
and takes over the full load. To get the other clients to restart only if they are also blacklisted, I additionally run:

  ceph daemon /var/run/ceph/ceph-client.rbd_mirror_backup.$(systemctl show --property MainPID ceph-rbd-mirror@rbd_mirror_backup.service | sed 's/MainPID=//').*.asok rbd mirror status | grep -q Replaying || (systemctl -q is-active ceph-rbd-mirror@rbd_mirror_backup.service && systemctl restart ceph-rbd-mirror@rbd_mirror_backup.service)

This also runs hourly and queries the daemon's own state via its admin socket. If no image is in the "Replaying" state, something is wrong and the daemon is restarted.
Technically, the latter cronjob should be sufficient; the first one is only there in case the daemons go completely awry (but I have not observed that so far).
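
As a side note: to verify that a daemon which stopped replaying really is in the blacklisted state, one can also check the OSD blacklist of the affected cluster directly. A rough sketch (this needs a keyring with sufficient mon caps, and the grep pattern is just a placeholder for the addresses of your rbd-mirror hosts):

  ceph osd blacklist ls | grep -E '192\.0\.2\.'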

I made two interesting observations, though:
- It seems the rbd-mirror log is sometimes not filled with any errors at all, even though the daemon is failing. The cause seems to be that the "rbd-mirror" processes are not sent SIGHUP by the logrotate rule shipped with ceph-base, so they presumably keep writing to the already-rotated file (see the sketch after this list).
   I created a tracker issue here:
    https://tracker.ceph.com/issues/43428
- The output of the "rbd mirror status" admin socket command is not valid JSON; it is missing the trailing brace.
   I created a tracker issue here:
    https://tracker.ceph.com/issues/43429
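
Until the logrotate rule is fixed, a crude interim workaround is to send the SIGHUP ourselves once a day shortly after log rotation, so that rbd-mirror reopens its log file. This is a sketch only; I am assuming the shipped rule merely lacks rbd-mirror in its postrotate signal list and that the usual daily rotation is in use:

  # daily cronjob, scheduled a few minutes after the logrotate cron run:
  pkill -HUP -x rbd-mirror || true

Once rbd-mirror is covered by the shipped postrotate rule, this workaround can be dropped again.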

Cheers,
	Oliver

On 24.12.19 at 04:39, Oliver Freyermuth wrote:
Dear Cephalopodians,

running 13.2.6 on the source cluster and 14.2.5 on the rbd mirror nodes and the target cluster,
I observe regular failures of rbd-mirror processes.

By "failures" I mean that traffic stops, but the daemons are still listed as active rbd-mirror daemons in
"ceph -s", and the processes are still running. This coincides with a flood of the messages shown below in the mirror logs.

This happens "sometimes" when some OSDs go down and come back up in the target cluster (which happens each night, since the disks in that cluster
briefly go offline during "online" SMART self-tests - that is a problem in itself, but it is a cluster built from hardware that would otherwise have been trashed).

The rbd-mirror daemons keep running in any case, but synchronization stops. If not all of them have failed (we have three running, and it usually does not hit all of them),
the surviving one(s) do not seem to take over the images the other daemons had locked.

Right now, I am considering the "quick solution" of regularly restarting the rbd-mirror daemons, but if anyone has good ideas on which debug info I could collect
to get this analyzed and fixed, that would of course be appreciated :-).
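
For reference, the debug level of a running rbd-mirror daemon can be bumped at runtime through its admin socket, which is probably what I will try on the next occurrence. A sketch, assuming the default asok naming scheme (adjust the client id, PID and nonce to your daemon):

  ceph daemon /var/run/ceph/ceph-client.rbd_mirror_backup.<pid>.<nonce>.asok config set debug_rbd_mirror 20
  # the errors in the excerpt further below concern journal replay, so this may also help:
  ceph daemon /var/run/ceph/ceph-client.rbd_mirror_backup.<pid>.<nonce>.asok config set debug_journaler 20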

Cheers,
	Oliver

-----------------------------------------------
2019-12-24 02:08:51.379 7f31c530e700 -1 rbd::mirror::ImageReplayer: 0x559dcb968d00 [2/aabba863-89fd-4ea5-bb8c-0f417225d394] handle_process_entry_safe: failed to commit journal event: (108) Cannot send after transport endpoint shutdown
2019-12-24 02:08:51.379 7f31c530e700 -1 rbd::mirror::ImageReplayer: 0x559dcb968d00 [2/aabba863-89fd-4ea5-bb8c-0f417225d394] handle_replay_complete: replay encountered an error: (108) Cannot send after transport endpoint shutdown
...
2019-12-24 02:08:54.392 7f31c530e700 -1 rbd::mirror::ImageReplayer: 0x559dcb87bb00 [2/23699357-a611-4557-9d73-6ff5279da991] handle_process_entry_safe: failed to commit journal event: (125) Operation canceled
2019-12-24 02:08:54.392 7f31c530e700 -1 rbd::mirror::ImageReplayer: 0x559dcb87bb00 [2/23699357-a611-4557-9d73-6ff5279da991] handle_replay_complete: replay encountered an error: (125) Operation canceled
2019-12-24 02:08:55.707 7f31ea358700 -1 rbd::mirror::image_replayer::GetMirrorImageIdRequest: 0x559dce2e05b0 handle_get_image_id: failed to retrieve image id: (108) Cannot send after transport endpoint shutdown
2019-12-24 02:08:55.707 7f31ea358700 -1 rbd::mirror::image_replayer::GetMirrorImageIdRequest: 0x559dcf47ee70 handle_get_image_id: failed to retrieve image id: (108) Cannot send after transport endpoint shutdown
...
2019-12-24 02:08:55.716 7f31f5b6f700 -1 rbd::mirror::ImageReplayer: 0x559dcb997680 [2/f8218221-6608-4a2b-8831-84ca0c2cb418] operator(): start failed: (108) Cannot send after transport endpoint shutdown
2019-12-24 02:09:25.707 7f31f5b6f700 -1 rbd::mirror::InstanceReplayer: 0x559dcabd5b80 start_image_replayer: global_image_id=0577bd16-acc4-4e9a-81f0-c698a24f8771: blacklisted detected during image replay
2019-12-24 02:09:25.707 7f31f5b6f700 -1 rbd::mirror::InstanceReplayer: 0x559dcabd5b80 start_image_replayer: global_image_id=05bd4cca-a561-4a5c-ad83-9905ad5ce34e: blacklisted detected during image replay
2019-12-24 02:09:25.707 7f31f5b6f700 -1 rbd::mirror::InstanceReplayer: 0x559dcabd5b80 start_image_replayer: global_image_id=0e614ece-65b1-4b4a-99bd-44dd6235eb70: blacklisted detected during image replay
-----------------------------------------------



_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx

