Luminous and Mimic: adding an OSD can crash mon(s) and lead to loss of quorum

Hi everyone,

There are a couple of bug reports about this in Redmine, but only one
(unanswered) mailing list message[1] that I could find. So I figured I'd
raise the issue here again and copy the original reporters of the bugs
(they are BCC'd; in case they are no longer subscribed, it wouldn't be
appropriate to share their email addresses with the list).

This is about https://tracker.ceph.com/issues/40029, and
https://tracker.ceph.com/issues/39978 (the latter of which was recently
closed as a duplicate of the former).

In short, it appears that at least in luminous and mimic (I haven't
tried nautilus yet), it's possible to crash a mon when adding a new OSD:
as the OSD tries to inject itself into the crush map under its host
bucket, and that host bucket does not exist yet, the mon goes down.

What's worse: once the OSD's "ceph osd new" process has thus crashed
the leader mon, a new leader is elected, and if the "ceph osd new"
process is still running on the OSD node, it promptly connects to the
new leader and kills it too. This continues until enough mons have died
for quorum to be lost.

The recovery steps appear to involve

- killing the "ceph osd new" process,
- restarting mons until you regain quorum,
- and then running "ceph osd purge" to drop the problematic OSD entry
from the crushmap and osdmap.
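
In shell terms, the recovery would look roughly like this (the OSD id
and the systemd unit name are examples; adjust them to your
environment):

    # on the OSD node: stop whatever keeps re-running "ceph osd new"
    pkill -f 'ceph osd new'
    # on each affected mon node: restart the mon until quorum is regained
    systemctl restart ceph-mon@$(hostname -s)
    # once quorum is back: drop the half-created OSD from the crushmap and osdmap
    ceph osd purge 123 --yes-i-really-mean-it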

The issue can apparently be worked around by adding the host buckets to
the crushmap manually before adding the new OSDs, but surely this isn't
intended to be a prerequisite, at least not to the point of mons
crashing otherwise?
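
For the record, pre-creating the host bucket amounts to something like
this, run before deploying any OSDs on the new host (bucket name and
root are examples; adjust them to your crush hierarchy):

    # create an empty host bucket and attach it to the default root
    ceph osd crush add-bucket <hostname> host
    ceph osd crush move <hostname> root=default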

Also, I am guessing that this is some weird corner case rooted in an
unusual combination of contributing factors, because otherwise more
people would presumably have been bitten by this problem.

Anyone able to share their thoughts on this one? Have more people run
into this?

Cheers,
Florian



[1]
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-May/034880.html
Interestingly, I could find this message in the pipermail archive but
not in the archive that my MUA keeps for me. So perhaps that message
wasn't delivered to all subscribers, which might be why it has gone
unanswered.
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



