Thanks for answering even before I asked the questions :)
So bottom line, is the HEALTH_ERR state simply part of taking one or more OSDs down? Is a HEALTH_ERR period of 2-4 seconds within normal bounds? For context, the CPUs are 2609v3 per 4 OSDs. (I know, they're far from the fastest CPUs.)

On Thu, Aug 3, 2017 at 1:55 PM, Hans van den Bogert <hansbogert@xxxxxxxxx> wrote:
What are the implications of this? I ask because I can see a lot of blocked requests piling up when using 'noout' and 'nodown'. That probably makes sense though.

Another thing: now, when the OSDs come back online, I again see multiple periods of HEALTH_ERR state. Is that to be expected?

On Thu, Aug 3, 2017 at 1:36 PM, linghucongsong <linghucongsong@xxxxxxx> wrote:
Set the OSD flags noout and nodown.
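For reference, a minimal sketch of how such flags are usually set and cleared around a planned reboot. It sets only noout, because the follow-up in this thread reports blocked requests piling up with nodown; that choice is an assumption, not something the thread settles. The function names and the CEPH override variable are made up for illustration, and nothing here shows that this avoids the brief HEALTH_ERR period.

```shell
#!/bin/sh
# Sketch only: wraps the flag commands for a planned OSD node reboot.
# CEPH is a hypothetical override hook (not part of Ceph) so the functions
# can be dry-run with CEPH=echo; by default the real ceph CLI is called.
CEPH="${CEPH:-ceph}"

# Before the reboot: keep the cluster from marking the stopped OSDs "out",
# which would otherwise start rebalancing data onto other OSDs.
maintenance_begin() {
    "$CEPH" osd set noout
}

# After the OSDs have rejoined: clear the flag again and print the health.
maintenance_end() {
    "$CEPH" osd unset noout
    "$CEPH" health
}
```

Typical use would be to call maintenance_begin, reboot the node, wait until its OSDs show as up again (for example in the output of `ceph osd tree`), and then call maintenance_end.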
At 2017-08-03 18:29:47, "Hans van den Bogert" <hansbogert@xxxxxxxxx> wrote:
Hi all,

One thing which has bothered me since the beginning of using Ceph is that a reboot of a single OSD causes a HEALTH_ERR state for the cluster for at least a couple of seconds.

In the case of a planned reboot of an OSD node, should I run some extra commands in order not to go into the HEALTH_ERR state?

Thanks,

Hans
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com