Re: Full cluster outage when ECONNREFUSED is triggered

Hi Denis,

I have to ask a clarifying question. If I understand the intent of osd_fast_fail_on_connection_refused correctly, an OSD that receives a connection_refused should get marked down fast to avoid unnecessarily long wait times. And *only* OSDs that receive connection refused.

In your case, did booting up the server actually create a network route for all other OSDs to the wrong network as well? In other words, did it act as a gateway and all OSDs received connection refused messages and not just the ones on the critical host? If so, your observation would be expected. If not, then there is something wrong with the down reporting that should be looked at.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <frans@xxxxxx>
Sent: Friday, November 24, 2023 1:20 PM
To: Denis Krienbühl; Burkhard Linke
Cc: ceph-users@xxxxxxx
Subject:  Re: Full cluster outage when ECONNREFUSED is triggered

Hi Denis.

>  The mon then propagates that failure, without taking any other reports into consideration:

Exactly. I cannot imagine that this change of behavior is intended. The configs on OSD down reporting ought to be honored in any failure situation. Since you already investigated the relevant code lines, please update/create the tracker with your findings. Hope a dev looks at this.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Denis Krienbühl <denis@xxxxxxx>
Sent: Friday, November 24, 2023 12:04 PM
To: Burkhard Linke
Cc: ceph-users@xxxxxxx
Subject:  Re: Full cluster outage when ECONNREFUSED is triggered


> On 24 Nov 2023, at 11:49, Burkhard Linke <Burkhard.Linke@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> This should not be the case in the reported situation unless setting osd_fast_fail_on_connection_refused<https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/#confval-osd_fast_fail_on_connection_refused>=true changes this behaviour.


In our tests it does change the behavior. Usually the mons take mon_osd_reporter_subtree_level and mon_osd_min_down_reporters into account. In our tests, this is the case if an OSD heartbeat is dropped and the OSD is still able to talk to the mons.

However, if the OSD heartbeat is rejected, in our case because of an unrelated firewall change, the OSD sends an immediate failure to the mon:
https://github.com/ceph/ceph/blob/febfdd83a7838338033486826ef1fc9a5e8d588e/src/osd/OSD.cc#L6434


The mon then propagates that failure, without taking any other reports into consideration:

https://github.com/ceph/ceph/blob/febfdd83a7838338033486826ef1fc9a5e8d588e/src/mon/OSDMonitor.cc#L3367

This is fine when a single OSD goes down and everything else is okay. It then has the intended effect of getting rid of the OSD fast. The assumption presumably being: If a host can answer with a rejection to the OSD heartbeat, it is only the OSD that is affected.

In our case, however, a network change caused rejections from an entirely different host (a gateway), while a network path to the mons was still available. In this case, Ceph does not apply the safeguards it usually does.
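
To make the difference between the two paths concrete, here is a minimal Python sketch of the decision as described above. It is a simplified model, not the actual Ceph code: the names FailureReport and should_mark_down are made up for illustration, the default of 2 reporters is an assumption about mon_osd_min_down_reporters, and the grace-time checks the mon also performs are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureReport:
    """Hypothetical stand-in for a failure report an OSD sends to the mon."""
    reporter_osd: int
    reporter_subtree: str    # bucket at mon_osd_reporter_subtree_level, e.g. the host
    immediate: bool = False  # True when the heartbeat was refused (ECONNREFUSED)


def should_mark_down(reports, min_down_reporters=2):
    """Simplified model of the mon's decision for one target OSD."""
    # Fast path: one immediate report is enough, other reports are ignored.
    if any(r.immediate for r in reports):
        return True
    # Normal path: count reporters from distinct subtrees.
    distinct_subtrees = {r.reporter_subtree for r in reports}
    return len(distinct_subtrees) >= min_down_reporters


# Two reporters on the same host do not reach the threshold ...
same_host = [FailureReport(1, "host-a"), FailureReport(2, "host-a")]
# ... but a single refused heartbeat does, regardless of the threshold.
refused = [FailureReport(1, "host-a", immediate=True)]
print(should_mark_down(same_host), should_mark_down(refused))
```

In this model, anything that answers heartbeats with a rejection, such as a misconfigured gateway, can trigger the fast path for every OSD behind it at once.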
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx