OK, I just wanted to confirm you hadn't extended the osd_heartbeat_grace
or similar. On your large cluster, what is the time from stopping an OSD
(with fast shutdown enabled) to:

cluster [DBG] osd.317 reported immediately failed by osd.202

-- dan

On Thu, Aug 13, 2020 at 4:38 PM Manuel Lausch <manuel.lausch@xxxxxxxx> wrote:
>
> Hi Dan,
>
> The only settings in my ceph.conf related to down/out and peering are
> these:
>
> mon osd down out interval = 1800
> mon osd down out subtree limit = host
> mon osd min down reporters = 3
> mon osd reporter subtree level = host
>
>
> The cluster has 44 hosts with 24 OSDs each.
>
>
> Manuel
>
>
> On Thu, 13 Aug 2020 16:17:46 +0200
> Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
>
> > Hi Manuel,
> >
> > Just to clarify -- do you override any of the settings related to peer
> > down detection? Heartbeat periods or timeouts or min down reporters
> > or anything like that?
> >
> > Cheers, Dan
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
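One way to measure the interval Dan asks about is to diff the timestamp of the shutdown against the timestamp of the "reported immediately failed" cluster-log line. A minimal sketch of that arithmetic, assuming the usual `YYYY-MM-DD HH:MM:SS.mmm` log timestamp prefix (the two sample lines and their timestamps below are illustrative, not taken from the cluster in this thread):

```python
from datetime import datetime

# Illustrative log lines; only the leading timestamps matter here.
stop_line = "2020-08-13 16:40:00.123 systemctl stop ceph-osd@317"
fail_line = "2020-08-13 16:40:01.456 cluster [DBG] osd.317 reported immediately failed by osd.202"

def ts(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' timestamp of a log line."""
    return datetime.strptime(" ".join(line.split()[:2]), "%Y-%m-%d %H:%M:%S.%f")

# Seconds between stopping the OSD and the failure report appearing.
delta = (ts(fail_line) - ts(stop_line)).total_seconds()
print(f"failure reported after {delta:.3f}s")
```

With fast shutdown, a healthy cluster should show this gap in the low single-digit seconds rather than waiting out the full heartbeat grace period.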