Hi Dan,
Thanks for that - it's exactly the setting we needed :)
Have a good weekend,
Jake
On 2/18/22 10:37, Dan van der Ster wrote:
Hi,
Yes, this is the option you're looking for:
https://docs.ceph.com/en/latest/rados/configuration/mon-osd-interaction/#confval-mon_osd_down_out_subtree_limit
The default is rack -- you want to set that to "host".
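For example, assuming you're using the centralized config store (available in Octopus), something like this untested sketch should do it:

    # don't auto-mark OSDs "out" when an entire host (or anything
    # larger in the CRUSH tree) goes down -- wait for an operator
    ceph config set mon mon_osd_down_out_subtree_limit host

    # confirm the new value
    ceph config get mon mon_osd_down_out_subtree_limit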
Cheers, Dan
On Fri., Feb. 18, 2022, 11:23 Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
Dear All,
Does Ceph have any mechanism to automatically pause the cluster and
stop recovery if a node, or more than a set number of OSDs, fails?
The reason for asking is that last night one of the 20 OSD nodes on
our backup cluster crashed.
Ceph (of course) started recovering the "lost data", so by the time we
rebooted the failed node at 9am, ~3% of the data on the cluster was
misplaced.
It's going to take several days for the cluster to rebalance, during
which we are going to have little I/O capacity for running backups,
even if I reduce the recovery priority.
We could look at enabling the watchdog, giving Nagios an action, etc.,
but I'd rather use any tools that Ceph has built in; a sketch of what
such an external action might run is below.
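(For context, the sort of thing a Nagios event handler might run is
just this -- an untested sketch:

    # stop the MONs marking the failed OSDs "out"
    ceph osd set noout
    # pause rebalancing/backfill until a human has had a look
    ceph osd set norebalance

with the matching "ceph osd unset noout" / "ceph osd unset norebalance"
once the node is back -- but a built-in mechanism would be preferable.)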
BTW, this is an Octopus (15.2.15) cluster with 580 OSDs, using EC 8+2.
best regards,
Jake
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
For help, read https://www.mrc-lmb.cam.ac.uk/scicomp/
then contact unixadmin@xxxxxxxxxxxxxxxxx
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
Phone 01223 267019
Mobile 0776 9886539
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx