For Luminous you should check the corresponding _ssd config values for osd_recovery_sleep and osd_max_backfills. However, I don't think you should see a problem with the defaults on Luminous. In fact, I had good experience with making recovery even more aggressive than the defaults. You might want to look through the logs for other problems, for example, peering taking very long or other OSDs being marked down temporarily (the classic "a monitor marked me down, but I'm still running"). It could also be a network or CPU bottleneck.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: huxiaoyu@xxxxxxxxxxxx <huxiaoyu@xxxxxxxxxxxx>
Sent: 25 August 2021 21:46:57
To: ceph-users
Subject: How to slow down PG recovery when a failed OSD node comes back?

Dear Cephers,

I have an all-flash 3-node Ceph cluster, each node with 8 SSDs as OSDs, running Ceph release 12.2.13. I have the following settings:

osd_op_queue = wpq
osd_op_queue_cut_off = high

and

osd_recovery_sleep = 0.5
osd_min_pg_log_entries = 3000
osd_max_pg_log_entries = 10000
osd_max_backfills = 1

The problem I encountered is the following: after a failed OSD node comes back and re-joins, there is a 3-5 minute period during which the recovery workload overwhelms the system, making user IO almost stall. After these 3-5 minutes, the recovery process seems to calm down to a reasonable level and gives priority to the user IO workload.

What happens during those crazy 3-5 minutes, and how can I reduce the negative impact?

Any suggestions and comments are highly appreciated.

Best regards,

Samuel
huxiaoyu@xxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
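
A minimal sketch of how the recovery throttles discussed above could be inspected and adjusted at runtime on a Luminous (12.2.x) cluster; the concrete numbers below (0.1 seconds, 2 backfills) are only placeholder examples, not values recommended in the thread:

# Show the current recovery throttling values on a running OSD,
# via its admin socket on the host where the daemon runs:
ceph daemon osd.0 config get osd_recovery_sleep_ssd
ceph daemon osd.0 config get osd_max_backfills

# Inject new values into all running OSDs without restarting them:
ceph tell osd.* injectargs '--osd_recovery_sleep_ssd 0.1 --osd_max_backfills 2'

Note that injectargs only changes the running daemons; the values revert on restart unless the same settings are also written to the [osd] section of ceph.conf.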