Hi Mehmet,

I think this is expected, if you read the help:

# ceph config help osd_scrub_interval_randomize_ratio
osd_scrub_interval_randomize_ratio - Ratio of scrub interval to randomly vary
  (float, advanced)
  Default: 0.500000
  Can update at runtime: true
  See also: [osd_scrub_min_interval]
  This prevents a scrub 'stampede' by randomly varying the scrub intervals so
  that they are soon uniformly distributed over the week

# ceph config help osd_deep_scrub_randomize_ratio
osd_deep_scrub_randomize_ratio - Scrubs will randomly become deep scrubs at
  this rate (0.15 -> 15% of scrubs are deep)
  (float, advanced)
  Default: 0.150000
  Can update at runtime: true
  This prevents a deep scrub 'stampede' by spreading deep scrubs so they are
  uniformly distributed over the week

So "osd_deep_scrub_interval" only means deep scrubbing each PG **at least**
once per 2419200 s; it is an upper bound on the time between deep scrubs, not
a fixed schedule. With your osd_scrub_min_interval of 86400 s, shallow scrubs
run roughly every day or so, and about 15% of them are randomly promoted to
deep scrubs, so each PG ends up being deep-scrubbed on the order of once a
week on average. A 5-day gap between two deep scrubs of the same PG is
therefore nothing unusual. (If you want deep scrubs to follow
osd_deep_scrub_interval more strictly, see the sketch at the end of this
mail, below your quoted message.)

Weiwen Hu

From: Mehmet <ceph@xxxxxxxxxx>
Sent: 22 October 2021 19:32
To: ceph-users@xxxxxxx
Subject: deep-scrubs not respecting scrub interval (ceph luminous)

Hello,

I have a strange issue and hope you can enlighten me as to why this happens
and how I can prevent it.

In ceph.conf I have:

...
..
.
[osd]
osd_deep_scrub_interval = 2419200.000000
osd_scrub_max_interval = 2419200.000000
osd_scrub_begin_hour = 10   <= this works, great
osd_scrub_end_hour = 17     <= this works, great

But as you can see below, this does not seem to be respected by Ceph:

# zgrep -i "deep-scrub ok" ceph-osd.285* (logfile)
2021-10-21 14:31:12.180689 7facaf28b700 0 log_channel(cluster) log [DBG] : *1.752* deep-scrub ok
2021-10-20 13:17:31.502227 7facb0a8e700 0 log_channel(cluster) log [DBG] : 1.51 deep-scrub ok
2021-10-17 13:45:46.243041 7facafa8c700 0 log_channel(cluster) log [DBG] : 1.4c2 deep-scrub ok
2021-10-17 17:25:55.570801 7facb028d700 0 log_channel(cluster) log [DBG] : 1.81d deep-scrub ok
2021-10-16 11:36:58.695621 7facaf28b700 0 log_channel(cluster) log [DBG] : *1.752* deep-scrub ok
2021-10-16 16:11:50.399225 7facb0a8e700 0 log_channel(cluster) log [DBG] : 1.51 deep-scrub ok

I.e. a deep-scrub of PG *1.752* (same issue on e.g. "1.51") was done on

- 2021-10-21 14:31:12
- 2021-10-16 11:36:58

There are only 5 days in between. If I understand this correctly, Ceph should
wait approx. 4 weeks (2419200 seconds) before another deep scrub of the same
PG happens.

The cluster is in HEALTH_OK state (sometimes in WARN because of slow requests)
and I have checked that the config is in effect on the OSD in this example
(osd.285):

# ceph daemon osd.285 config show | grep "interval" | grep scrub
    "mon_scrub_interval": "86400",
    "osd_deep_scrub_interval": "2419200.000000",
    "osd_scrub_interval_randomize_ratio": "0.500000",
    "osd_scrub_max_interval": "2419200.000000",
    "osd_scrub_min_interval": "86400.000000",

Does anyone know why this happens? I hope you guys can help me understand it.

- Mehmet

Further information:
- all OSDs are HDDs
- OSDs distributed over 17 nodes

# ceph -s
  cluster:
    id:     5d5095e2-e2c7-4790-a14c-86412d98d2dc
    health: HEALTH_WARN
            435 slow requests are blocked > 32 sec.
            Implicated osds 84

  services:
    mon: 3 daemons, quorum cmon01,cmon02,cmon03
    mgr: cmon01(active), standbys: cmon03, cmon02
    osd: 312 osds: 312 up, 312 in

  data:
    pools:   2 pools, 3936 pgs
    objects: 112M objects, 450 TB
    usage:   1352 TB used, 1201 TB / 2553 TB avail
    pgs:     3897 active+clean
             38   active+clean+scrubbing+deep
             1    active+clean+scrubbing

  io:
    client: 120 MB/s rd, 124 MB/s wr, 820 op/s rd, 253 op/s wr
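
If you want deep scrubs to track osd_deep_scrub_interval more closely instead
of being promoted at random, lowering osd_deep_scrub_randomize_ratio is one
option. A minimal, untested sketch; osd.285 is just the OSD from your log,
and 0 is an example value:

# ceph daemon osd.285 config get osd_deep_scrub_randomize_ratio    <= check the value currently in effect via the admin socket
# ceph tell osd.* injectargs '--osd_deep_scrub_randomize_ratio 0'  <= lower the promotion rate at runtime on all OSDs (0 disables random promotion)

and, to make it survive restarts, persist it in ceph.conf:

[osd]
osd_deep_scrub_randomize_ratio = 0

Keep in mind that this reintroduces exactly the 'stampede' the randomization
is meant to prevent: with the ratio at 0, a PG is only deep-scrubbed once
osd_deep_scrub_interval has expired, so PGs that were last deep-scrubbed
around the same time will all become due together.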