Re: deep-scrubs not respecting scrub interval (ceph luminous)

Hi Mehmet,

I think this is expected, if you read the help:

# ceph config help osd_scrub_interval_randomize_ratio
osd_scrub_interval_randomize_ratio - Ratio of scrub interval to randomly vary
  (float, advanced)
  Default: 0.500000
  Can update at runtime: true
  See also: [osd_scrub_min_interval]

This prevents a scrub 'stampede' by randomly varying the scrub intervals, so that they soon become uniformly distributed over the week.
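
Roughly speaking, the effect is something like this (just a sketch in Python, not the actual OSD scheduler code; the min interval value is the one from your config further down):

import random

osd_scrub_min_interval = 86400.0            # 1 day, as in the config below
osd_scrub_interval_randomize_ratio = 0.5    # the default

def next_scrub_delay():
    # Each PG's next shallow scrub is scheduled somewhere between
    # min_interval and min_interval * (1 + ratio) after the previous one.
    jitter = random.random() * osd_scrub_interval_randomize_ratio
    return osd_scrub_min_interval * (1.0 + jitter)

# With a ratio of 0.5 the delays land uniformly in [1.0, 1.5] days, so the
# PGs do not all try to scrub at exactly the same moment (no "stampede").
print([round(next_scrub_delay() / 86400.0, 2) for _ in range(5)])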

# ceph config help osd_deep_scrub_randomize_ratio
osd_deep_scrub_randomize_ratio - Scrubs will randomly become deep scrubs at this rate (0.15 -> 15% of scrubs are deep)
  (float, advanced)
  Default: 0.150000
  Can update at runtime: true

This prevents a deep scrub 'stampede' by spreading deep scrubs out so that they are uniformly distributed over the week.

So “osd_deep_scrub_interval” only means that each PG is deep scrubbed **at least** once per 2419200s.
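
To make that concrete, here is a toy simulation (again only a sketch, assuming roughly one shallow scrub per PG per day, which matches your osd_scrub_min_interval of 86400s):

import random

osd_deep_scrub_randomize_ratio = 0.15   # default: 15% of shallow scrubs become deep
osd_deep_scrub_interval = 2419200.0     # 28 days; only a forced upper bound

random.seed(1)
last_deep = None
gaps = []
for day in range(365):                  # assume roughly one shallow scrub per day
    overdue = last_deep is not None and (day - last_deep) * 86400.0 >= osd_deep_scrub_interval
    if overdue or random.random() < osd_deep_scrub_randomize_ratio:
        if last_deep is not None:
            gaps.append(day - last_deep)
        last_deep = day

print("gaps between deep scrubs (days):", gaps[:10])
print("average gap (days):", round(sum(gaps) / len(gaps), 1))

The average gap comes out around 1/0.15 ≈ 7 days, and gaps of 5 days or less are perfectly normal, which is exactly what your log shows; the 28-day interval is only the ceiling.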

Weiwen Hu

From: Mehmet<mailto:ceph@xxxxxxxxxx>
Sent: 22 October 2021 19:32
To: ceph-users@xxxxxxx<mailto:ceph-users@xxxxxxx>
Subject: deep-scrubs not respecting scrub interval (ceph luminous)

Hello,

I have a strange issue and hope you can enlighten me as to why this
happens and how I can prevent it.

In ceph.conf I have:
... .. .
[osd]
osd_deep_scrub_interval = 2419200.000000
osd_scrub_max_interval = 2419200.000000

osd_scrub_begin_hour = 10 <= this works, great
osd_scrub_end_hour = 17 <= this works, great

But as you can see below, this does not seem to be respected by Ceph:

# zgrep -i "deep-scrub ok" ceph-osd.285* (logfile)
2021-10-21 14:31:12.180689 7facaf28b700  0 log_channel(cluster) log
[DBG] : *1.752* deep-scrub ok
2021-10-20 13:17:31.502227 7facb0a8e700  0 log_channel(cluster) log
[DBG] : 1.51 deep-scrub ok
2021-10-17 13:45:46.243041 7facafa8c700  0 log_channel(cluster) log
[DBG] : 1.4c2 deep-scrub ok
2021-10-17 17:25:55.570801 7facb028d700  0 log_channel(cluster) log
[DBG] : 1.81d deep-scrub ok
2021-10-16 11:36:58.695621 7facaf28b700  0 log_channel(cluster) log
[DBG] : *1.752* deep-scrub ok
2021-10-16 16:11:50.399225 7facb0a8e700  0 log_channel(cluster) log
[DBG] : 1.51 deep-scrub ok

I.e. a deep-scrub on PG "1.752" (same issue on e.g. "1.51") was done on
- 2021-10-21 14:31:12
- 2021-10-16 11:36:58
There are only 5 days in between; if I understand this correctly, Ceph
should wait approx. 4 weeks (2419200 seconds) before another deep-scrub
of the same PG happens.
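
For reference, the gap can be computed directly from the grepped log lines with a small throwaway script like this (just a sketch; the two sample lines are the PG 1.752 entries from above, unwrapped):

import re
from datetime import datetime

log_lines = [
    "2021-10-16 11:36:58.695621 7facaf28b700  0 log_channel(cluster) log [DBG] : 1.752 deep-scrub ok",
    "2021-10-21 14:31:12.180689 7facaf28b700  0 log_channel(cluster) log [DBG] : 1.752 deep-scrub ok",
]

pattern = re.compile(r"^(\S+ \S+) .* : (\S+) deep-scrub ok")
by_pg = {}
for line in log_lines:
    m = pattern.match(line)
    if m:
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        by_pg.setdefault(m.group(2), []).append(stamp)

for pg, stamps in sorted(by_pg.items()):
    stamps.sort()
    for earlier, later in zip(stamps, stamps[1:]):
        print(pg, "deep-scrubbed again after", later - earlier)   # ~5 days here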

The cluster is in a "HEALTH_OK" state (sometimes HEALTH_WARN because of
slow requests), and I have checked that the config is in effect on the
OSD in this example (osd.285):

# ceph daemon osd.285 config show | grep "interval" | grep scrub
     "mon_scrub_interval": "86400",
     "osd_deep_scrub_interval": "2419200.000000",
     "osd_scrub_interval_randomize_ratio": "0.500000",
     "osd_scrub_max_interval": "2419200.000000",
     "osd_scrub_min_interval": "86400.000000",

Does anyone know why this happens?

Hope you guys can help me to understand this.
- Mehmet

Further information: all HDD OSDs, distributed over 17 nodes.
# ceph -s
   cluster:
     id:     5d5095e2-e2c7-4790-a14c-86412d98d2dc
     health: HEALTH_WARN
             435 slow requests are blocked > 32 sec. Implicated osds 84

   services:
     mon: 3 daemons, quorum cmon01,cmon02,cmon03
     mgr: cmon01(active), standbys: cmon03, cmon02
     osd: 312 osds: 312 up, 312 in

   data:
     pools:   2 pools, 3936 pgs
     objects: 112M objects, 450 TB
     usage:   1352 TB used, 1201 TB / 2553 TB avail
     pgs:     3897 active+clean
              38   active+clean+scrubbing+deep
              1    active+clean+scrubbing

   io:
     client:   120 MB/s rd, 124 MB/s wr, 820 op/s rd, 253 op/s wr
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx




