OSD scrub during recovery

Lost an OSD and am having to rebuild it.

It's an 8TB drive, so it has to backfill a ton of data.
It's been taking a while, so I looked at ceph -s and noticed that scrubs and deep scrubs were running, even though I'm running the newest Jewel (10.2.7) and the OSDs have osd_scrub_during_recovery set to false.

$ cat /etc/ceph/ceph.conf | grep scrub | grep recovery
osd_scrub_during_recovery = false

$ sudo ceph daemon osd.0 config show | grep scrub | grep recovery
    "osd_scrub_during_recovery": "false”,

$ ceph --version
ceph version 10.2.7 (50e863e0f4bc8f4b9e31156de690d765af245185)

    cluster edeb727e-c6d3-4347-bfbb-b9ce7f60514b
     health HEALTH_WARN
            133 pgs backfill_wait
            10 pgs backfilling
            143 pgs degraded
            143 pgs stuck degraded
            143 pgs stuck unclean
            143 pgs stuck undersized
            143 pgs undersized
            recovery 22081436/1672287847 objects degraded (1.320%)
            recovery 20054800/1672287847 objects misplaced (1.199%)
            noout flag(s) set
     monmap e1: 3 mons at {core=10.0.1.249:6789/0,db=10.0.1.251:6789/0,dev=10.0.1.250:6789/0}
            election epoch 4234, quorum 0,1,2 core,dev,db
      fsmap e5013: 1/1/1 up {0=core=up:active}, 1 up:standby
     osdmap e27892: 54 osds: 54 up, 54 in; 143 remapped pgs
            flags noout,nodeep-scrub,sortbitwise,require_jewel_osds
      pgmap v13840713: 4292 pgs, 6 pools, 59004 GB data, 564 Mobjects
            159 TB used, 69000 GB / 226 TB avail
            22081436/1672287847 objects degraded (1.320%)
            20054800/1672287847 objects misplaced (1.199%)
                4143 active+clean
                 133 active+undersized+degraded+remapped+wait_backfill
                  10 active+undersized+degraded+remapped+backfilling
                   6 active+clean+scrubbing+deep
recovery io 21855 kB/s, 346 objects/s
  client io 30021 kB/s rd, 1275 kB/s wr, 291 op/s rd, 62 op/s wr

Looking at the Ceph documentation for 'master':

osd scrub during recovery

Description: Allow scrub during recovery. Setting this to false will disable scheduling new scrub (and deep-scrub) while there is active recovery. Already running scrubs will be continued. This might be useful to reduce load on busy clusters.
Type: Boolean
Default: true
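
For what it's worth, if there is any doubt that the running daemons picked the value up from ceph.conf, it can be checked and pushed at runtime as well. This is just a sketch using the standard admin-socket and injectargs mechanisms available in Jewel:

```shell
# Verify the live value on one OSD (same idea as the daemon
# socket query above, but asking for just this one option):
sudo ceph daemon osd.0 config get osd_scrub_during_recovery

# Push the setting to all running OSDs without a restart.
# Takes effect immediately, but is lost on daemon restart
# unless it is also present in ceph.conf:
ceph tell osd.* injectargs '--osd_scrub_during_recovery=false'
```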

Are backfills not treated as recovery operations? Is it only preventing scrubs on the OSDs that are actively recovering/backfilling?

Just curious as to why the feature did not seem to kick in as expected.
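
In the meantime, one way to keep any new scrubs from starting cluster-wide (regardless of how osd_scrub_during_recovery behaves) is the osdmap flags. Note that the status output above already shows nodeep-scrub set, so the deep scrubs listed may simply be ones that were already in flight when the flag went on, per the "Already running scrubs will be continued" wording in the docs:

```shell
# Block scheduling of any new scrubs while backfill catches up:
ceph osd set noscrub
ceph osd set nodeep-scrub

# Re-enable once recovery finishes:
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```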

Thanks,

Reed
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
