With “osd max scrubs” set to 1 in ceph.conf (which I believe is also the default), there are 2-3 deep scrubs running at almost all times. Three simultaneous deep scrubs are enough to cause a constant stream of:

mon.ceph1 [WRN] Health check update: 69 slow requests are blocked > 32 sec (REQUEST_SLOW)

This seems to correspond with all three deep scrubs hitting the same OSD at the same time, starving out all other I/O requests for that OSD. It can also happen less frequently and less severely with two or even one deep scrub running. Either way, consumers of the cluster are not thrilled with regular instances of 30-60 second disk I/Os.

The cluster is five nodes with 15 OSDs, and there is one pool with 512 placement groups. The cluster is running:

ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)

All of the OSDs are bluestore, with HDD storage and SSD block.db.

Even setting “osd deep scrub interval = 1843200” hasn’t resolved this issue, though it seems to get the number down from 3 to 2, which at least cuts down on the frequency of requests stalling out. With 512 pgs and that interval (1,843,200 seconds is roughly 21 days, i.e. 512 hours), one pg should get deep-scrubbed per hour, and a deep scrub seems to take about 20 minutes. So what should be happening is that 1/3rd of the time there is one deep scrub running, and 2/3rds of the time there are none. Yet instead we have 2-3 deep scrubs running at all times.

Looking at “ceph pg dump” shows that about 7 deep scrubs get launched per hour:

$ sudo ceph pg dump | fgrep active | awk '{print $23" "$24" "$1}' | fgrep 2017-09-26 | sort -rn | head -22
dumped all
2017-09-26 16:42:46.781761 0.181
2017-09-26 16:41:40.056816 0.59
2017-09-26 16:39:26.216566 0.9e
2017-09-26 16:26:43.379806 0.19f
2017-09-26 16:24:16.321075 0.60
2017-09-26 16:08:36.095040 0.134
2017-09-26 16:03:33.478330 0.b5
2017-09-26 15:55:14.205885 0.1e2
2017-09-26 15:54:31.413481 0.98
2017-09-26 15:45:58.329782 0.71
2017-09-26 15:34:51.777681 0.1e5
2017-09-26 15:32:49.669298 0.c7
2017-09-26 15:01:48.590645 0.1f
2017-09-26 15:01:00.082014 0.199
2017-09-26 14:45:52.893951 0.d9
2017-09-26 14:43:39.870689 0.140
2017-09-26 14:28:56.217892 0.fc
2017-09-26 14:28:49.665678 0.e3
2017-09-26 14:11:04.718698 0.1d6
2017-09-26 14:09:44.975028 0.72
2017-09-26 14:06:17.945012 0.8a
2017-09-26 13:54:44.199792 0.ec

What’s going on here? Why isn’t the limit on scrubs being honored?

It would also be great if scrub I/O were surfaced in “ceph status” the way recovery I/O is, especially since it can have such a significant impact on client operations.

Thanks!
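
P.S. A rough sketch of how one could double-check the same numbers on another cluster, assuming the admin socket is reachable on the OSD host (osd.0 below is just a placeholder id):

$ sudo ceph daemon osd.0 config get osd_max_scrubs          # value the running daemon is actually using
$ sudo ceph daemon osd.0 config get osd_deep_scrub_interval
$ sudo ceph pg dump 2>/dev/null | grep -c 'scrubbing+deep'  # PGs currently in deep scrub

The first two show whether the ceph.conf settings were actually picked up by the daemon (on luminous they only take effect after a restart or injectargs); the last counts the PGs whose state currently includes scrubbing+deep.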