Re: RADOS + deep scrubbing performance issues in production environment

On 1/27/2014 1:45 PM, Sage Weil wrote:
There is also

  ceph osd set noscrub

and then later

  ceph osd unset noscrub

In my experience scrub isn't nearly as much of a problem as deep-scrub. On an IOPS-constrained cluster, where writes approach the available aggregate spindle performance (minus the replication penalty and possibly a co-located OSD journal penalty), scrub may run without any disruption. Deep-scrub, however, tends to make iowait on the spindles get ugly.

To disable/enable deep-scrub use:

ceph osd set nodeep-scrub
ceph osd unset nodeep-scrub
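
The two commands above can be wrapped around a busy period so the flag is always cleared afterwards. This is only a minimal sketch: the function names and the CEPH variable are hypothetical, introduced here so the script can be dry-run with CEPH=echo. It makes no assumption about what happens to a deep-scrub that is already in progress when the flag is set.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ceph: pause deep-scrub around a command.
# CEPH can be overridden (e.g. CEPH=echo) to print the commands instead of
# running them against a real cluster.
CEPH="${CEPH:-ceph}"

pause_deep_scrub() {
    "$CEPH" osd set nodeep-scrub
}

resume_deep_scrub() {
    "$CEPH" osd unset nodeep-scrub
}

# Usage: with_deep_scrub_paused <command> [args...]
# Sets the flag, runs the command, then clears the flag even if the
# command failed. Returns the command's exit status.
with_deep_scrub_paused() {
    pause_deep_scrub || return 1
    status=0
    "$@" || status=$?
    resume_deep_scrub
    return "$status"
}
```

For example, `with_deep_scrub_paused sleep 3600` would keep deep-scrub from starting for an hour and then re-enable it.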


I forget whether this pauses an in-progress PG scrub or just makes it stop
when it gets to the next PG boundary.

sage

On Mon, 27 Jan 2014, Kyle Bader wrote:

Are there any tools we are not aware of for controlling, possibly pausing,
deep-scrub and/or getting some progress information about the procedure?
Also, since I believe it would be bad practice to disable deep-scrubbing, do you
have any recommendations on how to work around (or even solve) this issue?

The periodicity of scrubs is controllable with these tunables:

osd scrub max interval
osd deep scrub interval

You may also be interested in adjusting:

osd scrub load threshold

More information on the docs page:

http://ceph.com/docs/master/rados/configuration/osd-config-ref/#scrubbing
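
As a rough sketch, the tunables listed above can also be changed on running OSDs with injectargs instead of editing ceph.conf. The function name and the CEPH variable are hypothetical (CEPH=echo gives a dry run), and the values in the usage comments are placeholders, not recommendations. Check the docs page above for the exact option names and units in your release.

```shell
#!/bin/sh
# Hypothetical helper, not part of Ceph: push one scrub tunable to all OSDs.
# CEPH can be overridden (e.g. CEPH=echo) to print the command instead of
# sending it to a real cluster.
CEPH="${CEPH:-ceph}"

# Usage: set_scrub_tunable <option_name> <value>
#   set_scrub_tunable osd_scrub_load_threshold 2.0      # placeholder value
#   set_scrub_tunable osd_deep_scrub_interval 1209600   # placeholder: 14 days in seconds
set_scrub_tunable() {
    # 'osd.*' is quoted so the shell does not glob-expand it.
    "$CEPH" tell 'osd.*' injectargs "--$1 $2"
}
```

Values injected this way are not persistent across OSD restarts; to keep them, the same options would also need to go into the [osd] section of ceph.conf.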

I rarely run into a situation where the 1-minute load average is below 0.5 on a multi-core server running OSDs, so for me deep-scrub is always triggered by 'osd scrub max interval'. I have a bug open asking for core count to be taken into consideration:

http://tracker.ceph.com/issues/6296

The documentation used to say "Default is 50%", implying that this feature should allow scrub to start at a much higher load than 0.5 allows on multi-core systems. The documentation has since changed, but the default of 0.5 still artificially keeps deep-scrub from starting opportunistically on relatively idle multi-core systems.

That being said, deep-scrub may be better served with an osd_scrub_iops_threshold mechanism instead of (or in addition to) the osd_scrub_load_threshold.
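
To illustrate the core-count point, here is a small sketch of a load check normalized per core. This is not how Ceph evaluates osd_scrub_load_threshold; it only shows the arithmetic of the idea, and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical check, not Ceph's implementation: succeed when the load
# average divided by the number of cores is below the threshold.
# Usage: load_per_core_below <load_1m> <cores> <threshold>
load_per_core_below() {
    awk -v load="$1" -v cores="$2" -v thr="$3" \
        'BEGIN { exit !((load / cores) < thr) }'
}

# On Linux the live values could be read like this:
#   load_per_core_below "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)" 0.5
```

With a 1-minute load of 4.0 on a 16-core host, the per-core load is 0.25, so this check would pass at a threshold of 0.5, whereas comparing the raw load of 4.0 against 0.5 would not.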

- Mike


Hope that helps some!

--

Kyle
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

