Re: Deep scrubbing

On Tue, Oct 25, 2016 at 5:20 AM, Andrzej Jakowski
<andrzej.jakowski@xxxxxxxxx> wrote:
> 2016-10-24 19:27 GMT-07:00 kefu chai <tchaikov@xxxxxxxxx>:
>> posting this to ceph-users mailing list.
>>
>> On Tue, Oct 25, 2016 at 2:02 AM, Andrzej Jakowski
>> <andrzej.jakowski@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> I'd like to learn what the Ceph community's take is on the deep
>>> scrubbing process.
>>> It seems that deep scrubbing is expected to read data from the physical
>>> media: NAND dies or magnetic platters.
>>> What if the OSD is built on top of some kind of volume, and the
>>> logic in the volume manager prevents the OSD scrubbing process from
>>> reading the data from the physical media?
>>
>> so it's a read only media, and read-only only for scrubbing. so you
>> have no choice but to disable the scrub, i guess?
>>
>> $ ceph osd set noscrub
>> $ ceph osd set nodeep-scrub
>
> No, let me rephrase this. Imagine the following situation: the logical
> volume manager implements some kind of caching, and the OSD is built on
> top of the cache volume. When deep scrubbing runs, data may be read from
> the cache (on a cache hit) rather than from primary storage. If the data
> is corrupted in primary storage, deep scrubbing may not detect it.
> Is there a way for the OSD to force reads from the primary storage device?

I've had this same concern in relation to bcache/dm-cache/iCAS
accelerated OSDs. It seems that deep scrubbing would be useless if
you're reading the cached data and not the HDD itself. To make
things worse, deep scrubbing will tend to thrash the cache and evict
all the hot data, making the cache itself less valuable.

So in general, for these caches to be useful we need a way to identify
deep scrub IOs and have them bypass the cache. (There are probably other
types of IO that we don't want to cache, such as reads for
backfilling.)

I've worked a bit with iCAS, and it allows you to define a policy
whereby reads above a particular size bypass the cache. And AFAIK
bcache will bypass the cache for sequential IO. Maybe these help,
maybe not...
I've heard that Red Hat is working on dm-cache tooling for OSDs --
maybe they've already recognized this problem and have a good solution.
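
To make the size-based bypass idea concrete, here is a toy Python model.
It is only a sketch and not how iCAS, bcache or dm-cache are implemented:
the class names, the 512 KiB threshold and the assumption that a bypassed
read neither consults nor populates the cache are all made up for
illustration.

```python
from collections import OrderedDict


class BackingDevice:
    """Stands in for the HDD; counts the reads that actually reach it."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.blocks[key]


class BypassingCache:
    """Toy LRU read cache that is skipped for reads >= bypass_bytes.

    Hypothetical policy: a bypassed read neither consults nor populates
    the cache. Real caching layers may behave differently.
    """

    def __init__(self, backing, capacity, bypass_bytes):
        self.backing = backing
        self.capacity = capacity
        self.bypass_bytes = bypass_bytes
        self.lru = OrderedDict()

    def read(self, key, io_size):
        if io_size >= self.bypass_bytes:
            # Large read (e.g. a deep-scrub chunk): go straight to the disk.
            return self.backing.read(key)
        if key in self.lru:
            # Cache hit: refresh recency and serve the cached copy.
            self.lru.move_to_end(key)
            return self.lru[key]
        # Cache miss: fetch from disk, insert, evict the oldest if full.
        data = self.backing.read(key)
        self.lru[key] = data
        if len(self.lru) > self.capacity:
            self.lru.popitem(last=False)
        return data


SMALL = 4096          # client-sized read, goes through the cache
LARGE = 512 * 1024    # scrub-sized read, assumed to bypass the cache

hdd = BackingDevice({i: f"obj{i}" for i in range(100)})
cache = BypassingCache(hdd, capacity=4, bypass_bytes=LARGE)

for k in range(4):            # warm the cache with "hot" client reads
    cache.read(k, SMALL)
for k in range(100):          # a scrub pass over every object
    cache.read(k, LARGE)

hot_still_cached = all(k in cache.lru for k in range(4))
print("disk reads:", hdd.reads, "hot data kept:", hot_still_cached)

hdd.blocks[0] = "corrupt"     # silent corruption on the disk only
print("small read sees:", cache.read(0, SMALL))
print("large read sees:", cache.read(0, LARGE))
```

In this model the scrub pass reaches the disk for every object and leaves
the hot entries in place, while a small read of the corrupted object is
still served from the stale cached copy, which is exactly the case the
original question worries about.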

Cheers, dan