Re: Ideas to reuse filesystem's checksum to enhance dm-raid1/10/5/6?

Qu Wenruo <quwenruo.btrfs@xxxxxxx> · Thu, 16 Nov 2017 15:38:05 +0800

On 2017-11-16 14:54, Nikolay Borisov wrote:
> 
> 
> On 16.11.2017 04:18, Qu Wenruo wrote:
>> Hi all,
>>
>> [Background]
>> Recently I've been considering the possibility of using the filesystem's
>> checksums to enhance device-mapper RAID.
>>
>> The idea behind it is quite simple: most modern filesystems have
>> checksums for their metadata, and some (btrfs) even have checksums for data.
>>
>> And for btrfs RAID1/10 (just ignore RAID5/6 for now), at read time
>> it can use the checksum to determine which copy is correct, so it can
>> return the correct data even if one copy gets corrupted.
>>
>> [Objective]
>> The final objective is to allow device mapper to do the checksum
>> verification (and repair if possible).
>>
>> If it's only for verification, it's not much different from the current endio
>> hook method used by most filesystems.
>> However, if we can move the repair part out of the filesystem (well, only
>> btrfs supports it so far) and into dm, it would benefit all filesystems.
>>
>> [What we have]
>> The nearest infrastructure I found in the kernel is bio_integrity_payload.
>>
>> However, I found it's bound to the device, as it's designed to support
>> the SCSI/SATA integrity protocol.
>> For this use case, it's bound more to the filesystem, as the fs (or a
>> higher-layer dm device) is the source of the integrity data, and the device
>> (dm-raid) only does the verification and possible repair.
>>
>> I'm not sure whether it's a good idea to reuse (or abuse)
>> bio_integrity_payload for this purpose.
>>
>> Should we use some new infrastructure or enhance the existing
>> bio_integrity_payload?
>>
>> (Is this a valid idea, or just another crazy dream?)
>>
> 
> This sounds good in principle, however I think there is one crucial
> point which needs to be considered:
> 
> All filesystems with checksums store those checksums in some specific way,
> and when they fetch data from disk they also know how to acquire the
> respective checksum.

Just like the integrity payload, we generate a READ bio with a checksum
hook function and checksum data attached.

So for a data read, we read the checksum first, attach it to the data READ
bio, and then submit it.

For a metadata read, in most cases the checksum is integrated into the
metadata header, as we do in btrfs.

In that case we attach empty checksum data to the bio, but use a
metadata-specific hook function to handle it.
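
To make the two cases concrete, here is a rough userspace sketch, not kernel
code. Everything in it is an assumption for illustration: struct read_req
stands in for a READ bio, verify_fn is a hypothetical hook type, and
toy_csum() (32-bit FNV-1a) stands in for a real checksum such as crc32c. The
4-byte "header" in verify_meta() is also made up and is not the btrfs
on-disk layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct read_req;

/* Hypothetical hook type: returns 0 if the buffer verifies, -1 otherwise. */
typedef int (*verify_fn)(const struct read_req *req,
                         const uint8_t *buf, size_t len);

/* Hypothetical stand-in for a READ bio carrying fs-supplied integrity info. */
struct read_req {
    verify_fn verify;   /* hook supplied by the filesystem */
    const void *csum;   /* expected checksum, or NULL if embedded in the block */
    size_t csum_len;    /* size of *csum in bytes, 0 if csum is NULL */
};

/* Toy checksum (32-bit FNV-1a); a real fs would use crc32c or similar. */
uint32_t toy_csum(const uint8_t *p, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Data case: the checksum was read first and attached to the request. */
int verify_data(const struct read_req *req, const uint8_t *buf, size_t len)
{
    uint32_t want;

    assert(req != NULL && buf != NULL);
    if (req->csum == NULL || req->csum_len != sizeof(want))
        return -1;
    memcpy(&want, req->csum, sizeof(want));
    return toy_csum(buf, len) == want ? 0 : -1;
}

/*
 * Metadata case: no checksum is attached. The hook knows that the first
 * 4 bytes of the block hold the checksum of the remaining bytes.
 */
int verify_meta(const struct read_req *req, const uint8_t *buf, size_t len)
{
    uint32_t want;

    (void)req;          /* nothing attached for metadata */
    assert(buf != NULL);
    if (len < sizeof(want))
        return -1;
    memcpy(&want, buf, sizeof(want));
    return toy_csum(buf + sizeof(want), len - sizeof(want)) == want ? 0 : -1;
}
```

The lower layer never looks at csum itself; it only calls req->verify(), so
the data and metadata cases look the same from below.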

> What you suggest might be doable, but it will
> require lower layers (dm) to be aware of how to acquire the specific
> checksum for some data.

In the above case, dm only needs to call the verification hook function.
If verification passes, that's good.
If not, try another copy if we have one.

In this case, I don't think the dm layer needs any extra interface to
communicate with the higher layer.
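
A rough userspace sketch of that loop, under stated assumptions: the
in-memory array stands in for the mirror legs, and mirror_read(),
verify_hook_t and sum_verify() are hypothetical names, not existing dm
interfaces. Writing the good data back over copies that failed is my reading
of the "repair if possible" part; a real implementation would need to issue
writes to the failed legs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16  /* toy block size in bytes */

/* Hypothetical hook from the upper layer: returns 0 if the buffer is good. */
typedef int (*verify_hook_t)(const uint8_t *buf, size_t len, const void *ctx);

/*
 * Read one block from a mirror set.
 * Returns the index of the first copy that passes verification, or -1 if
 * every copy fails. On success, out holds the verified data and every copy
 * that failed before the good one has been overwritten with it.
 */
int mirror_read(uint8_t copies[][BLK], int ncopies,
                verify_hook_t verify, const void *ctx, uint8_t out[BLK])
{
    assert(ncopies > 0 && verify != NULL);

    for (int i = 0; i < ncopies; i++) {
        if (verify(copies[i], BLK, ctx) != 0)
            continue;                   /* bad copy, try the next one */

        memcpy(out, copies[i], BLK);
        for (int j = 0; j < i; j++)     /* repair the copies that failed */
            memcpy(copies[j], out, BLK);
        return i;
    }
    return -1;                          /* no good copy left */
}

/* Example hook: byte sum compared with the expected value behind ctx. */
int sum_verify(const uint8_t *buf, size_t len, const void *ctx)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum == *(const uint32_t *)ctx ? 0 : -1;
}
```

The only thing passed down is the hook and its opaque ctx, so the lower layer
needs no knowledge of how the checksum is stored or computed.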

Thanks,
Qu

> I don't think at this point there is such infrastructure,
> and frankly I cannot even envision how it would work elegantly. Sure, you
> can create a dm-checksum target (which I believe dm-verity is very
> similar to) that stores checksums alongside data, but at that point the
> fs is really out of the picture.
> 
> 
>> Thanks,
>> Qu
>>
